CZECH TECHNICAL UNIVERSITY IN PRAGUE
STUDY PLANS
2011/2012

Speech processing

Code Completion Credits Range Language of instruction
AE2M31ZRE Z,ZK 6 2+2c
Lecturer:
Petr Pollák (guarantor)
Tutor:
Petr Pollák (guarantor)
Course provided by:
Department of Circuit Theory
Annotation:

The course covers the basics of speech processing and is intended for master's students, with a special focus on multimedia applications. The speech technology discussed is currently applied in many systems across different fields (e.g. information dialogue systems, voice-controlled devices, dictation systems, transcription of audio-video recordings, and support for language teaching). Further information can be found at http://noel.feld.cvut.cz/vyu/a2m31zre and at http://moodle.kme.feld.cvut.cz

Requirements:

Basic knowledge of digital signal processing is assumed.

Lecture outline:

1. Introduction - speech signal (digital form), speech production model

2. Basic characteristics of speech signal, phonetic and articulatory aspects

3. Spectral characteristics of speech signal (DFT and LPC spectrum)

4. Noise suppression in speech signal (additive and convolutional noise, single-channel, multi-channel)

5. Hearing aids and cochlear implants (anatomy and hearing model, speech processing)

6. Principles of speech recognition, basic tasks and applications

7. Feature extraction for speech recognition

8. Small vocabulary speech recognition based on DTW and HMM (HTK)

9. Dictation and transcription systems (large vocabulary speech recognition)

10. Speaker verification and identification

11. Speech synthesis - basic principles (concatenative and formant synthesis, PSOLA)

12. Audio-visual speech recognition

13. Multimedia systems with voice input (dialogue systems, speech therapy, language teaching)

14. Language recognition. Reserve.

Exercise outline:

1. Introduction: speech signal, tools for analysis, sources of speech signals

2. Basic time-domain characteristics: energy, intensity, zero-crossing, fundamental frequency

3. Spectral characteristics: short-time DFT and LPC spectrum, spectrogram

4. Suppression of additive noise in speech signal

5. Suppression of convolutional noise

6. Speech processing for hearing aids and cochlear implants

7. Cepstrum and cepstral distance: voice activity detection, features for recognition

8. DTW-based recognition: simple recognizer of isolated words

9. HMM based recognition: basic tasks and demonstration of HMM modelling

10. Speaker verification based on GMM

11. Speech synthesis: implementation of formant synthesis, demonstration of available tools

12. Semester work presentations

13. Semester work presentations

14. Reserve. Credits

Study objectives:

The goal of the course is to introduce the speech technology used in the most important multimedia applications. Students should master the basic characteristics of speech signals, speech enhancement, speech recognition, speech synthesis, audio-visual speech processing, etc. Students will practice basic speech processing tasks in the MATLAB environment, and other publicly available speech analysis tools will also be used. As homework, students will prepare a semester project, which they will present during the exercises according to the planned schedule.
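
As an illustration of the kind of basic task listed in the exercise outline (item 2: energy and zero-crossing), the following is a minimal sketch only. It is written in Python rather than the MATLAB environment used in the course, it is not taken from the course materials, and the function names, frame sizes, and the synthetic test tone are all assumptions chosen for the example.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

# Synthetic input (an assumption, not course data): 1 s of a 100 Hz sine at 8 kHz.
fs = 8000
signal = [math.sin(2 * math.pi * 100 * n / fs) for n in range(fs)]

# 25 ms frames with a 10 ms hop, a common (assumed) choice for speech analysis.
frames = frame_signal(signal, frame_len=200, hop=80)
energies = [short_time_energy(f) for f in frames]
zcrs = [zero_crossing_rate(f) for f in frames]
```

For a real recording, voiced segments would typically show higher energy and a lower zero-crossing rate than unvoiced or silent segments, which is why these two measures are often used together.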

Study materials:

[1] Huang, X. - Acero, A. - Hon, H.-W.: Spoken Language Processing. Prentice Hall 2001.

Note:
Timetable for winter semester 2011/2012:
The timetable is not ready
Timetable for summer semester 2011/2012:
Mon
Tue
Wed
Thu
room T2:A4-405, Pollák P., 07:30–09:00 (lecture parallel 1), Dejvice, Laboratory
room T2:A4-405, Pollák P., 09:15–10:45 (lecture parallel 1, parallel 101), Dejvice, Laboratory

The course is a part of the following study plans:
Data valid as of 9 July 2012
Updates of the above information can be found at http://bilakniha.cvut.cz/cs/predmet12809704.html