Abstract: A major problem of state-of-the-art automatic speech recognition (ASR) systems is their lack of robustness, which manifests as a dramatic performance degradation in the presence of additive noise or linear filtering. The first step in all speech recognition algorithms is to represent consecutive segments of speech by low-dimensional feature vectors, with the aim of removing variability and redundancy believed to be irrelevant to recognition. However, it is unclear whether this dimension reduction, which strips away the components of speech that seem unnecessary for recognition, also discards the very information that makes speech signals such a robust code for the messages they convey. In this talk we will address the importance of redundancy in speech signals for robust ASR. To that end we build generative and discriminative models of speech in the high-dimensional spaces of acoustic waveforms, and demonstrate increased robustness in phoneme recognition as a result of posing the problem in these original high-dimensional spaces.

Biography: Zoran Cvetkovic is Professor of Signal Processing at King's College London. He received his Dipl. Ing. and Mag. degrees from the University of Belgrade, Yugoslavia, the M.Phil. from Columbia University, and the Ph.D. in electrical engineering from the University of California, Berkeley. He held research positions at EPFL, Lausanne, Switzerland (1996), and at Harvard University (2002-04). Between 1997 and 2002 he was a member of the technical staff at AT&T Shannon Laboratory. His research interests lie in the broad area of signal processing, ranging from theoretical aspects of signal analysis to applications in audio and speech technologies and biomedical engineering. From 2005 to 2008 he served as an Associate Editor of IEEE Transactions on Signal Processing.

redundancy_in_speech_and_robustness_of_automatic_speech_recognition.txt · Last modified: 2016/09/01 19:15 (external edit)