Trausti T. Kristjansson and Brendan J. Frey
We introduce a new paradigm for robust automatic speech recognition that directly incorporates information about the uncertainty introduced by environmental noise. In contrast to the feature cleaning and model adaptation paradigms, where the noise compensation mechanism is separate from the recognizer, the new paradigm unifies the noise compensation mechanism and the recognizer. The Algonquin framework serves to demonstrate the importance of retaining soft information, i.e., information about the degree of uncertainty in the observations. It employs Gaussian mixture models to model both noise and speech, and the uncertainty introduced by the noise process is captured by the variance of the noise model. The framework also allows us to isolate the effect of retaining or discarding soft information. Our initial results indicate that substantial improvements in recognition rates can be achieved through the use of soft information.
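To make the notion of soft information concrete, the following is a minimal one-dimensional sketch. It is not the Algonquin algorithm itself. It assumes a common simplification in which the uncertainty of a cleaned feature is summarized by a single variance that is added to each speech state's variance before scoring. The function names, the two hypothetical states, and all numeric values are illustrative assumptions, not taken from the paper.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def state_posteriors(x_hat, obs_var, states):
    """Posterior over speech states given a cleaned feature estimate.

    x_hat   -- point estimate of the clean feature (assumed given)
    obs_var -- variance expressing how uncertain that estimate is;
               0.0 discards the soft information
    states  -- list of (prior, mean, var) tuples, one per state
    """
    # Soft information enters by widening each state's variance by obs_var.
    scores = [p * gaussian_pdf(x_hat, m, v + obs_var) for p, m, v in states]
    total = sum(scores)
    return [s / total for s in scores]


# Two hypothetical speech states with equal priors and unit variance.
states = [(0.5, 0.0, 1.0), (0.5, 3.0, 1.0)]
x_hat = 0.5

hard = state_posteriors(x_hat, 0.0, states)  # uncertainty discarded
soft = state_posteriors(x_hat, 4.0, states)  # uncertainty retained

print("hard:", hard)
print("soft:", soft)
```

With the uncertainty discarded, the recognizer commits strongly to the first state. With it retained, the posterior is flatter, so a noisy observation carries less weight in the decision.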
Accounting for Uncertainty in Observations: A New Paradigm for Robust Automatic Speech Recognition