AAAI Publications, Twenty-Fourth International FLAIRS Conference

Improving Spoken Dialogue Understanding Using Phonetic Mixture Models
William Yang Wang, Ron Artstein, Anton Leuski, David Traum

Last modified: 2011-03-20


Augmenting word tokens with a phonetic representation, derived from a dictionary, improves the performance of a Natural Language Understanding component that interprets speech recognizer output: we observed a 5% to 7% reduction in errors across a wide range of response return rates. The best performance comes from mixture models that incorporate both word and phone features. Because the phonetic representation comes from a dictionary lookup, the method can be applied easily and requires no integration with a specific speech recognizer. The method resembles autonomous (or bottom-up) psychological models of lexical access, in which contextual information is integrated not at the stage of auditory perception but at a later stage.
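
The idea of combining word and phone features can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy dictionary `TOY_DICT` (CMUdict-style phones without stress marks), the helper names, the use of Jaccard overlap as a similarity measure, and the linear interpolation weight `word_weight`. The abstract does not specify the form of the mixture models, so this is only one simple way to mix the two feature types.

```python
from typing import Dict, Hashable, List, Sequence, Set, Tuple

# Hypothetical toy pronunciation dictionary (illustrative entries only).
TOY_DICT: Dict[str, List[str]] = {
    "what": ["W", "AH", "T"],
    "is": ["IH", "Z"],
    "their": ["DH", "EH", "R"],
    "there": ["DH", "EH", "R"],
    "name": ["N", "EY", "M"],
}


def to_phones(words: Sequence[str]) -> List[str]:
    """Map words to a flat phone sequence by dictionary lookup.

    Out-of-dictionary words fall back to a single placeholder token.
    """
    phones: List[str] = []
    for word in words:
        phones.extend(TOY_DICT.get(word, ["<unk:" + word + ">"]))
    return phones


def ngrams(seq: Sequence[Hashable], n: int) -> Set[Tuple[Hashable, ...]]:
    """Return the set of n-grams (as tuples) found in seq."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}


def jaccard(a: Set, b: Set) -> float:
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def mixture_score(asr_words: Sequence[str],
                  ref_words: Sequence[str],
                  word_weight: float = 0.5) -> float:
    """Interpolate word-unigram similarity with phone-bigram similarity."""
    word_sim = jaccard(ngrams(asr_words, 1), ngrams(ref_words, 1))
    phone_sim = jaccard(ngrams(to_phones(asr_words), 2),
                        ngrams(to_phones(ref_words), 2))
    return word_weight * word_sim + (1.0 - word_weight) * phone_sim


# A recognizer confuses "their" with the homophone "there".
asr_output = ["what", "is", "there", "name"]
reference = ["what", "is", "their", "name"]

word_only = mixture_score(asr_output, reference, word_weight=1.0)
mixed = mixture_score(asr_output, reference, word_weight=0.5)
print(word_only, mixed)
```

In this toy case the word-only score is 0.6, because the misrecognized word breaks the word-level match, while the phone sequences are identical, so the equally weighted mixture rises to 0.8. Since the phones come from a dictionary lookup on the recognized words, nothing here depends on the internals of a particular speech recognizer.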

