Susan E. Brennan and Eric A. Hulteen, Apple Computer
We argue that perfect performance by a speech recognizer is simply not possible, nor should it be the goal. Some limiting factors, such as noise in the environment, are difficult or impossible to control. Moreover, many words and phrases in English are homophones of other words and phrases, so in some situations both human and machine listeners find them ambiguous. People also frequently have trouble sticking to the grammar and vocabulary that a spoken language system expects. Finally, because people face many other demands while they are speaking, such as planning what to say next and monitoring their listeners and the environment, they frequently do not produce the kind of fluent speech that a recognizer has been trained to process.