Proceedings:
Intelligent Integration and Use of Text, Image, Video, and Audio Corpora
Issue:
Papers from the 1997 AAAI Spring Symposium
Track:
Contents
Abstract:
This paper presents recent results using statistics generated by an MMI-supervised vector quantizer as a measure of audio similarity. Such a measure has proved successful for talker identification, and the extension from speech to general audio, such as music, is straightforward. A classifier that distinguishes speech from music and non-vocal sounds is presented, as well as experimental results showing how perfect classification accuracy may be achieved on a small corpus using substantially less than two seconds per test audio file. The techniques presented here may be extended to other applications and domains, such as audio retrieval-by-similarity, musical genre classification, and automatic segmentation of continuous audio.
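The general idea of using quantizer statistics as a similarity measure can be illustrated with a minimal sketch. This is not the paper's method: the MMI-supervised quantizer is replaced here by a small hand-picked codebook with nearest-neighbor assignment, cosine similarity between codeword histograms is an assumed choice of distance, and the synthetic 2-D vectors only stand in for real per-frame audio features. All function names are hypothetical.

```python
import numpy as np

def quantize(frames, codebook):
    """Return the index of the nearest codeword (Euclidean) for each frame."""
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def vq_histogram(frames, codebook):
    """Normalized count of how often each codeword is selected."""
    idx = quantize(frames, codebook)
    counts = np.bincount(idx, minlength=len(codebook)).astype(float)
    return counts / counts.sum()

def cosine_similarity(p, q):
    """Cosine of the angle between two histograms (assumed similarity measure)."""
    return float(p @ q / (np.linalg.norm(p) * np.linalg.norm(q)))

# Assumption: a fixed toy codebook; the paper trains its quantizer with supervision.
codebook = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])

# Assumption: synthetic feature frames standing in for two "speech" files and one "music" file.
rng = np.random.default_rng(0)
speech_a = rng.normal([0.0, 0.0], 0.5, size=(200, 2))
speech_b = rng.normal([0.0, 0.0], 0.5, size=(200, 2))
music = rng.normal([5.0, 5.0], 0.5, size=(200, 2))

hist_a = vq_histogram(speech_a, codebook)
hist_b = vq_histogram(speech_b, codebook)
hist_m = vq_histogram(music, codebook)

sim_same = cosine_similarity(hist_a, hist_b)   # files of the same class
sim_diff = cosine_similarity(hist_a, hist_m)   # files of different classes
print(sim_same, sim_diff)
```

In this toy setup, files drawn from the same class select mostly the same codewords, so their histograms score as more similar than files from different classes. A classifier could then assign a test file the label of its most similar reference histogram.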