Metadata for Integrating Chinese Text and Speech Documents in a Multi-media Retrieval System

Yue-Shi Lee and Hsin-Hsi Chen

Multimedia documents place new requirements on conventional text retrieval systems. This paper presents a multimedia retrieval system that uses a content-based strategy to retrieve both text and speech documents. Its input can be either a sequence of spoken words, given as digitized waveforms, or a sequence of characters, and its output is a ranked list of text and/or speech documents. The system introduces a new form of metadata designed for both text and speech documents. The metadata is generated automatically, with special consideration given to the characteristics of Chinese. The approach is easy to implement, and preliminary tests give encouraging results.

This page is copyrighted by AAAI. All rights reserved.