Implementing Cross-Language Text Retrieval Systems for Large-Scale Text Collections and the World Wide Web

Mark W. Davis and William C. Ogden

QUILT (Query User Interface with Light Translations) is a prototype implementation of a complete cross-language text retrieval (CLTR) system that takes English queries and produces English gloss translations of Spanish documents. The system indexes the Spanish documents in Spanish, but converts the English query into an equivalent set of Spanish terms through a novel combination of lexical methods and parallel-corpus disambiguation. Similar methods are applied to the returned document to produce a simple translation that non-Spanish speakers can examine to gauge the relevance of the document to the original English query. The system integrates traditional, glossary-based machine translation technology with information retrieval approaches and demonstrates that relatively simple term substitution and disambiguation approaches can be viable for cross-language text retrieval. Components of QUILT have been used to build a CLTR interface to WWW-based search services.
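
The query-conversion idea described above can be illustrated with a minimal sketch. It is not QUILT's actual implementation: the glossary, the parallel corpus, the function name `translate_query`, and the co-occurrence scoring rule are all hypothetical stand-ins, chosen only to show how glossary lookup might be combined with evidence from aligned sentences to choose among candidate translations.

```python
from collections import Counter

# Hypothetical toy glossary: each English term maps to candidate Spanish terms.
GLOSSARY = {
    "bank": ["banco", "orilla"],
    "interest": ["interés"],
    "rate": ["tasa", "ritmo"],
}

# Hypothetical toy parallel corpus of aligned (English, Spanish) sentences.
PARALLEL = [
    ("the bank raised the interest rate", "el banco subió la tasa de interés"),
    ("the bank approved the loan", "el banco aprobó el préstamo"),
    ("they walked along the river bank", "caminaron por la orilla del río"),
]


def translate_query(query, glossary=GLOSSARY, parallel=PARALLEL):
    """Map an English query to Spanish terms (illustrative assumption only)."""
    terms = query.lower().split()

    # Score Spanish tokens by how strongly their aligned English sentence
    # overlaps with the query: more shared query terms means more weight.
    scores = Counter()
    for english, spanish in parallel:
        overlap = len(set(terms) & set(english.split()))
        if overlap:
            for token in set(spanish.split()):
                scores[token] += overlap

    translated = []
    for term in terms:
        candidates = glossary.get(term)
        if not candidates:
            continue  # terms missing from the glossary are dropped in this sketch
        # Keep the candidate best supported by the parallel corpus;
        # ties fall back to glossary order.
        translated.append(max(candidates, key=lambda c: scores[c]))
    return translated


print(translate_query("bank interest rate"))  # ['banco', 'interés', 'tasa']
```

In this toy example the financial context pushes "bank" toward "banco" rather than "orilla". The resulting Spanish terms would then be submitted to an index built over the Spanish documents.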

