Abstract:
In this paper, we propose a technique for constructing bilingual collocation dictionaries completely automatically. The technique we propose first identifies a set of collocations in one language and then attempts to translate them using the Hansards as training data. To do this, we propose to use Xtract, a collocation compiler [Smadja 92], to identify collocations and to use mutual information statistics to translate the collocations into the other language. The algorithm we describe is an iterative method that builds the translation of a given collocation by adding words one by one. This technique allows a collocation containing n words to be translated into a collocation of p words. The paper describes the proposed algorithm and shows how it is applied in the translation of the following three collocations: "senior citizen," "Madam Speaker," and "election campaign."
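
The iterative idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the scoring by pointwise mutual information over sentence-level co-occurrence, the stopping rule (add a word only while the score strictly increases), the `min_count` and `max_len` parameters, the function names, and the six-sentence toy corpus are all assumptions made for illustration. The source collocation is taken as given here, whereas the paper obtains it from Xtract.

```python
import math

def pmi(n_xy, n_x, n_y, n):
    """Pointwise mutual information (base 2) from sentence counts."""
    if n_xy == 0:
        return float("-inf")
    return math.log2(n_xy * n / (n_x * n_y))

def contains_all(sentence, words):
    """True if every word in `words` occurs in the sentence (a set)."""
    return all(w in sentence for w in words)

def translate_collocation(pairs, source_words, min_count=2, max_len=5):
    """Greedily grow a target-language word list for a source collocation.

    pairs: aligned (source_tokens, target_tokens) sentences.
    Assumed stopping rule: stop when no word strictly raises the score.
    """
    n = len(pairs)
    src = [set(s) for s, _ in pairs]
    tgt = [set(t) for _, t in pairs]
    has_src = [contains_all(s, source_words) for s in src]
    n_x = sum(has_src)
    if n_x == 0:
        return [], float("-inf")

    # Candidates: target words seen alongside the source collocation.
    vocab = set()
    for flag, t in zip(has_src, tgt):
        if flag:
            vocab |= t

    chosen, best = [], float("-inf")
    while len(chosen) < max_len:
        best_word, best_score = None, best
        for w in sorted(vocab - set(chosen)):
            cand = chosen + [w]
            n_y = sum(contains_all(t, cand) for t in tgt)
            n_xy = sum(f and contains_all(t, cand)
                       for f, t in zip(has_src, tgt))
            if n_xy < min_count:  # too rare to trust (assumed threshold)
                continue
            score = pmi(n_xy, n_x, n_y, n)
            if score > best_score:
                best_word, best_score = w, score
        if best_word is None:
            break
        chosen.append(best_word)
        best = best_score
    return chosen, best

# Hypothetical toy corpus standing in for aligned Hansard sentences.
pairs = [
    (["senior", "citizen", "votes"], ["personne", "âgée", "vote"]),
    (["senior", "citizen", "speaks"], ["personne", "âgée", "parle"]),
    (["senior", "officer", "speaks"], ["officier", "supérieur", "parle"]),
    (["citizen", "votes"], ["citoyen", "vote"]),
    (["person", "speaks"], ["personne", "parle"]),
    (["elderly", "woman"], ["femme", "âgée"]),
]

words, score = translate_collocation(pairs, ("senior", "citizen"))
print(words, round(score, 3))
```

In this toy run the translation is built one word at a time: a first word is selected on its own, and a second is kept only because the pair correlates more strongly with "senior citizen" than the single word does. Nothing in the loop ties the length of the output to the length of the input, which is how a collocation of n words can map to one of p words.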