What Kind of Information is Necessary for NLP and MT?

Makoto Nagao

In the past, researchers in natural language processing (NLP) and machine translation (MT) were mainly interested in linguistic theories, parsing, and generation. They discussed specific types of language expressions, such as garden path sentences, and paid little attention to the problems that arise when a huge volume of real, existing sentences is handled, such as newspaper articles and patent documents. Once we move into processing large text corpora, we are confronted with several problems: we have to write a complete set of grammatical rules, we have to prepare a comprehensive dictionary, and so on. The major problem here lies not in linguistically interesting but seldom-appearing phenomena, but in the average success rate of parsing, generation, etc. over a large text corpus, which seldom includes such sophisticated sentential structures as garden path sentences. Such a corpus poses different kinds of difficult problems, for example, parsing long sentences composed of more than thirty words, and building a good lexicon.
