Tagging as a Means of Refining and Extending Syntactic Classes

Catherine Macleod, Adam Meyers, and Ralph Grishman

Comlex Syntax is a moderately-broad-coverage English lexicon (with about 38,000 root forms) being developed at New York University under contract to the Linguistic Data Consortium; the first version of the lexicon was delivered in May 1994. The lexicon is available to members of the Linguistic Data Consortium for both research and commercial applications. It was developed for use in processing natural language by computer. Comlex Syntax is particularly detailed in its treatment of subcategorization (complement structures). It includes 92 different subcategorization features for verbs, 14 for adjectives, and 9 for nouns. These distinguish not only the different constituent structures which may appear in a complement, but also the different control features associated with a constituent structure. In order to make this dictionary useful to the entire NLP community, an effort has been made to provide detailed yet theory-neutral syntactic information. In part, this involved using categories that are generally recognized, i.e. nouns, verbs, adjectives, prepositions, and adverbs, and their corresponding phrasal expansions np, vp, adjp, pp, and advp. COMLEX cites the specific prepositions and adverbs in prepositional and particle phrases.
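To make the idea of subcategorization features concrete, the following is a minimal sketch of how a lexicon entry that records complement structures, cited prepositions, and a control feature might be modeled. It is not the Comlex Syntax file format: the class names, field names, frame label, and the sample entry are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Frame:
    """One complement structure. All names here are illustrative assumptions."""
    label: str                            # hypothetical frame label, e.g. "np-pp"
    constituents: Tuple[str, ...]         # phrasal categories, e.g. ("np", "pp")
    prepositions: Tuple[str, ...] = ()    # specific prepositions cited for a pp
    control: Optional[str] = None         # hypothetical control feature label


@dataclass
class Entry:
    """A lexicon entry: root form, part of speech, and its frames."""
    root: str
    pos: str
    frames: List[Frame] = field(default_factory=list)

    def allows(self, constituents, preposition=None):
        """Return True if some frame matches the complement sequence and,
        when a preposition is given, cites that preposition."""
        wanted = tuple(constituents)
        for frame in self.frames:
            if frame.constituents != wanted:
                continue
            if preposition is None or preposition in frame.prepositions:
                return True
        return False


# Invented sample entry, not taken from the lexicon itself.
example = Entry(
    root="put",
    pos="verb",
    frames=[Frame("np-pp", ("np", "pp"), prepositions=("on", "in"))],
)

print(example.allows(["np", "pp"], "on"))
print(example.allows(["np", "pp"], "of"))
```

Keeping the cited prepositions on the frame, rather than on the entry as a whole, lets a parser reject a pp complement whose preposition the frame does not list while still accepting the same constituent structure with a listed one.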

