Abstract:
This paper describes a bootstrapping algorithm for acquiring a lexicon of subjective adjectives that minimizes recourse to external resources (such as lexical databases, parsers, or manual annotation). The method employs only a corpus tagged with part-of-speech information and a seed set of subjective adjectives. The list of candidate subjective adjectives is generated incrementally by looking at the head nouns they modify and computing their distribution-based semantic similarity (cosine) with respect to the seed set and its successive extensions. The advantages of a method using limited resources include the following: a) it can be used for languages other than English for which resources such as parsers and annotated corpora are not available, but a part-of-speech tagger is; b) it can be used for English as well when fast and low-cost development is required in specific sub-domains of subjective language.
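
The incremental procedure summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the similarity threshold, the round limit, and the choice to compare each candidate against the summed head-noun profile of the current lexicon are all assumptions made for illustration. The input is assumed to be a list of (adjective, head noun) pairs already extracted from a part-of-speech-tagged corpus.

```python
from collections import Counter, defaultdict
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def bootstrap(pairs, seeds, threshold=0.5, max_rounds=10):
    """Grow a lexicon of adjectives from a seed set.

    pairs: iterable of (adjective, head_noun) tuples (assumed input format).
    threshold, max_rounds: hypothetical parameters, not taken from the paper.
    """
    # Represent each adjective by the counts of the head nouns it modifies.
    vectors = defaultdict(Counter)
    for adj, noun in pairs:
        vectors[adj][noun] += 1

    # Keep only seeds that actually occur in the corpus.
    lexicon = {s for s in seeds if s in vectors}

    for _ in range(max_rounds):
        # Assumption: the current lexicon is summarized by summing its vectors.
        profile = Counter()
        for adj in lexicon:
            profile.update(vectors[adj])

        # Candidates are adjectives whose noun distribution resembles the profile.
        new = {
            adj
            for adj in vectors
            if adj not in lexicon and cosine(vectors[adj], profile) >= threshold
        }
        if not new:
            break  # no further extension of the lexicon is possible
        lexicon |= new

    return lexicon


# Toy data, invented purely for illustration.
toy_pairs = [
    ("awful", "movie"), ("awful", "idea"),
    ("terrible", "movie"), ("terrible", "idea"),
    ("wooden", "table"), ("wooden", "chair"),
]
print(sorted(bootstrap(toy_pairs, {"awful"})))  # ['awful', 'terrible']
```

In this toy run, "terrible" shares its head nouns with the seed "awful" and is added in the first round, while "wooden" modifies unrelated nouns and is never admitted. Comparing against each lexicon member individually rather than against a summed profile would be an equally plausible reading of "similarity with respect to the seed set".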