Lexicon Development and POS Tagging using a Tagged Bengali News Corpus

Asif Ekbal, Sivaji Bandyopadhyay

Lexicon development and Part of Speech (POS) tagging are very important for almost all Natural Language Processing (NLP) application areas. The rapid development of these resources and tools using machine learning techniques for less computerized languages requires an appropriately tagged corpus. A tagged Bengali news corpus has been developed from the web archive of a widely read Bengali newspaper. This corpus is then used for lexicon development and POS tagging.

Subjects: 13. Natural Language Processing

Submitted: Feb 11, 2007

This page is copyrighted by AAAI. All rights reserved. Your use of this site constitutes acceptance of all of AAAI's terms and conditions and privacy policy.