AAAI Publications, Twenty-Second International FLAIRS Conference

Systematic Evaluation of Convergence Criteria in Iterative Training for NLP
Patricia Brent, Nathan David Green, Paul Breimyer, Ramya Krishnamurthy, Nagiza F. Samatova

Last modified: 2009-04-01


Natural Language Processing (NLP) tasks, such as Named Entity Recognition (NER), involve an iterative process of model optimization to identify different types of words or semantic entities. Optimizing for a more precise model becomes computationally expensive as the number of iterations increases. The small datasets typically available for training also limit the models: adding iterations on such sets to further optimize the model can often cause over-fitting, which generally reduces performance. The choice of convergence criterion is therefore a critical step in robust and accurate model building. We evaluate different convergence criteria in terms of their robustness, stopping-threshold selection, and independence from the training data size and entity type. The underlying framework employs limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) parameter optimization in the context of Conditional Random Fields (CRFs). This paper presents a convergence criterion for robust training irrespective of semantic types and data sizes, with a two-orders-of-magnitude reduction in the stopping threshold for improved model accuracy and faster convergence. Additionally, we examine convergence with active learning to further reduce the training data and training time.
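To make the notion of a stopping threshold concrete, the following is a minimal sketch of one common convergence criterion for iterative training: stop when the relative improvement in the objective falls below a tolerance. The function names, the loss values, and the `tol` settings are illustrative assumptions, not the criterion or thresholds evaluated in the paper.

```python
def has_converged(prev_loss, curr_loss, tol=1e-4):
    """Illustrative criterion: relative improvement in the objective < tol.

    prev_loss is None on the first iteration, when no comparison is possible.
    """
    if prev_loss is None:
        return False
    denom = max(abs(prev_loss), 1.0)  # guard against division by ~0
    return abs(prev_loss - curr_loss) / denom < tol


def stopping_iteration(losses, tol=1e-4):
    """Return the index at which training would stop on a given loss trajectory."""
    prev = None
    for i, loss in enumerate(losses):
        if has_converged(prev, loss, tol):
            return i
        prev = loss
    return len(losses) - 1  # budget exhausted without meeting the criterion
```

Tightening `tol` (e.g., by the two orders of magnitude discussed in the abstract) trades extra iterations, and hence training time, for a potentially more precise model, which is exactly the robustness trade-off the paper evaluates.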
