A Multi-Tier NL-Knowledge Clustering for Classifying Students’ Essays

Umarani Pappuswamy, Dumisizwe Bhembe, Pamela W. Jordan, and Kurt VanLehn, University of Pittsburgh

In this paper, we describe a multi-tier Natural Language (NL) clustering approach to classifying students’ essays for tutoring applications. The main task of the classifier is to map the statements in a student’s essay onto target concepts, namely physics principles and misconceptions. A simple “Bag-Of-Words (BOW)” classifier using a naïve-Bayes algorithm was unsatisfactory for our purposes because it frequently misclassified statements, owing to the semantic relatedness of the NL descriptions of the target concepts. We describe how we used the NL descriptions to define clusters of concepts that reduce the dimensionality of the data when classifying students’ essays. The clustering generated multi-tier tagging schemata (cluster, sub-cluster, and class), which led to better classification of students’ essays.
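The tiered idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a multinomial naïve-Bayes model over bag-of-words counts with add-one smoothing, trains one such model per tier, and routes each statement from cluster to sub-cluster to class. The class names `NaiveBayes` and `TieredClassifier`, the toy sentences, and every label in `TRAIN` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Bag-of-words tokens: lowercase, whitespace-separated.
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()               # label -> number of documents
        self.word_counts = defaultdict(Counter)   # label -> word -> count
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        words = [w for w in tokenize(text) if w in self.vocab]  # skip unseen words
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for word in words:
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class TieredClassifier:
    """One naive-Bayes model per tier prefix (hypothetical helper class)."""

    def __init__(self):
        self.models = {}  # prefix tuple -> model that picks the next tier label

    def fit(self, texts, paths):
        groups = defaultdict(lambda: ([], []))
        for text, path in zip(texts, paths):
            for depth in range(len(path)):
                xs, ys = groups[path[:depth]]
                xs.append(text)
                ys.append(path[depth])
        self.models = {p: NaiveBayes().fit(xs, ys) for p, (xs, ys) in groups.items()}
        return self

    def predict(self, text):
        # Each tier only chooses among the labels seen under the current prefix,
        # so the candidate set shrinks at every step.
        path = ()
        while path in self.models:
            path += (self.models[path].predict(text),)
        return path


# Toy data: (statement, (cluster, sub-cluster, class)). All labels are invented.
TRAIN = [
    ("the force of gravity pulls the keys down",
     ("force", "gravity", "gravity_acts_on_object")),
    ("gravity is the only force acting on the pumpkin",
     ("force", "gravity", "only_force_is_gravity")),
    ("the net force on the object is zero",
     ("force", "net_force", "zero_net_force")),
    ("a heavier object falls faster",
     ("motion", "free_fall", "heavier_falls_faster")),
    ("the elevator and the keys have the same acceleration",
     ("motion", "acceleration", "same_acceleration")),
    ("the velocity stays constant in the horizontal direction",
     ("motion", "velocity", "constant_horizontal_velocity")),
]

clf = TieredClassifier().fit([t for t, _ in TRAIN], [p for _, p in TRAIN])
print(clf.predict("gravity pulls the pumpkin down"))
```

In this sketch a flat classifier would have to separate all six classes at once, whereas the tiered version decides between two clusters first and then between a handful of closely related classes. That is one plausible reading of how tiering can reduce confusion among semantically related target concepts.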
