Assessing Entailer with a Corpus of Natural Language From an Intelligent Tutoring System

Philip M. McCarthy, Vasile Rus, Scott A. Crossley, Sarah C. Bigham, Arthur C. Graesser, and Danielle S. McNamara

In this study, we compared Entailer, a computational tool that evaluates the degree to which one text is entailed by another, with three other text relatedness metrics (LSA, lemma overlap, and MED). Our corpus was a subset of 100 self-explanations of sentences from a recent experiment on interactions between students and iSTART, an Intelligent Tutoring System that helps students apply metacognitive strategies to enhance deep comprehension. The sentence pairs were hand-coded by experts in discourse processing across four categories of text relatedness: entailment, implicature, elaboration, and paraphrase. A series of regression analyses revealed that Entailer was the best measure for approximating these hand-coded values. Entailer explained approximately 50% of the variance for entailment, 38% of the variance for elaboration, and 23% of the variance for paraphrase. LSA contributed marginally to the entailment model. Neither lemma overlap nor MED contributed to any of the models, although a modified version of MED did correlate significantly with both the entailment and paraphrase hand-coded evaluations. This study is an important step towards developing a set of indices designed to better assess natural language input by students in Intelligent Tutoring Systems.
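To make the simpler baseline metrics concrete, the following Python sketch scores a sentence and a self-explanation with a word overlap ratio and a word-level edit distance. It is an illustration only, not the implementation used in the study: it assumes MED refers to a minimal edit distance computed over words, it matches surface word forms rather than lemmas (no lemmatizer is applied), and the function names `tokens`, `overlap`, and `edit_distance` are hypothetical.

```python
import re


def tokens(text):
    """Lowercase a text and split it into word tokens (no lemmatization)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def overlap(source, response):
    """Fraction of distinct response words that also occur in the source.

    Assumption: a stand-in for lemma overlap that compares surface forms.
    """
    src, resp = set(tokens(source)), set(tokens(response))
    return len(src & resp) / len(resp) if resp else 0.0


def edit_distance(source, response):
    """Word-level Levenshtein distance between two texts.

    Assumption: a stand-in for MED; counts word insertions, deletions,
    and substitutions, each with a cost of 1.
    """
    a, b = tokens(source), tokens(response)
    prev = list(range(len(b) + 1))  # distances from an empty prefix of a
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # delete word_a
                            curr[j - 1] + 1,      # insert word_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]
```

For example, `overlap("The heart pumps blood", "The heart moves blood")` gives 0.75 and `edit_distance` on the same pair gives 1, since the two texts differ by a single substituted word. Scores of this kind capture surface similarity only, which is consistent with the abstract's finding that they added little beyond Entailer.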

Subjects: 1. Applications; 13. Natural Language Processing

Submitted: Feb 8, 2007

