AAAI Publications, The Thirtieth International Flairs Conference

Complexity Guided Noise Filtering in QA Repositories
K. V. S. Dileep, Swapnil Hingmire, Sutanu Chakraborti

Last modified: 2017-05-03


Filtering out noisy sentences in an answer, that is, sentences irrelevant to the question being asked, increases the utility and reuse of a Question-Answer (QA) repository. Traditional supervised classification methods are a poor fit for this task because of the extensive labelling effort they require. In this paper, we propose a semi-supervised learning approach in which we first infer a set of topics over the corpus using Latent Dirichlet Allocation (LDA). We then label the topics automatically using a small labelled set and use the labelled topics to classify an unseen sentence as useful or noisy. We ran experiments on a real-life help desk dataset and found that the results are comparable to those of other semi-supervised learning methods.
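The pipeline described above (infer topics, label topics from a few labelled sentences, classify unseen sentences) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the collapsed Gibbs sampler, the word-level topic mixture, the vote-based topic labelling rule, the function names, and the toy sentences are all hypothetical choices made for the example.

```python
import random

def tokenize(text):
    return text.lower().split()

def fit_lda(docs, num_topics, iters=100, alpha=0.1, beta=0.01, seed=0):
    """Tiny collapsed Gibbs sampler for LDA; returns topic-word probabilities."""
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    ids = [[word_id[w] for w in d] for d in docs]
    n_dk = [[0] * num_topics for _ in ids]        # document-topic counts
    n_kw = [[0] * V for _ in range(num_topics)]   # topic-word counts
    n_k = [0] * num_topics                        # tokens per topic
    z = []
    for d, doc in enumerate(ids):                 # random initial assignment
        z.append([rng.randrange(num_topics) for _ in doc])
        for w, k in zip(doc, z[d]):
            n_dk[d][k] += 1; n_kw[k][w] += 1; n_k[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(ids):
            for i, w in enumerate(doc):
                k = z[d][i]                       # remove the current assignment
                n_dk[d][k] -= 1; n_kw[k][w] -= 1; n_k[k] -= 1
                weights = [(n_dk[d][t] + alpha) * (n_kw[t][w] + beta)
                           / (n_k[t] + V * beta) for t in range(num_topics)]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k                       # record the resampled topic
                n_dk[d][k] += 1; n_kw[k][w] += 1; n_k[k] += 1
    phi = [[(n_kw[k][w] + beta) / (n_k[k] + V * beta) for w in range(V)]
           for k in range(num_topics)]
    return phi, word_id

def topic_mixture(tokens, phi, word_id):
    """Assumed simplification: average per-word topic posteriors (uniform prior)."""
    num_topics = len(phi)
    known = [word_id[w] for w in tokens if w in word_id]
    if not known:                                 # no known words: uniform mixture
        return [1.0 / num_topics] * num_topics
    mix = [0.0] * num_topics
    for w in known:
        total = sum(phi[k][w] for k in range(num_topics))
        for k in range(num_topics):
            mix[k] += phi[k][w] / (total * len(known))
    return mix

def label_topics(labelled, phi, word_id):
    """Assumed rule: a topic takes the label with the most topic mass."""
    votes = [{"useful": 0.0, "noisy": 0.0} for _ in phi]
    for text, label in labelled:
        mix = topic_mixture(tokenize(text), phi, word_id)
        for k, share in enumerate(mix):
            votes[k][label] += share
    return [max(v, key=v.get) for v in votes]

def classify(text, phi, word_id, topic_labels):
    """Sum the sentence's topic mass per label and return the larger side."""
    mix = topic_mixture(tokenize(text), phi, word_id)
    score = {"useful": 0.0, "noisy": 0.0}
    for k, label in enumerate(topic_labels):
        score[label] += mix[k]
    return max(score, key=score.get)

# Made-up help desk sentences, for illustration only.
corpus = ["restart the router and check the cable",
          "thanks for contacting support have a nice day",
          "update the driver and restart the computer",
          "thanks and regards have a great day"]
labelled = [("restart the router", "useful"), ("have a nice day", "noisy")]

phi, word_id = fit_lda([tokenize(s) for s in corpus], num_topics=2)
topic_labels = label_topics(labelled, phi, word_id)
print(classify("please restart the computer", phi, word_id, topic_labels))
```

On a corpus this small the inferred topics are unstable, so the printed label is only indicative; the sketch is meant to show how a handful of labelled sentences can be propagated to topics and then to unseen sentences.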
