AAAI Publications, The Twenty-Eighth International FLAIRS Conference

Multilabel Subject-Based Classification of Poetry
Andres Lou, Diana Inkpen, Chris Tanasescu

Last modified: 2016-05-04

Abstract


Often, the question "what is this poem about?" has no trivial answer, regardless of the poem's length, style, author, or the context in which it is found. We propose a simple system for multi-label classification of poems by subject, following the categories and subcategories laid out by the Poetry Foundation. For feature extraction we use a model that combines tf-idf and Latent Dirichlet Allocation; for classification we use a Support Vector Machine. We measure how likely our models are to correctly classify each poem they read into one or more main categories and subcategories. Our contribution is thus a new method for automatically classifying poetry given a set, and various subsets, of categories.
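
As a rough illustration of two ingredients named in the abstract, the sketch below computes tf-idf weights for a few tokenized poems and encodes multi-label subjects as 0/1 indicator rows, the form a one-classifier-per-label setup would typically consume. It is not the authors' implementation: the function names `tfidf` and `binarize_labels`, the toy poems, the labels attached to them, and the particular tf-idf variant (relative term frequency times the log of inverse document frequency) are all assumptions made for this example. The Latent Dirichlet Allocation features and the Support Vector Machine are left out.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting: (count / doc length) * log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def binarize_labels(label_sets):
    """Turn per-poem label sets into 0/1 rows, one column per sorted label."""
    labels = sorted(set().union(*label_sets))
    rows = [[1 if label in s else 0 for label in labels] for s in label_sets]
    return labels, rows


# Toy data, invented for illustration only.
poems = [
    "the sea and the sky".split(),
    "love and the sea".split(),
    "war and grief".split(),
]
subjects = [{"Nature"}, {"Love", "Nature"}, {"Social Commentaries"}]

vectors = tfidf(poems)
labels, targets = binarize_labels(subjects)
```

A term that occurs in every poem (here "and") receives weight zero under this variant, while a term confined to one poem is weighted most heavily. Each column of `targets` could then serve as the binary target for one per-subject classifier, which is one common way to let a poem carry several labels at once.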

Keywords


automatic text classification; poetry classification
