AAAI Publications, The Thirtieth International FLAIRS Conference

Document Embedding Strategies for Job Title Classification
Yun Zhu, Faizan Javed, Ozgur Ozturk

Last modified: 2017-05-03


Automatic and accurate classification of items enables numerous downstream applications in many domains, ranging from faceted browsing of items to product recommendations and big data analytics. In the online recruitment domain, we refer to classifying job ads into a predefined occupation taxonomy as job title classification. A large-scale job title classification system can power various downstream applications such as query expansion, semantic search, job recommendations, and labor market analytics. Such classification systems mostly use a Bag-of-Words (BOW) model for document representation and consider only the job titles when classifying job ads. However, the BOW model lacks the semantic discrimination capability needed to accurately classify job ads that contain multiple aspects of the job, such as the job description, job requirements, company overview, and other details. In this paper, we explore the applicability of recent advances in word and document embeddings to the problem of job title classification. We investigate several document embedding approaches and propose a novel customized document embedding strategy for job title classification that addresses the multi-aspect job ad issue. Our experimental results show that incorporating document embedding approaches in a job title classification system improves classification accuracy on entire job ads compared to approaches based on the BOW model.
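The contrast between BOW and embedding-based document representations can be illustrated with a minimal sketch. It is not the strategy proposed in the paper: it uses the common baseline of averaging word vectors, and the names `TOY_VECTORS`, `bow_vector`, and `mean_embedding`, as well as the three-dimensional vectors themselves, are hypothetical values chosen by hand for illustration. Real embeddings would be learned from a large corpus, for example with Word2Vec.

```python
import math
from collections import Counter

# Hypothetical 3-dimensional word vectors, hand-picked for illustration only.
# Real embeddings would be learned from a corpus (e.g. with Word2Vec).
TOY_VECTORS = {
    "software": [0.9, 0.1, 0.0],
    "engineer": [0.8, 0.2, 0.1],
    "programmer": [0.8, 0.1, 0.1],
    "registered": [0.0, 0.9, 0.2],
    "nurse": [0.1, 0.8, 0.3],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def bow_vector(tokens, vocabulary):
    """Bag-of-Words: one count per vocabulary word, ignoring word meaning."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def mean_embedding(tokens):
    """Baseline document embedding: average the vectors of the known tokens."""
    vectors = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not vectors:
        return [0.0, 0.0, 0.0]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


vocabulary = sorted(TOY_VECTORS)
title_a = ["software", "engineer"]
title_b = ["programmer"]
title_c = ["registered", "nurse"]

# The two related titles share no words, so their BOW vectors are orthogonal.
bow_ab = cosine(bow_vector(title_a, vocabulary), bow_vector(title_b, vocabulary))

# With the toy embeddings, related titles land close together and
# unrelated titles land further apart.
emb_ab = cosine(mean_embedding(title_a), mean_embedding(title_b))
emb_ac = cosine(mean_embedding(title_a), mean_embedding(title_c))

print(bow_ab, emb_ab, emb_ac)
```

Under these assumed vectors, the BOW similarity between "software engineer" and "programmer" is zero because the titles share no tokens, whereas the averaged embeddings place them much closer to each other than to "registered nurse". Plain averaging weights every word equally, so on a full job ad the company overview counts as much as the job description; this is the kind of multi-aspect issue the abstract refers to.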


Word2Vec; Job Title Classification; Doc2Vec
