AAAI Publications, Workshops at the Thirty-Second AAAI Conference on Artificial Intelligence

Detecting Personal Experience Tweets for Health Surveillance Using Unsupervised Feature Learning and Recurrent Neural Networks
Shichao Feng, Keyuan Jiang, Jiyun Li, Ricardo A. Calix, Matrika Gupta

Last modified: 2018-06-22

Abstract


Given its easy accessibility and prevalence, Twitter has been actively used as an alternative data source for health surveillance research, and personal health experiences play an important role in such surveillance activities. There is therefore a need for efficient and effective methods to identify Twitter posts related to personal health experiences. In this work, we present a method that combines word embeddings, a convolutional layer, and a Long Short-Term Memory (LSTM) recurrent neural network to detect personal health experience tweets. The word embedding and convolutional layers serve as a pre-processing step for unsupervised feature learning, which helps to eliminate the need for feature engineering. We studied three distributed word representation methods (word2vec, fastText, and WordRank) to represent the tweet texts in a vector space model. The word vectors were passed through a convolutional layer for further pre-processing and then fed to an LSTM-based Recurrent Neural Network (RNN) model for classification. Our results showed that our approach outperforms, by a significant margin, conventional classifiers that use human-engineered features. The RNN-based model showed a significant improvement in precision compared to the other methods (by 123%). This improvement helps to detect more true positive Personal Health Experience tweets.
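
The pipeline described in the abstract (word vectors, then a convolutional layer, then an LSTM, then a classifier) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation. All layer sizes, the toy vocabulary, the example sentence, and every function name below are hypothetical. The weights are random and untrained, bias terms are omitted for brevity, and in practice the embedding table would be loaded from a pretrained word2vec, fastText, or WordRank model rather than generated.

```python
import math
import random

random.seed(0)

# Hypothetical sizes, chosen only to keep the example small.
EMB_DIM, N_FILTERS, KERNEL, HIDDEN = 8, 4, 3, 5

def rand_matrix(rows, cols):
    """Small random weights standing in for trained parameters."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(W, x):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Stand-in embedding table (assumption: real vectors would be pretrained).
VOCAB = ["<unk>", "i", "took", "ibuprofen", "and", "my", "headache", "went", "away"]
EMBED = {w: [random.uniform(-0.1, 0.1) for _ in range(EMB_DIM)] for w in VOCAB}

def embed(tokens):
    """Map each token to its vector; unknown tokens share the <unk> vector."""
    return [EMBED.get(t, EMBED["<unk>"]) for t in tokens]

CONV_W = rand_matrix(N_FILTERS, KERNEL * EMB_DIM)

def conv1d_relu(seq):
    """Slide a window of KERNEL word vectors over the tweet, then apply ReLU."""
    out = []
    for i in range(len(seq) - KERNEL + 1):
        window = [v for vec in seq[i:i + KERNEL] for v in vec]
        out.append([max(0.0, z) for z in matvec(CONV_W, window)])
    return out

# One weight matrix per LSTM gate (input, forget, output, candidate),
# each acting on the concatenation of the current input and previous hidden state.
GATES = {g: rand_matrix(HIDDEN, N_FILTERS + HIDDEN) for g in "ifoc"}

def lstm_last_hidden(seq):
    """Run a single-layer LSTM over the sequence and return the final hidden state."""
    h = [0.0] * HIDDEN
    c = [0.0] * HIDDEN
    for x in seq:
        xh = x + h
        i = [sigmoid(z) for z in matvec(GATES["i"], xh)]
        f = [sigmoid(z) for z in matvec(GATES["f"], xh)]
        o = [sigmoid(z) for z in matvec(GATES["o"], xh)]
        g = [math.tanh(z) for z in matvec(GATES["c"], xh)]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h

OUT_W = [random.uniform(-0.1, 0.1) for _ in range(HIDDEN)]

def predict_proba(tokens):
    """Embedding -> convolution -> LSTM -> sigmoid score for one tokenized tweet."""
    feats = conv1d_relu(embed(tokens))
    h = lstm_last_hidden(feats)
    return sigmoid(sum(w * v for w, v in zip(OUT_W, h)))

if __name__ == "__main__":
    p = predict_proba("i took ibuprofen and my headache went away".split())
    print(f"P(personal health experience) = {p:.3f}")
```

Because the weights are untrained, the printed score carries no meaning; the sketch only shows how data flows between the three stages. A real model would learn the convolutional, LSTM, and output weights from labeled tweets.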

Keywords


Health surveillance; Personal health experience; Social media; Twitter; Word embeddings; Deep learning; Convolutional neural network; Long short-term memory neural network; Unsupervised feature learning
