Published:
2020-06-02
Proceedings:
Proceedings of the AAAI Conference on Artificial Intelligence, 34
Issue:
Vol. 34 No. 10: Issue 10: AAAI-20 Student Tracks
Track:
Student Abstract Track
Abstract:
In our research, we propose a new multimodal fusion architecture for the task of sentiment analysis. The three modalities used in this paper are text, audio, and video. Most current methods perform either feature-level or decision-level fusion. In contrast, we propose an attention-based deep neural network and a training approach that facilitate both feature-level and decision-level fusion. Our network effectively leverages information across all three modalities using a two-stage fusion process. We evaluate our network on individual utterance-based contextual features extracted from the CMU-MOSI dataset and compare its performance against the state of the art.
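To make the two-stage idea concrete, below is a minimal sketch of a feature-level plus decision-level attention fusion over pre-extracted utterance features for the three modalities. The feature dimensions, layer sizes, and the specific attention form are illustrative assumptions for this sketch, not the architecture or training procedure reported in the paper.

```python
# Minimal sketch of two-stage fusion (feature level, then decision level) over
# pre-extracted text/audio/video utterance features. All dimensions and layer
# choices here are assumptions for illustration, not the paper's exact model.
import torch
import torch.nn as nn


class TwoStageFusion(nn.Module):
    def __init__(self, text_dim=300, audio_dim=74, video_dim=35, hidden=128):
        super().__init__()
        # Per-modality encoders project each utterance feature into a shared space.
        self.enc_t = nn.Linear(text_dim, hidden)
        self.enc_a = nn.Linear(audio_dim, hidden)
        self.enc_v = nn.Linear(video_dim, hidden)
        # Stage 1: feature-level fusion -- attention weights over the three modalities.
        self.feat_attn = nn.Linear(hidden, 1)
        # Per-modality decision heads plus a head for the fused representation.
        self.head_t = nn.Linear(hidden, 1)
        self.head_a = nn.Linear(hidden, 1)
        self.head_v = nn.Linear(hidden, 1)
        self.head_fused = nn.Linear(hidden, 1)
        # Stage 2: decision-level fusion -- learned weights over the four decisions.
        self.dec_attn = nn.Parameter(torch.zeros(4))

    def forward(self, text, audio, video):
        # Encode each modality into a (batch, hidden) vector.
        h = torch.stack([torch.relu(self.enc_t(text)),
                         torch.relu(self.enc_a(audio)),
                         torch.relu(self.enc_v(video))], dim=1)  # (batch, 3, hidden)
        # Stage 1: attention over modalities yields a fused feature vector.
        w = torch.softmax(self.feat_attn(h), dim=1)              # (batch, 3, 1)
        fused = (w * h).sum(dim=1)                               # (batch, hidden)
        # Per-modality and fused sentiment decisions.
        decisions = torch.cat([self.head_t(h[:, 0]), self.head_a(h[:, 1]),
                               self.head_v(h[:, 2]), self.head_fused(fused)], dim=1)
        # Stage 2: softmax-weighted combination of the individual decisions.
        dw = torch.softmax(self.dec_attn, dim=0)                 # (4,)
        return (decisions * dw).sum(dim=1)                       # (batch,) sentiment score


# Usage on random utterance features (batch of 8, assumed feature sizes).
model = TwoStageFusion()
score = model(torch.randn(8, 300), torch.randn(8, 74), torch.randn(8, 35))
print(score.shape)  # torch.Size([8])
```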
DOI:
10.1609/aaai.v34i10.7173
ISSN 2374-3468 (Online) ISSN 2159-5399 (Print) ISBN 978-1-57735-835-0 (10 issue set)
Published by AAAI Press, Palo Alto, California, USA. Copyright © 2020, Association for the Advancement of Artificial Intelligence. All rights reserved.