Proceedings:
No. 1: Thirty-First AAAI Conference On Artificial Intelligence
Volume
Issue:
Proceedings of the AAAI Conference on Artificial Intelligence, 31
Track:
Doctoral Consortium
Abstract:
Image Understanding is fundamental to intelligent agents. Researchers have explored Caption Generation and Visual Question Answering as independent aspects of Image Understanding (Johnson et al. 2015; Xiong, Merity, and Socher 2016). Common to most of the successful approaches is the learning of end-to-end signal mappings (image to caption; image and question to answer). The accuracy of these approaches is impressive, but it is also important to explain a decision to end users, both to justify the results and to rectify them based on feedback. Very recently, there has been some focus (Hendricks et al. 2016; Liu et al.) on explaining some aspects of the learning systems. In my research, I look towards building explainable Image Understanding systems that can be used to generate captions and answer questions. Humans learn both from examples (learning) and by reading (knowledge). Inspired by this intuition, researchers have constructed Knowledge-Bases that encode (probabilistic) commonsense and background knowledge. In this work, we look towards efficiently using this probabilistic knowledge on top of machine learning capabilities to rectify noise in visual detections and to generate captions or answers to posed questions.
DOI:
10.1609/aaai.v31i1.10519