AAAI Publications, Sixteenth International Conference on Principles of Knowledge Representation and Reasoning

A Model-Based Approach to Visual Reasoning on CNLVR Dataset
Shailaja Sampat, Joohyung Lee

Last modified: 2018-09-24


Visual reasoning requires an understanding of complex compositional images and common-sense reasoning about sets of objects, quantities, comparisons, and spatial relationships. This paper presents a semantic parser that combines Computer Vision (CV), Natural Language Processing (NLP), and Knowledge Representation & Reasoning (KRR) to automatically solve visual reasoning problems from the Cornell Natural Language Visual Reasoning (CNLVR) dataset. Unlike the data-driven approaches applied to the same dataset, our system does not require any training; instead, it is guided by a manually constructed knowledge base. The system demonstrates robust overall performance and is efficient in both time and space. It achieves 87.3% accuracy, which is 17.6% higher than the state-of-the-art method on raw image representations.


visual reasoning; answer set programming; commonsense reasoning
