Building Multimodal Systems: Compromise between Theory and Practice

Marilyn Cross, Christian Matthiessen, Licheng Zeng, and Ichiro Kobayashi

The compromise between the theoretical ideal of a unifying representation and the pragmatic demands of an application is discussed in the context of building a system that helps knowledge workers process multimodal information sources in the domain of communicable diseases. The HINTS application helps information analysts retrieve relevant documents from multiple sources, extract information from those documents, and generate multimodal presentations in areas of interest. The foundational research question that guided the research and design of the prototype was whether semantics can be unified across different semiotic systems that are instantiated in different modalities, viz. the linguistic and the visual semiotic in this exploration. For an application domain in which language carries the wider range of meanings and the visual semiotic is complementary, the work explored the premise that a theoretical model of the semiotics of language might be used to unify the semantics across modalities. An analysis of the domain showed that the premise was viable. The subsequent design and implementation of the prototype highlighted the dialectic between meaning potential and instantiation, and showed how the change in balance from retrieval to extraction to generation needed to be managed computationally as well as theoretically.

This page is copyrighted by AAAI. All rights reserved.