In this paper we claim that a consistent conversational approach to human-computer interaction can feasibly be applied to multimodal interaction. We present a comprehensive conversational model that covers the interrelated levels of dialogue structure, namely its illocutionary, rhetorical, and topical aspects. The model thus provides the basis for a consistent interpretation of the linguistic as well as the graphical actions of both participants during the ongoing dialogue. It comprises a descriptive part that captures local dialogue tactics and the possible patterns of exchange. In our prototypical information system MERIT we have integrated an additional, prescriptive part that relates sequences of dialogue contributions to global information-seeking strategies and allows a dialogue to be guided by dialogue scripts.