Abstract. Multimodal interaction provides the user with multiple modes of interacting with a system, such as gestures, speech, text, video, or audio. A multimodal system thus allows for several distinct means of data input and output. In this paper, we present our work in the context of the I-SEARCH project, which aims to enable context-aware querying of a multimodal search framework, where queries can include real-world data such as user location or temperature. We introduce the concepts of MuSeBag for multimodal query interfaces, UIIFace for multimodal interaction handling, and CoFind for collaborative search as the core components behind the I-SEARCH multimodal user interface, which we evaluate via a user study.