Abstract: We describe an approach to natural 3D multimodal interaction in immersive environments. Our approach fuses symbolic and statistical information from a set of 3D gesture and speech agents, building in part on prior research on disambiguating the user's intent in 2D and 2.5D user interfaces. We present an experimental system architecture that embodies this approach, and provide examples from a preliminary 3D multimodal testbed to explore our ideas in augmented and virtual reality.
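As a rough illustration of the fusion described in the abstract, the following Python sketch combines n-best hypotheses from a speech agent and a 3D gesture agent by multiplying their statistical confidence scores and filtering joint interpretations through a symbolic type-compatibility table. The agent outputs, type labels, and compatibility rules are illustrative assumptions, not the system's actual interfaces; the mutual-disambiguation effect arises because a lower-ranked hypothesis in one modality can still win if it is the only one compatible with the other modality.

    # Illustrative sketch of symbolic + statistical fusion of speech and gesture
    # hypotheses. All names, scores, and rules are assumptions for illustration only.
    from dataclasses import dataclass
    from itertools import product

    @dataclass
    class Hypothesis:
        interpretation: str   # e.g. "move that there", "point at chair"
        semantic_type: str    # symbolic type used for compatibility checks
        score: float          # statistical confidence in [0, 1]

    # Symbolic constraint: which speech/gesture type pairs may unify.
    COMPATIBLE = {
        ("command", "deictic"),   # "move that" + pointing gesture
        ("query",   "deictic"),
        ("command", "path"),      # "go there" + swept path gesture
    }

    def fuse(speech_nbest, gesture_nbest):
        """Rank joint interpretations by the product of unimodal scores,
        keeping only symbolically compatible speech/gesture pairs."""
        joint = []
        for s, g in product(speech_nbest, gesture_nbest):
            if (s.semantic_type, g.semantic_type) in COMPATIBLE:
                joint.append((s.score * g.score, s, g))
        return sorted(joint, key=lambda t: t[0], reverse=True)

    if __name__ == "__main__":
        speech = [Hypothesis("move that there", "command", 0.6),
                  Hypothesis("prove that", "query", 0.4)]
        gesture = [Hypothesis("point at chair", "deictic", 0.7),
                   Hypothesis("circle region", "area", 0.3)]
        for score, s, g in fuse(speech, gesture):
            print(f"{score:.2f}  {s.interpretation!r} + {g.interpretation!r}")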
Accomplishments: This project accomplished ground-breaking work in the integration of multiple modalities and knowledge sources into 3D augmented reality environments. Working with Columbia University, we extended the basic 2D architecture developed at OGI by incorporating 3D gesture recognition and information from reasoning about a scene into a full 3D multimodal augmented reality system. The system uses a detailed 3D model of the Columbia University Graphics Laboratory, including the physical objects within the lab, such as tables, chairs, and walls. The user's body is tracked using either a Flock of Birds magnetic tracker or an Ascension IS900 wireless tracker. Extensions were made to the AAA multiagent architecture to enable it to support high-volume point-to-point communication. This work is detailed in the attached draft paper, which is being revised for publication.

Among the other accomplishments of this project were:

- Development of a three-level recognition architecture (Members-Teams-Committee), which supports statistical and symbolic fusion for multimodal systems. This recognition architecture was shown to offer superior error handling (reducing error rates by approximately 30%) over the existing symbolic-integration-only system, and was also deployed as a pen-based gesture recognizer for military symbols. A sketch of this style of combiner appears after this list.

- Integration of the gesture recognizer into the first tangible multimodal system (Rasa). Rasa enables a military user to continue to employ his highly trained work style (using paper maps and Post-it notes) while the user's multimodal input is digitized simultaneously. Warfighters often ignore the computers in their environment, preferring paper-based tools; Rasa provides the benefits of both paper and digital systems. The system has been tested with members of the USMC and the US Army National Guard, and was found to be preferred to paper alone and to be robust to power and computer failures.
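The Members-Teams-Committee architecture referenced above is described here only at a high level; the following Python sketch shows one plausible reading of it, in which member recognizers produce per-class posterior estimates, teams compute weighted combinations of their members, and a committee weights the team outputs. The member recognizers, team groupings, weights, and gesture labels are made-up placeholders, not the project's trained parameters.

    # A plausible reading of a three-level Members-Teams-Committee combiner.
    # Member recognizers, team groupings, and weights are illustrative assumptions.
    import numpy as np

    LABELS = ["circle", "arrow", "cross"]        # hypothetical gesture classes

    def member_a(features): return np.array([0.7, 0.2, 0.1])   # stand-ins for
    def member_b(features): return np.array([0.5, 0.4, 0.1])   # trained member
    def member_c(features): return np.array([0.2, 0.6, 0.2])   # recognizers

    TEAMS = [                                    # each team: (members, member weights)
        ([member_a, member_b], np.array([0.6, 0.4])),
        ([member_b, member_c], np.array([0.5, 0.5])),
    ]
    COMMITTEE_WEIGHTS = np.array([0.55, 0.45])   # one weight per team

    def classify(features):
        team_posteriors = []
        for members, w in TEAMS:
            scores = np.stack([m(features) for m in members])  # members x labels
            team_posteriors.append(w @ scores)                  # weighted member vote
        committee = COMMITTEE_WEIGHTS @ np.stack(team_posteriors)  # weighted team vote
        return LABELS[int(np.argmax(committee))], committee

    if __name__ == "__main__":
        label, posterior = classify(features=None)
        print(label, np.round(posterior, 3))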
Publications: Cohen ...