In collaboration with colleagues at UW, OGI, IBM, and SRI, we are developing technology to process spoken language from informal meetings. The work includes a substantial data collection and transcription effort, and has required a nontrivial degree of infrastructure development. We are undertaking this because the new task area provides a significant challenge to current HLT capabilities, while offering the promise of a wide range of potential applications. In this paper, we give our vision of the task, the challenges it represents, and the current state of our development, with particular attention to automatic transcription.

D: There's a-there are-there's a whole bunch of tools
J: Yes. / D: web page, where they have a listing.
D: like 10 of them or something.
J: Are you speaking about Mississippi State per se? or
D: No no no, there's some .. I mean, there just-there are-there are a lot of / J: Yeah.
J: Actually, I wanted to mention- / D: (??)
J: There are two projects, which are .. international .. huge projects focused on this kind of thing, actually .. one of them's MATE, one of them's EAGLES .. and um.
D: Oh, EAGLES.
D: (??) / J: And both of them have
J: You know, I shou-, I know you know about the big book.
E: Yeah.
J: I think you got it as a prize or something.
E: Yeah. / D: Mhm.
J: Got a surprise. {laugh} {J. thought "as a prize" sounded like "surprise"}

Note that interruptions are quite frequent; this is, in our experience, quite common in informal meetings, as is acoustic overlap
In early 2001 we reported (at the Human Language Technology meeting) the early stages of an ICSI project on processing speech from meetings (in collaboration with other sites, principally SRI, Columbia, and UW). In this paper we report our progress from the first few years of this effort, including: the collection and subsequent release of a 75-meeting corpus (over 70 meeting-hours and up to 16 channels for each meeting); the development of a prosodic database for a large subset of these meetings, and its subsequent use for punctuation and disfluency detection; the development of a dialog annotation scheme and its implementation for a large subset of the meetings; and the improvement of both near-mic and far-mic speech recognition results for meeting speech test sets.