We describe the setting and results of the ConvAI2 NeurIPS competition that aims to further the state-of-the-art in open-domain chatbots. Some key takeaways from the competition are: (i) pretrained Transformer variants are currently the best performing models on this task, (ii) but to improve performance on multi-turn conversations with humans, future systems must go beyond single word metrics like perplexity to measure the performance across sequences of utterances (conversations) in terms of repetition, consistency and balance of dialogue acts (e.g. how many questions asked vs. answered).
The Air Travel Information System (ATIS) domain serves as the common evaluation task for ARPA"spoken language system developers. 1 To support this task, the Multi-Site ATIS Data COllection Working group (MADCOW) coordinates data collection activities. This paper describes recent MADCOW activities. In particular, this paper describes the migration of the ATIS task to a richer relational database and development corpus (ATIS-3) and describes the ATIS-3 corpus. The expanded database, which includes information on 46 US and Canadian cities and 23,457 flights, was released in the fall of 1992, and data collection for the ATIS-3 corpus began shortly thereafter. The ATIS-3 corpus now consists of a total of 8297 released training utterances and 3211 utterances reserved for testing, collected at BBN, CMU, MIT, NIST and SRI. 2906 of the training utterances have been annotated with the correct information from the database. This paper describes the ATIS-3 corpus in detail, including breakdowns of data by type (e.g. context-independent, context-dependent, and unevaluable)and variations in the data collected at different sites. This paper also includes a description of the ATIS-3 database. Finally, we discuss future data collection and evaluation plans.
When auditory material segregates into "streams," is the unattended stream actually organized as an entity? An affirmative answer is suggested by the observation that the organizational structure of the unattended material interacts with the structure of material to which the subject is trying to attend. Specificially, a to-be-rejected stream can, because of its structure, capture from a to-be-judged stream elements that would otherwiise be acceptable members of the to-be-judged stream.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.