The DARPA Spoken Language System (SLS) community has long taken a leadership position in designing, implementing, and globally distributing significant speech corpora widely used for advancing speech recognition research. The Wall Street Journal (WSJ) CSR Corpus described here is the newest addition to this valuable set of resources. In contrast to previous corpora, the WSJ corpus will provide DARPA with its first general-purpose English, large-vocabulary, natural-language, high-perplexity corpus containing significant quantities of both speech data (400 hrs.) and text data (47M words), thereby providing a means to integrate speech recognition and natural language processing in application domains of high potential practical value. This paper presents the motivating goals, acoustic data design, text processing steps, lexicons, and testing paradigms incorporated into the multi-faceted WSJ CSR Corpus.
Algorithms based on modeling speech as a finite-state hidden Markov process have been very successful in recent years. This paper presents a generalization of these algorithms to certain denumerable-state hidden Markov processes. The resulting algorithm permits automatic training of the stochastic analog of an arbitrary context-free grammar. In particular, in contrast to many grammatical inference methods, the new algorithm allows the grammar to have an arbitrary degree of ambiguity. Since natural language is often syntactically ambiguous, the grammatical inference algorithm must allow for this ambiguity. Furthermore, allowing ambiguity in the grammar permits errors in the recognition process to be modeled explicitly in the grammar rather than added as an extra component.
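To illustrate the kind of computation such training rests on, here is a minimal sketch of the inside algorithm for a probabilistic context-free grammar in Chomsky normal form: inside probabilities are the context-free analog of the HMM forward probabilities and form the core of Inside-Outside (EM) re-estimation of rule probabilities. The grammar, rule probabilities, and example sentence are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch: inside probabilities for a toy PCFG in Chomsky normal form.
# The grammar and probabilities below are hypothetical, chosen only for illustration.

from collections import defaultdict

binary_rules = {            # P(A -> B C); probabilities per left-hand side sum to 1
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("Det", "N")): 0.6,
    ("VP", ("V", "NP")): 1.0,
}
lexical_rules = {           # P(A -> w)
    ("NP", "she"): 0.4,
    ("Det", "the"): 1.0,
    ("N", "report"): 1.0,
    ("V", "reads"): 1.0,
}

def inside_probabilities(words):
    """Return a chart beta[(i, j)][A] = P(A derives words[i:j])."""
    n = len(words)
    beta = defaultdict(lambda: defaultdict(float))

    # Base case: spans of length 1 are covered by lexical rules.
    for i, w in enumerate(words):
        for (A, word), p in lexical_rules.items():
            if word == w:
                beta[(i, i + 1)][A] += p

    # Recursive case: combine two adjacent sub-spans with a binary rule,
    # summing over every split point and every rule that fits.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (A, (B, C)), p in binary_rules.items():
                    b = beta[(i, k)].get(B, 0.0)
                    c = beta[(k, j)].get(C, 0.0)
                    if b and c:
                        beta[(i, j)][A] += p * b * c
    return beta

sentence = ["she", "reads", "the", "report"]
chart = inside_probabilities(sentence)
# Total probability that S derives the whole sentence, summed over all parses.
print(chart[(0, len(sentence))].get("S", 0.0))
```

Because the chart sums, rather than maximizes, over split points and rules, an ambiguous grammar contributes the probability mass of all of its parses; this is the property that lets the training procedure accommodate syntactic ambiguity, as the abstract emphasizes.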
How the brain encodes the semantic concepts represented by words is a fundamental question in cognitive neuroscience. Hemodynamic neuroimaging studies have robustly shown that different areas of the posteroventral temporal lobe are selectively activated by images of animals versus man-made objects. Selective responses in these areas to words representing animals versus objects are sometimes also seen, but they are task-dependent, suggesting that posteroventral temporal cortex may encode visual categories, while more anterior areas encode semantic categories. Here, using the spatiotemporal resolution provided by intracranial macroelectrode and microelectrode arrays, we report category-selective responses to words representing animals and objects in human anteroventral temporal areas including the inferotemporal, perirhinal, and entorhinal cortices. This selectivity generalizes across tasks and sensory modalities, suggesting that it represents abstract lexico-semantic categories. Significant category-specific responses are found in measures sensitive to synaptic activity (local field potentials, high gamma power, current sources and sinks) and to unit firing (multi- and single-unit activity). Category-selective responses can occur at short latency, as early as 130 ms, in middle cortical layers, and thus are extracted in the first pass of activity through the anteroventral temporal lobe. This activation may provide input to posterior areas for iconic representations when required by the task, as well as to the hippocampal formation for categorical encoding and retrieval of memories, and to the amygdala for emotional associations. More generally, these results support models in which the anteroventral temporal lobe plays a primary role in the semantic representation of words.