This paper reports on a shared task involving the assignment of ICD-9-CM codes to radiology reports. Two features distinguished this task from previous shared tasks in the biomedical domain. One is that it resulted in the first freely distributable corpus of fully anonymized clinical text. This resource is permanently available and will (we hope) facilitate future research. The other key feature of the task is that it required categorization with respect to a large and commercially significant set of labels. The number of participants was larger than in any previous biomedical challenge task. We describe the data production process and the evaluation measures, and give a preliminary analysis of the results. Many systems performed at levels approaching the inter-coder agreement, suggesting that human-like performance on this task is within the reach of currently available technologies.
This paper presents the results of two experiments in which normal-hearing and hearing-impaired listeners used an adaptive procedure to select their preferred frequency response slope and two-channel compression ratios in twenty listening conditions. Whereas the preferred response slope depended mostly on the difference in SNR between frequency bands, the preferred output levels in the two channels depended strongly on the intensity level entering each band. In both cases, subjects preferred less gain in frequency bands where noise was more intrusive, and they preferred less gain for listening comfort than for speech understanding. The preferred response slope also depended on the slope of the audiogram. Relative to the prescribed NAL-RP response, the preferred gain variations improved the broadband SNR, and hence listening comfort, but not the estimated speech intelligibility index. Overall, the findings confirm the approach used in many commercial products of applying wide dynamic range compression in multiple bands, with additional gain reductions in bands where the noise is estimated to be dominant.
Levin's (1993) study of verb classes is a widely used resource for lexical semantics. In her framework, some verbs, such as give, exhibit no class ambiguity, whereas others, such as write, belong to several alternative classes. We extend Levin's inventory to a simple statistical model of verb class ambiguity. Using this model, we can generate class preferences for ambiguous verbs without a disambiguated corpus. We additionally show that these preferences are useful as priors for a verb sense disambiguator.