Abstract: A parser for medical free-text reports has been developed that is based on a chemistry- and physics-inspired "field theory" for word-word sentence-level dependencies. The transition from the linguistic world to the world of interacting particles with potential energies is guided by a psycholinguistic thought experiment concerning the amount of "work" required to bring a reference word into an anchored configuration of words. Calibration experiments involving four-grams and five-grams were conducted. Data from these experiments were used as a knowledge source for estimating field conditions for words in sentences sampled from a corpus of medical reports. The output of the parser is a dependency tree that represents the global minimum energy state of the system of words for a given sentence. The system was trained and tested on a corpus of radiology reports. Preliminary performance, as quantified by link recall and precision, is 84.9% and 89.9%, respectively.
Index Terms: Knowledge representation, natural language processing (NLP), structured medical reporting.
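The link recall and precision figures quoted above can be illustrated with a minimal sketch. It assumes each dependency link is a (head index, dependent index) pair with 0 standing for an artificial root; the function name `link_precision_recall` and the toy link sets are hypothetical and are not taken from the paper.

```python
def link_precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of (head, dependent) links.

    Assumption: a predicted link counts as correct only if the identical
    (head, dependent) pair appears in the gold-standard tree.
    """
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    # Guard against empty sets so the ratios are always defined.
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Toy example (hypothetical): four gold links, three predicted, two in common.
gold = {(0, 2), (2, 1), (2, 4), (4, 3)}
predicted = {(2, 1), (2, 4), (2, 3)}
precision, recall = link_precision_recall(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Precision penalizes links the parser proposes that are not in the reference tree, while recall penalizes reference links the parser misses, which is why the two reported figures can differ.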
A patient's electronic medical record contains a large amount of unstructured textual information. As patient records become increasingly dense owing to an aging population and the increased occurrence of chronic diseases, a tool is needed to help organize and navigate patient data in a way that makes the information easier for clinicians to understand and improves their efficiency. A system has been developed for physicians that summarizes clinical information from a patient record. This system provides a gestalt view of the patient's record by organizing information about each disease along four dimensions (axes): time (e.g., disease progression over time), space (e.g., tumor in the left frontal lobe), existence (e.g., certainty of existence of a finding), and causality (e.g., response to treatment). A display is generated from information provided by radiology reports and discharge summaries. Natural language processing is used to identify clinical abnormalities (problems, symptoms, findings) in these reports, along with their associated properties and relationships. This information is presented in an integrated format that organizes extracted findings into a problem list, depicts the information on a timeline grid, and provides direct access to relevant reports and images. The goal of this system is to improve the structure of clinical information and its presentation to the physician, thereby simplifying the information retrieval and knowledge discovery needed to bridge the gap between acquiring raw data and making an informed diagnosis.
Medical concepts in clinical reports are expressed with a high degree of variability. Normalizing medical concepts to standardized vocabularies is a common way of accounting for this variability. One of the challenges in medical concept normalization is comparing two concepts that are orthographically different but identical in meaning. In this work we describe a method for comparing medical phrases that uses the information found in syntactic dependencies. We collected a large corpus of radiology reports from our university medical center. A shallow semantic parser was used to identify anatomical phrases. We performed a series of transformations to convert each anatomical phrase into a normalized syntactic dependency representation. The new representation provides an easy, intuitive way of comparing phrases for the purpose of concept normalization.
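To make the idea of comparing phrases through a normalized dependency representation concrete, the following is a minimal sketch under strong assumptions. It treats each phrase as a set of (head, dependent) word pairs, lowercases them, and ignores word order; the helper names, the Jaccard overlap measure, and the example phrases are all hypothetical, and the transformations used in the work itself are not specified here.

```python
def normalize_links(links):
    """Lowercase and trim each (head, dependent) pair; return an order-free set.

    Assumption: the dependency links have already been produced by some parser.
    """
    return frozenset((h.strip().lower(), d.strip().lower()) for h, d in links)


def jaccard(a, b):
    """Overlap between two link sets (1.0 means identical sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical links for "Left upper lobe" and "upper lobe, left":
phrase_a = normalize_links([("lobe", "Left"), ("lobe", "upper")])
phrase_b = normalize_links([("lobe", "upper"), ("lobe", "left")])
# Hypothetical links for "right upper lobe":
phrase_c = normalize_links([("lobe", "right"), ("lobe", "upper")])

print(phrase_a == phrase_b)         # same link set despite different surface order
print(jaccard(phrase_a, phrase_c))  # partial overlap only
```

Because the comparison operates on sets of links rather than on surface strings, differences in word order and capitalization do not affect the result in this sketch.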