Abstract. The large amount of available unstructured text belonging to domains such as healthcare (e.g. medical records), justice (e.g. laws, declarations), and insurance (e.g. declarations) increases the effort required to analyze information in a decision-making process. Different projects and tools have proposed strategies to reduce this complexity by classifying, summarizing, or annotating the texts. In particular, text summarization strategies have proven very useful for providing a compact view of an original text. However, the available strategies for generating these summaries do not fit well in domains that must take into account the temporal dimension of the text (e.g. a recent piece of text in a medical record is more important than a previous one) and the profile of the person who requires the summary (e.g. the medical specialization). To cope with these limitations, this paper presents "GReAT", a model for automatic summary generation that relies on natural language processing and text mining techniques to extract the most relevant information from narrative texts and to discover new information by detecting related information. The GReAT model was implemented in software for validation in a health institution, where it has been shown to be very useful for displaying a preview of the information in medical health records and for discovering new facts and hypotheses within that information. Several tests of the implemented software were executed, covering functionality, usability, and performance. In addition, precision and recall measures were applied to the results obtained with the implemented tool, as well as to the loss of information incurred by providing a text shorter than the original.
Keywords: Topic Identification, User's Profile, Text Mining, Automatic Summary Generation Techniques, Semantic Relations

This work was supported by the project "Extraccion semi-automática de metadatos de fuentes de datos estructuradas: Una aproximacion basada en agentes y minería de datos" funded by Banco Santander S.A. and Pontificia Universidad Javeriana.

1. Introduction

During the last thirty years, information systems have stored huge amounts of information in different formats in domains such as healthcare (e.g. medical records), justice (e.g. sworn declarations), assurance (e.g. declarations) and insurance (e.g. research articles and reports). Much of this information is stored as narrative text, which hinders its use in decision-making processes. The process of discovering the knowledge contained in these texts, or of creating new hypotheses from them, requires a great deal of time and effort [14], [24] that most organizations cannot afford.

In general, this problem remains a constant challenge because organizations find it difficult to absorb and use the information they need. It should be noted that the limits on human reading speed make it impossible to capture the key information in a short time when there is a larg...