2019
DOI: 10.1101/19009118
Preprint

A Comprehensive Typing System for Information Extraction from Clinical Narratives

Abstract: We have developed ACROBAT (Annotation for Case Reports using Open Biomedical Annotation Terms), a typing system for detailed information extraction from clinical text. This resource supports detailed identification and categorization of entities, events, and relations within clinical text documents, including clinical case reports (CCRs) and the free-text components of electronic health records. Using ACROBAT and the text of 200 CCRs, we annotated a wide variety of real-world clinical disease presentations. The…

Cited by 13 publications (16 citation statements)
References 12 publications
“…The authors in [ 32 – 34 ] ensemble conditional random fields [ 35 ] with convolutional neural networks [ 36 ] or recurrent neural networks [ 37 ], requiring extensive human annotation effort at the training stage which is expensive and time-consuming. We thus collect datasets from multiple tasks including I2B2-2010 [ 38 ], CORD-NER [ 39 ] and MACCROBAT2018 [ 40 ] and jointly fine-tune a deep language model to encode the tokens from the social media data. One layer of the Feed Forward Network (FNN) [ 41 ] with softmax [ 42 ] takes the hidden representations of each token as input and outputs the category of this token.…”
Section: Methods (mentioning)
confidence: 99%
“…determining whether a relation exists between two recognized entities. We aggregate datasets from multiple tasks including Wiki80 [ 50 ], I2B2-2012 [ 51 ] and MACCROBAT2018 [ 40 ] to generate the positive instances, i.e. sentences containing two entities and a True relation between them.…”
Section: Methods (mentioning)
confidence: 99%
“…Without the loss of generality, we leverage BERT model to provide contextualized embeddings and learn a supervised named entity recognizer. To overcome the problem with the nonexistence of annotated tweets as training data, we collect the benchmark corpora and their annotations for multiple NER tasks, including I2B2-2010 [20], CORD-NER [78] and MACCROBAT-2018 [12]. Based on those external datasets, we jointly learn a recognition model to extract entities on the COVID-19 related tweets data.…”
Section: Constructing Dynamic Knowledge Graphs From Social Media Data (mentioning)
confidence: 99%
“…To overcome the above challenge, we convert the multi-class prediction task to a binary classification problem of only identifying the existence of a potential relationship between any entity pair in each tweet instance. We aggregate datasets from multiple tasks including Wiki80 [26], I2B2-2012 [73], and MAACROBAT-2018 [12] to create the positive training data (labeled as 'True'). In order to achieve balanced training, validation and test datasets, we apply negative sampling to create the same number of instances with the label 'False'.…”
Section: Constructing Dynamic Knowledge Graphs From Social Media Data (mentioning)
confidence: 99%
“…Case reports are a time-honored means of sharing observations and insights about novel patient cases [1], [2]. As of 2020, at least 160 case report journals were in existence, with over 90% having open access policies and almost half indexed by PubMed [3].…”
Section: Introduction (mentioning)
confidence: 99%