2014
DOI: 10.1136/amiajnl-2013-001837

Evaluating the impact of pre-annotation on annotation speed and potential bias: natural language processing gold standard development for clinical named entity recognition in clinical trial announcements

Abstract: Objective: To present a series of experiments: (1) to evaluate the impact of pre-annotation on the speed of manual annotation of clinical trial announcements; and (2) to test for potential bias if pre-annotation is utilized. Methods: To build the gold standard, 1400 clinical trial announcements from the clinicaltrials.gov website were randomly selected and double annotated for diagnoses, signs, symptoms, Unified Medical Language System (UMLS) Concept Unique Identifiers, and SNOMED CT codes. We used two dictionary-…

Citations: cited by 68 publications (53 citation statements)
References: 15 publications
“…Using our corpus in combination with another corpus means that the other corpus might not need to be as large as it would if used on its own. Also, our corpus can be used to train a system and pre-annotate a new corpus to speed up the manual annotation process [24]. An interesting direction for future work is to investigate domain adaptation methods to improve the machine learning algorithms so that they are less dependent on the underlying dataset to achieve cross-corpora performance closer to in-corpus training experiments.…”
Section: Discussion (mentioning)
confidence: 99%
“…In the clinical domain, Lingren et al [11] demonstrate that pre-annotation reduces the annotation time significantly for the NE annotation layer. As an additional result, they revealed that pre-annotation did not influence the Inter Annotator Agreement (IAA) or annotator performance.…”
Section: Related Work (mentioning)
confidence: 99%
“…Lingren et al [11], Loftsson et al [12], and Fort and Sagot [13] investigate the annotation process with pre-annotated corpora and they show that annotation time can be reduced. We reuse the idea of pre-annotation in the semi-automatic annotation process of the QPT (c. f. Section 3.2), resulting in higher annotation speed (c. f. Section 4).…”
Section: Related Work (mentioning)
confidence: 99%
“…For the task of medical named entity labeling, Lingren et al (2013) investigate the impact of automatic suggestions on annotation speed and potential biases using dictionary-based annotations. This technique results in 13.83% to 21.5% time saving and in an inter-annotator agreement (IAA) increase by several percentage points.…”
Section: Related Work (mentioning)
confidence: 99%