2013
DOI: 10.1136/amiajnl-2012-001110
Identifying medical terms in patient-authored text: a crowdsourcing-based approach

Abstract: Background and objective: As people increasingly engage in online health-seeking behavior and contribute to health-oriented websites, the volume of medical text authored by patients and other medical novices grows rapidly. However, we lack an effective method for automatically identifying medical terms in patient-authored text (PAT). We demonstrate that crowdsourcing PAT medical term identification tasks to non-experts is a viable method for creating large, accurately labeled PAT datasets; moreover, such dataset…

Authors

Journals

Cited by 66 publications (53 citation statements)
References 23 publications
“…MTurk has also been used for evaluating biomedical informatics research. For instance, Maclean and Heer [31] presented crowdsourcing patient-authored text medical word identification tasks to MTurk non-experts and achieved results that were comparable in quality to those achieved by medical experts. Therefore, we used MTurk for evaluating the similarity within our clusters.…”
Section: Methods
confidence: 99%
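The crowdsourced annotation setup quoted above relies on collecting several non-expert judgments per term and aggregating them. A minimal majority-vote aggregation, as an illustrative sketch (the function name and labels here are hypothetical, not the authors' implementation), could be:

```python
from collections import Counter

def majority_label(judgments):
    """Return the most common label among crowd judgments (ties broken arbitrarily)."""
    counts = Counter(judgments)
    return counts.most_common(1)[0][0]

# Hypothetical MTurk worker judgments for whether "edema" is a medical term
print(majority_label(["medical", "medical", "not-medical"]))  # → medical
```

In practice, aggregation schemes weight workers by reliability, but simple majority voting is often a strong baseline when each item receives enough judgments.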
“…MTurk has also been used for evaluating biomedical informatics research. For instance, Maclean and Heer [31] presented crowdsourcing patient-authored text medical word identification tasks to MTurk non-experts and achieved results that were comparable in quality to those achieved by medical experts. Therefore, we used MTurk for evaluating the similarity within our clusters.…”
Section: Methodsmentioning
confidence: 99%
“…Various medical entity extractors are available for the purpose, but only ADEPT [25] has been specifically trained on medical forums. The algorithm is based on Conditional Random Fields, and the authors have shown that it achieved an F1 score of 0.84 while all the other algorithms that were trained on non-medical forum domains, including MetaMap [1] which is popularly used for literature data, achieved F1 scores of below 0.5.…”
Section: Medical Entity Extraction
confidence: 99%
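The F1 comparison quoted above combines precision and recall into a single score. As a quick illustrative sketch (the example counts are made up for demonstration, not taken from the paper):

```python
def f1_score(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# e.g. 84 correctly extracted terms, 16 spurious, 16 missed
print(round(f1_score(84, 16, 16), 2))  # → 0.84
```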
“…Hence, we used a pre-trained machine learning algorithm that extracts only medical-related keywords from patient-authored text [26]. After removing general stop words, our initial keyword set consisted of 4,633 words, which were then used to further compute meaningful measures and to highlight such keywords in different views.…”
Section: Designing VisOHC
confidence: 99%
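The keyword-filtering step described above (removing general stop words before building the keyword set) can be sketched as follows; the stop-word list and terms here are illustrative placeholders, not the actual resources used:

```python
# Illustrative subset of general stop words; real lists are much larger
STOP_WORDS = {"the", "a", "of", "and", "in", "to"}

def filter_keywords(terms):
    """Drop general stop words, keeping unique lowercased keywords."""
    return sorted({t.lower() for t in terms} - STOP_WORDS)

print(filter_keywords(["the", "Edema", "of", "insulin", "edema"]))
# → ['edema', 'insulin']
```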