2019
DOI: 10.1186/s12911-019-0863-3

Building a tobacco user registry by extracting multiple smoking behaviors from clinical notes

Abstract: Background: Usage of structured fields in Electronic Health Records (EHRs) to ascertain smoking history is important but fails in capturing the nuances of smoking behaviors. Knowledge of smoking behaviors, such as pack year history and most recent cessation date, allows care providers to select the best care plan for patients at risk of smoking attributable diseases. Methods: We developed and evaluated a health informatics pipeline for identifying complete smoking history…

Cited by 16 publications (12 citation statements)
References 41 publications
“…Artificial intelligence and deep learning applications are making their way into clinical practice, but their efficacy is not yet proven in prospective trial settings. 16 The current study utilized more advanced deep learning compared to previous reports 18, 19, 20 in a retrospective proof-of-principle setting, where we successfully extracted a large amount of smoking data in a matter of days. Two language models were tested with good specificity (88%-98%), comparable to the previous results in English language.…”
Section: Discussion (mentioning)
confidence: 99%
“…24 The learned knowledge was then transferred to a training classifier, by randomly picking 5000 tobacco smoking-related sample phrases and sentences from the medical narrative archive of our hospital, using the Finnish word-stem ‘tupak’ equivalent to the English word-stem ‘smok’. 19 These sample phrases were manually labeled into three classes (never, former, or current smoker). ULMFiT- and BERT-based classification models were then trained on this data to produce smoking phrase classifiers.…”
Section: Methods (mentioning)
confidence: 99%
“…OSCG were subjected to biopsy to obtain diagnostic certainty. The data on tobacco consumption was made according the number of pack year smoked; a subject that smoked fifteen or more pack years in twenty years was considered a smoker (13). One alcohol unit a day (one drink) was considered as regular alcohol consumption (14).…”
Section: Methods (mentioning)
confidence: 99%