2022
DOI: 10.2196/29803

Identification of Prediabetes Discussions in Unstructured Clinical Documentation: Validation of a Natural Language Processing Algorithm

Abstract: Background: Prediabetes affects 1 in 3 US adults. Most are not receiving evidence-based interventions, so understanding how providers discuss prediabetes with patients will inform how to improve their care. Objective: This study aimed to develop a natural language processing (NLP) algorithm using machine learning techniques to identify discussions of prediabetes in narrative documentation. Methods: We developed…

Citations: cited by 7 publications (22 citation statements)
References: 45 publications
“…Table 2 shows the six basic publication attributes of the complete set of 12 studies included following the completion of the identification and screening processes described above (15-26). The years of publication in the included studies range from 1996 to 2022 with 10 studies published between 2017 and 2022 (15, 16, 18-24, 26).…”
Section: Basic Publication Attributes Of Included Studies
type: mentioning
confidence: 99%
“…The variety of current general practice problems identified concern appointment scheduling (15-17, 20, 21), teleconsultation (18), care management (19, 24), communication (22), healthcare recommender systems (23), user interaction with electronic medical records (25), and resource management through scheduling (26), with the most frequently occurring problem being appointment scheduling. Similarly, the data reportedly used in all studies differs both across all identified problems and within the same problem from different researchers, with sources largely consisting of proprietary data taken from a variety of domains, including actual general practice clinics (15, 17, 20-24), published clinical guidelines (19), electronic healthcare databases (16), and teleconsultation recordings (18), that differ considerably in their features. In looking at the level of involvement of GPs across all studies, it is not always clearly stated to what extent they participate in the actual research and only two studies clearly state involvement of GPs (18, 26).…”
Section: Journal Article
type: mentioning
confidence: 99%
“…Existing approaches for extracting measurements from clinical text are often based on manually developed heuristics or machine learning methods that learn from labeled data but do not leverage pretrained language representations. Rule-based approaches [4], while computationally efficient, require substantial manual effort to construct and can suffer performance degradation with shifts in linguistic structure of reports [5]. Other work has used machine learning approaches such as support vector machines and long short-term memory models to extract measurements from clinical notes, but these approaches have required large quantities of expert annotations due to absence of pretraining [6].…”
Section: Introduction
type: mentioning
confidence: 99%