2023
DOI: 10.1101/2023.07.07.23292391
Preprint

Assessing GPT-3.5 and GPT-4 in Generating International Classification of Diseases Billing Codes

Ali Soroush,
Benjamin S. Glicksberg,
Eyal Zimlichman
et al.

Abstract: Background: Large Language Models (LLMs) like GPT-3.5 and GPT-4 are increasingly entering the healthcare domain as a proposed means to assist with administrative tasks. To ensure safe and effective use with billing coding tasks, it is crucial to assess these models' ability to generate the correct International Classification of Diseases (ICD) codes from text descriptions. Objectives: We aimed to evaluate GPT-3.5 and GPT-4's capability to generate correct ICD billing codes, using the ICD-9-CM (2014) and ICD-10…

Cited by 6 publications (1 citation statement)
References: 26 publications
“…Performance of these models for specialized healthcare-related tasks like assigning ICD codes to clinical notes remains subpar. Saroush, et al (32) demonstrated that prompting GPT-3.5 and -4 via the ChatGPT interface by providing descriptions of the ICD-10 code predicted the correct ICD-10 codes only 10% (GPT-3.5) and 13% (GPT-4) of the time. Boyle, et al (28) observed similar results.…”
Section: Discussion (citation type: mentioning)
confidence: 99%