2024
DOI: 10.1038/s41746-024-01083-y
Evaluating large language models as agents in the clinic

Nikita Mehandru, Brenda Y. Miao, Eduardo Rodriguez Almaraz, et al.

Abstract: Recent developments in large language models (LLMs) have unlocked opportunities for healthcare, from information synthesis to clinical decision support. These LLMs are not just capable of modeling language, but can also act as intelligent “agents” that interact with stakeholders in open-ended conversations and even influence clinical decision-making. Rather than relying on benchmarks that measure a model’s ability to process clinical data or answer standardized test questions, LLM agents can be modeled in high…

Cited by 15 publications (4 citation statements)
References 24 publications

“…A significant focus has been on LLM applications for medical question answering and clinical reasoning, as well as agentic AI that assists clinicians with workflow tasks, such as summarising notes and medical documentation. [19][20][21] While VLMs using Gemini have been tested for interpreting OCT pathology, their reliability remains unproven. 22 In contrast, LVMs like…”
Section: History and Overview Of Generative Models (mentioning)
confidence: 99%
“…It is necessary to carry out more accurate assessment of LLMs’ functionality in real-world clinical scenarios. [4]…”
Section: Introduction (mentioning)
confidence: 99%
“…It is necessary to carry out more accurate assessment of LLMs' functionality in real-world clinical scenarios. [4] Recent studies highlight the escalating global diabetes burden, advocating for innovative management strategies like the application of large language models (LLMs), including GPT-4. These models offer promising advances in diabetes care by potentially enhancing guideline adherence and providing personalized, evidence-based treatment recommendations.…”
Section: Introduction (mentioning)
confidence: 99%
“…Furthermore, we evaluate the influence of temperature settings on the clinical reasoning capabilities of various LLMs when tasked with interpreting clinical data. This investigation serves to deepen our understanding of LLM functionality in healthcare environments [10][11][12].…”
Section: Introduction (mentioning)
confidence: 99%