An investigation study on the interpretation of ultrasonic medical reports using OpenAI's GPT‐3.5‐turbo model

Wang, Wen hui; Wang, Shi yu; Huang, Jia yan; Liu, Xiao di; Yang, Jie; Liao, Min; Lu, Qiang; Wu, Zhe

doi:10.1002/jcu.23590

Cited by 2 publications

(1 citation statement)

References 12 publications


“…While some studies have shown the positive impact of AI chatbots, like ChatGPT, on learning outcomes and as supplementary tools in various fields, the fundamental question remains regarding the depth of their understanding of language and context comparable to human cognition [48]. The potential of ChatGPT to improve work efficiency, correct responses, and facilitate the communication of complex scientific findings to a broader audience has been recognized, underscoring its utility in different domains [49,50]. The ongoing discussions within the AI research community regarding the language understanding capabilities of large-scale pre-trained models like ChatGPT reflect the need for further exploration and evaluation of these models across various disciplines to determine the extent of their comprehension and application in real-world scenarios.…”

Section: When AI Goes Awry: Understanding the Risks of Unpredictable ... (mentioning)

confidence: 99%

The Era of Artificial Intelligence Deception: Unraveling the Complexities of False Realities and Emerging Threats of Misinformation

Williamson, Prybutok, 2024, Information


This study delves into the dual nature of artificial intelligence (AI), illuminating its transformative potential that has the power to revolutionize various aspects of our lives. We examine critical issues such as AI hallucinations, misinformation, and unpredictable behavior, particularly in large language models (LLMs) and AI-powered chatbots. These technologies, while capable of manipulating human decisions and exploiting cognitive vulnerabilities, also hold the key to unlocking unprecedented opportunities for innovation and progress. Our research underscores the need for robust, ethical AI development and deployment frameworks, advocating a balance between technological advancement and societal values. We emphasize the importance of collaboration among researchers, developers, policymakers, and end users to steer AI development toward maximizing benefits while minimizing potential harms. This study highlights the critical role of responsible AI practices, including regular training, engagement, and the sharing of experiences among AI users, to mitigate risks and develop best practices. We call for updated legal and regulatory frameworks to keep pace with AI advancements and ensure their alignment with ethical principles and societal values. By fostering open dialog, sharing knowledge, and prioritizing ethical considerations, we can harness AI’s transformative potential to drive human advancement while managing its inherent risks and challenges.


Generative Large Language Models in Electronic Health Records for Patient Care Since 2023: A Systematic Review

Du, Wang, Zhou et al., 2024, Preprint


Background: Generative large language models (LLMs) represent a significant advancement in natural language processing, achieving state-of-the-art performance across various tasks. However, their application in clinical settings using real electronic health records (EHRs) is still rare and presents numerous challenges.

Objective: This study aims to systematically review the use of generative LLMs in patient care-related topics involving EHRs, summarize the challenges faced, and suggest future directions.

Methods: A Boolean search for peer-reviewed articles was conducted in May 2024 using PubMed and Web of Science to include research articles published since 2023, which was one month after the release of ChatGPT. The search results were deduplicated. Multiple reviewers, including biomedical informaticians, computer scientists, and a physician, screened the publications for eligibility and extracted bibliometric and clinically relevant information. Only papers utilizing generative LLMs to analyze real EHR data were included. We summarized the use of prompt engineering, fine-tuning, multimodal EHR data, and evaluation metrics. Additionally, we identified current challenges in applying LLMs in clinical settings as reported by the included papers and proposed future directions.

Results: The initial search identified 6,328 unique studies, with 76 studies included after eligibility screening. Of these, 67 studies (88.2%) employed zero-shot prompting, five of which reported 100% accuracy on five specific clinical tasks. Nine studies used advanced prompting strategies; four tested these strategies experimentally, finding that prompt engineering improved performance, with one study noting a non-linear relationship between the number of examples in a prompt and performance improvement. Eight studies explored fine-tuning generative LLMs; all reported performance improvements on specific tasks, but three noted potential performance degradation after fine-tuning on certain tasks. Only two studies utilized multimodal data, which improved LLM-based decision-making and enabled accurate rare disease diagnosis and prognosis. The studies employed 55 different evaluation metrics for 22 purposes, such as correctness, completeness, and conciseness. Two studies investigated LLM bias, with one detecting no bias and the other finding that male patients received more appropriate clinical decision-making suggestions. Six studies identified hallucinations, such as fabricating patient names in structured thyroid ultrasound reports. Additional challenges included, but were not limited to, the impersonal tone of LLM consultations, which made patients uncomfortable, and the difficulty patients had in understanding LLM responses.

Conclusion: Our review indicates that few studies have employed advanced computational techniques to enhance LLM performance. The diverse evaluation metrics used highlight the need for standardization. LLMs currently cannot replace physicians due to challenges such as bias, hallucinations, and impersonal responses.
