A dataset of simulated patient-physician medical interviews with a focus on respiratory cases

Fareez, Faiha; Parikh, Tishya; Wavell, Christopher; Shahab, Saba; Chevalier, Meghan; Good, Scott; Blasi, Isabella De; Rhouma, Rafik; McMahon, Christopher; Lam, Jean–Paul; Lo, Thomas; Smith, Chris Selby

doi:10.1038/s41597-022-01423-1

Cited by 10 publications

(2 citation statements)

References 22 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Concerning transcripts of simulated exams, Mistica (2008) explored the use of discourse analytics features in the automated scoring of two selected cases. Fareez (2022) published a collection of simulated patient encounter transcripts for respiratory cases to facilitate ASR methods specific to this domain.…”

Section: Prior Work – Simulation Assessmentmentioning

confidence: 99%

Zero-Shot Multimodal Question Answering for Assessment of Medical Student OSCE Physical Exam Videos

Holcomb,

Kang,

Shakur

et al. 2024

Preprint

View full text Add to dashboard Cite

The Objective Structured Clinical Examination (OSCE) is a critical component of medical education whereby the data gathering, clinical reasoning, physical examination, diagnostic and planning capabilities of medical students are assessed in a simulated outpatient clinical setting with standardized patient actors (SPs) playing the role of patients with a predetermined diagnosis, or case. This study is the first to explore the zero-shot automation of physical exam grading in OSCEs by applying multimodal question answering techniques to the analysis of audiovisual recordings of simulated medical student encounters. Employing a combination of large multimodal models (LLaVA-1.6 7B,13B,34B, GPT-4V, and GPT-4o), automatic speech recognition (Whisper v3), and large language models (LLMs), we assess the feasibility of applying these component systems to the domain of student evaluation without any retraining. Our approach converts video content into textual representations, encompassing the transcripts of the audio component and structured descriptions of selected video frames generated by the multimodal model. These representations, referred to as “exam stories,” are then used as context for an abstractive question-answering problem via an LLM. A collection of 191 audiovisual recordings of medical student encounters with an SP for a single OSCE case was used as a test bed for exploring relevant features of successful exams. During this case, the students should have performed three physical exams: 1) mouth exam, 2) ear exam, and 3) nose exam. These examinations were each scored by two trained, non-faculty standardized patient evaluators (SPE) using the audiovisual recordings—an experienced, non-faculty SPE adjudicated disagreements. The percentage agreement between the described methods and the SPEs’ determination of exam occurrence as measured by percentage agreement varied from 26% to 83%. The audio-only methods, which relied exclusively on the transcript for exam recognition, performed uniformly higher by this measure compared to both the image-only methods and the combined methods across differing model sizes. The outperformance of the transcript-only model was strongly linked to the presence of key phrases where the student-physician would “signpost” the progression of the physical exam for the standardized patient, either alerting when they were about to begin an examination or giving the patient instructions. Multimodal models offer tremendous opportunity for improving the workflow of the physical examinations’ evaluation, for example by saving time and guiding focus for better assessment. While these models offer the promise of unlocking audiovisual data for downstream analysis with natural language processing methods, our findings reveal a gap between the off-the-shelf AI capabilities of many available models and the nuanced requirements of clinical practice, highlighting a need for further development and enhanced evaluation protocols in this area. We are actively pursuing a variety of approaches to realize this vision.

show abstract

Section: Prior Work – Simulation Assessmentmentioning

confidence: 99%

Zero-Shot Multimodal Question Answering for Assessment of Medical Student OSCE Physical Exam Videos

Holcomb,

Kang,

Shakur

et al. 2024

Preprint

View full text Add to dashboard Cite

show abstract

“…b) can the translation error pose a clinical harm? (Fareez et al, 2022). We identify thresholds for QE scores on this development set based on the ROC curves for adequacy and risk prediction.…”

Section: Treatment Conditions Detailsmentioning

confidence: 99%

Physician Detection of Clinical Harm in Machine Translation: Quality Estimation Aids in Reliance and Backtranslation Identifies Critical Errors

Mehandru,

Agrawal,

Xiao

et al. 2023

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

View full text Add to dashboard Cite

A major challenge in the practical use of Machine Translation (MT) is that users lack guidance to make informed decisions about when to rely on outputs. Progress in quality estimation research provides techniques to automatically assess MT quality, but these techniques have primarily been evaluated in vitro by comparison against human judgments outside of a specific context of use. This paper evaluates quality estimation feedback in vivo with a human study simulating decision-making in high-stakes medical settings. Using Emergency Department discharge instructions, we study how interventions based on quality estimation versus backtranslation assist physicians in deciding whether to show MT outputs to a patient. We find that quality estimation improves appropriate reliance on MT, but backtranslation helps physicians detect more clinically harmful errors that QE alone often misses.

show abstract

AI-Based Medical Scribe to Support Clinical Consultations: A Proposed System Architecture

Montenegro,

Gomes,

Machado

2023

Lecture Notes in Computer Science

View full text Add to dashboard Cite

A dataset of simulated patient-physician medical interviews with a focus on respiratory cases

Cited by 10 publications

References 22 publications

Zero-Shot Multimodal Question Answering for Assessment of Medical Student OSCE Physical Exam Videos

Zero-Shot Multimodal Question Answering for Assessment of Medical Student OSCE Physical Exam Videos

Physician Detection of Clinical Harm in Machine Translation: Quality Estimation Aids in Reliance and Backtranslation Identifies Critical Errors

AI-Based Medical Scribe to Support Clinical Consultations: A Proposed System Architecture

Contact Info

Product

Resources

About