Batman and Robin in Healthcare Knowledge Work: Human-AI Collaboration by Clinical Documentation Integrity Specialists

Bossen, Claus; Pine, Kathleen H.

doi:10.1145/3569892

Cited by 19 publications

(8 citation statements)

References 36 publications

“…[60] However, the collaboration between HCP and AI is key to success in improving the accuracy, consistency, and completeness of medical documentation while minimizing documentation errors. [51,61] It is also important to develop operationalization and implementation plans with accountable, fair, and inclusive AI approaches to ensure the trustworthiness of the digital scribes. [62,63]…”

Section: Discussion (mentioning)

confidence: 99%

Development and Evaluation of a Digital Scribe: Conversation Summarization Pipeline for Emergency Department Counseling Sessions towards Reducing Documentation Burden

Sezgin, Sirrianni, Kranz. 2023. Preprint.

Objective: We present a proof-of-concept digital scribe system as an ED clinical conversation summarization pipeline and report its performance.

Materials and Methods: We use four pre-trained large language models to establish the digital scribe system: T5-small, T5-base, PEGASUS-PubMed, and BART-Large-CNN via zero-shot and fine-tuning approaches. Our dataset includes 100 referral conversations among ED clinicians and medical records. We report the ROUGE-1, ROUGE-2, and ROUGE-L to compare model performance. In addition, we annotated transcriptions to assess the quality of generated summaries.

Results: The fine-tuned BART-Large-CNN model demonstrates greater performance in summarization tasks with the highest ROUGE scores (F1 ROUGE-1 = 0.49, F1 ROUGE-2 = 0.23, F1 ROUGE-L = 0.35). In contrast, PEGASUS-PubMed lags notably (F1 ROUGE-1 = 0.28, F1 ROUGE-2 = 0.11, F1 ROUGE-L = 0.22). BART-Large-CNN’s performance decreases by more than 50% with the zero-shot approach. Annotations show that BART-Large-CNN achieves 71.4% recall in identifying key information and a 67.7% accuracy rate.

Discussion: The BART-Large-CNN model demonstrates a high level of understanding of clinical dialogue structure, indicated by its performance with and without fine-tuning. Despite some instances of high recall, there is variability in the model’s performance, particularly in achieving consistent correctness, suggesting room for refinement. The model’s recall ability varies across different information categories.

Conclusion: The study provides evidence towards the potential of AI-assisted tools in reducing clinical documentation burden. Future work is suggested on expanding the research scope with larger language models, and comparative analysis to measure documentation efforts and time.

“…For example, summarizing prior patient reports requires relatively lower expertise - a medical student level task - where having a 'good enough' summary could be still more useful than no summary. As argued elsewhere [19,62,119], this suggests a focus in AI development on simpler, more standardized use cases as a potentially lower risk and more responsible approach when starting to introduce AI innovations into clinical practice.…”

Section: Implications for Designing AI in Healthcare (mentioning)

confidence: 91%

Multimodal Healthcare AI: Identifying and Designing Clinically Relevant Vision-Language Applications for Radiology

Yildirim, Richardson, Wetscherek, et al. 2024. Proceedings of the CHI Conference on Human Factors in Computing Systems.

“…To summarize, we used: Starting our project, one of our goals was to identify low hanging fruit - situations where simple AI interventions could improve clinical work. Based on prior research highlighting the value 'imperfect AI' can bring [12,73] as well as our own work, we focused on AI model performance to sensitize our team to situations where moderate model performance can still bring enough value. Additionally, we repeatedly probed team members to think of simpler versions of concepts.…”

Section: Moving from Ideation to Prototyping (mentioning)

confidence: 99%

Sketching AI Concepts with Capabilities and Examples: AI Innovation in the Intensive Care Unit

Yildirim, Zlotnikov, Sayar, et al. 2024. Proceedings of the CHI Conference on Human Factors in Computing Systems.
