2024
DOI: 10.2196/54704

A Preliminary Checklist (METRICS) to Standardize the Design and Reporting of Studies on Generative Artificial Intelligence–Based Models in Health Care Education and Practice: Development Study Involving a Literature Review

Malik Sallam,
Muna Barakat,
Mohammed Sallam

Abstract: Background: Adherence to evidence-based practice is indispensable in health care. Recently, the utility of generative artificial intelligence (AI) models in health care has been evaluated extensively. However, the lack of consensus guidelines on the design and reporting of findings of these studies poses a challenge for the interpretation and synthesis of evidence. Objective: This study aimed to develop a preliminary checklist to standardize the reporting…

Cited by 16 publications (4 citation statements)
References 100 publications
“…The study utilized the METRICS checklist for the design and reporting of generative AI studies in healthcare [32]. The basis of the study was a randomly selected 40 Virology MCQs, used for testing of medical students during the period 2017-2022.…”
Section: Methods (mentioning)
confidence: 99%
“…Recent studies investigated the capability of different AI models to pass exams in different domains, with a wide variability in performance as reviewed recently by Newton and Xiromeriti [31]. This variability can be attributed to different factors, such as the AI model used, the prompting approach, and importantly the language(s) used in prompting [31,32]. Such findings highlight the necessity for continued research to elucidate the determinants of AI models' performance, thereby informing the refinement of AI algorithms for improved performance and subsequent improved utility in various disciplines such as healthcare education [7,33,34].…”
Section: Introduction (mentioning)
confidence: 99%
“…However, the integration of AI into clinical practice is not without challenges. One of the primary concerns is the need for standardized reporting algorithms for AI-generated information [9].…”
Section: Breast Cancer Detection and Diagnosis (mentioning)
confidence: 99%
“…Timing/Transparency, Range/Randomization, Individual Factors, Count, Specificity of the prompts/language) checklist for standardization of design and reporting AI-based studies in healthcare [14]. As the study did not involve the direct participation of human subjects and was primarily focused on interactions with conversational AI systems, formal ethical approval was not sought or required.…”
Section: Introduction (mentioning)
confidence: 99%