Objectives: To compare breadth of condition coverage, accuracy of suggested conditions and appropriateness of urgency advice of eight popular symptom assessment apps. Design: Vignettes study. Setting: 200 primary care vignettes. Intervention/comparator: For eight apps and seven general practitioners (GPs): breadth of coverage and condition-suggestion and urgency advice accuracy measured against the vignettes' gold standard. Primary outcome measures: (1) Proportion of conditions 'covered' by an app, that is, not excluded because the user was too young/old or pregnant, or not modelled; (2) proportion of vignettes with the correct primary diagnosis among the top 3 conditions suggested; (3) proportion of 'safe' urgency advice (ie, at gold standard level, more conservative, or no more than one level less conservative). Results: Condition-suggestion coverage was highly variable, with some apps not offering a suggestion for many users: in alphabetical order, Ada: 99.0%; Babylon: 51.5%; Buoy: 88.5%; K Health: 74.5%; Mediktor: 80.5%; Symptomate: 61.5%; WebMD: 93.0%; Your.MD: 64.5%. Top-3 suggestion accuracy was GPs (average): 82.1%±5.2%; Ada: 70.5%; Babylon: 32.0%; Buoy: 43.0%; K Health: 36.0%; Mediktor: 36.0%; Symptomate: 27.5%; WebMD: 35.5%; Your.MD: 23.5%. Some apps excluded certain user demographics or conditions, and their performance was generally greater when the corresponding vignettes were excluded. For safe urgency advice, tested GPs averaged 97.0%±2.5%. For the vignettes with advice provided, only three apps had safety performance within 1 SD of the GPs: Ada: 97.0%; Babylon: 95.1%; Symptomate: 97.8%. One app had safety performance within 2 SDs of GPs: Your.MD: 92.6%. Three apps had safety performance outside 2 SDs of GPs: Buoy: 80.0% (p<0.001); K Health: 81.3% (p<0.001); Mediktor: 87.3% (p=1.3×10⁻³). Conclusions: The utility of digital symptom assessment apps relies on coverage, accuracy and safety. While no digital tool outperformed GPs, some came close, and the nature of iterative improvements to software offers scalable improvements to care.
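The three primary outcome measures above are simple proportions over the vignette set. The following Python sketch is not the study's analysis code; the field names, the four-level urgency scale, and the example record are assumptions made purely to illustrate how coverage, top-3 suggestion accuracy and 'safe' urgency advice could be computed.

# Minimal sketch, assuming hypothetical per-vignette records; not the study's pipeline.
URGENCY_LEVELS = ["self_care", "non_urgent", "urgent", "emergency"]  # assumed ordering, least to most urgent

def coverage(results):
    # Proportion of vignettes for which the app returned any condition suggestions.
    return sum(r["suggestions"] is not None for r in results) / len(results)

def top3_accuracy(results):
    # Proportion of covered vignettes whose gold-standard diagnosis is among the top 3 suggestions.
    covered = [r for r in results if r["suggestions"] is not None]
    return sum(r["gold_diagnosis"] in r["suggestions"][:3] for r in covered) / len(covered)

def safe_urgency(results):
    # Safe = advice at the gold-standard level, more conservative, or at most one level less conservative.
    advised = [r for r in results if r["advice"] is not None]
    safe = 0
    for r in advised:
        gold = URGENCY_LEVELS.index(r["gold_urgency"])
        app = URGENCY_LEVELS.index(r["advice"])
        if app >= gold - 1:
            safe += 1
    return safe / len(advised)

# Hypothetical single-vignette example: correct diagnosis in the top 3, advice one level below gold standard.
vignettes = [{"suggestions": ["pneumonia", "asthma", "GERD"],
              "gold_diagnosis": "pneumonia",
              "advice": "urgent",
              "gold_urgency": "emergency"}]
print(coverage(vignettes), top3_accuracy(vignettes), safe_urgency(vignettes))  # 1.0 1.0 1.0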
Background: Rare disease diagnosis is often delayed by years. A primary factor for this delay is a lack of knowledge and awareness regarding rare diseases. Probabilistic diagnostic decision support systems (DDSSs) have the potential to accelerate rare disease diagnosis by suggesting differential diagnoses for physicians based on case input and incorporated medical knowledge. We examine the DDSS prototype Ada DX and assess its potential to provide accurate rare disease suggestions early in the course of rare disease cases. Results: Ada DX suggested the correct disease earlier than the time of clinical diagnosis among the top five fit disease suggestions in 53.8% of cases (50 of 93), and as the top fit disease suggestion in 37.6% of cases (35 of 93). The median advantage of correct disease suggestions compared to the time of clinical diagnosis was 3 months or 50% for top five fit and 1 month or 21% for top fit. The correct diagnosis was suggested at the first documented patient visit in 33.3% of cases (top five fit) and 16.1% of cases (top fit). A Wilcoxon signed-rank test showed a significant difference between the time to clinical diagnosis and the time to correct disease suggestion for both top five fit and top fit (z-scores −6.68 and −5.71, respectively; α=0.05, p<0.001). Conclusion: Ada DX provided accurate rare disease suggestions in most rare disease cases. In many cases, Ada DX provided correct rare disease suggestions early in the course of the disease, sometimes at the very beginning of a patient journey. These results indicate that Ada DX has the potential to suggest rare diseases to physicians early in the course of a case. Limitations of this study derive from its retrospective and unblinded design, data input by a single user, and the optimization of the knowledge base during the course of the study. Results pertaining to the system's accuracy should be interpreted cautiously. Whether the use of Ada DX reduces the time to diagnosis in rare diseases in a clinical setting should be validated in prospective studies.
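The paired comparison reported above (time to clinical diagnosis versus time to first correct suggestion for the same cases) can be outlined with SciPy's Wilcoxon signed-rank test. The durations below are made up solely to illustrate the method and are not the study's data; note that SciPy's default statistic is the signed-rank sum, whereas the abstract reports z-scores.

# Illustrative sketch with hypothetical paired durations (months per case).
from scipy.stats import wilcoxon

months_to_clinical_diagnosis = [14, 8, 25, 6, 40, 12, 9, 30]   # hypothetical
months_to_correct_suggestion = [10, 8, 20, 3, 28, 12, 7, 22]   # hypothetical, paired per case

statistic, p_value = wilcoxon(months_to_clinical_diagnosis, months_to_correct_suggestion)
print(f"Wilcoxon signed-rank statistic = {statistic}, p = {p_value:.3f}")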
Background Continuously growing medical knowledge and the increasing amount of data make it difficult for medical professionals to keep track of all new information and to place it in the context of existing information. A variety of digital technologies and artificial intelligence–based methods are currently available as persuasive tools to empower physicians in clinical decision-making and improve health care quality. A novel diagnostic decision support system (DDSS) prototype developed by Ada Health GmbH with a focus on traceability, transparency, and usability will be examined more closely in this study. Objective The aim of this study is to test the feasibility and functionality of a novel DDSS prototype, exploring its potential and performance in identifying the underlying cause of acute dyspnea in patients at the University Hospital Basel. Methods A prospective, observational feasibility study was conducted at the emergency department (ED) and internal medicine ward of the University Hospital Basel, Switzerland. A convenience sample of 20 adult patients admitted to the ED with dyspnea as the chief complaint and a high probability of inpatient admission was selected. A study physician followed the patients admitted to the ED throughout the hospitalization without interfering with the routine clinical work. Routinely collected health-related personal data from these patients were entered into the DDSS prototype. The DDSS prototype’s resulting disease probability list was compared with the gold-standard main diagnosis provided by the treating physician. Results The DDSS presented information with high clarity and had a user-friendly, novel, and transparent interface. The DDSS prototype was not perfectly suited for the ED as case entry was time-consuming (1.5-2 hours per case). It provided accurate decision support in the clinical inpatient setting (average of cases in which the correct diagnosis was the first diagnosis listed: 6/20, 30%, SD 2.10%; average of cases in which the correct diagnosis was listed as one of the top 3: 11/20, 55%, SD 2.39%; average of cases in which the correct diagnosis was listed as one of the top 5: 14/20, 70%, SD 2.26%) in patients with dyspnea as the main presenting complaint. Conclusions The study of the feasibility and functionality of the tool was successful, with some limitations. Used in the right place, the DDSS has the potential to support physicians in their decision-making process by showing new pathways and unintentionally ignored diagnoses. The DDSS prototype had some limitations regarding the process of data input, diagnostic accuracy, and completeness of the integrated medical knowledge. The results of this study provide a basis for the tool’s further development. In addition, future studies should be conducted with the aim to overcome the current limitations of the tool and study design. Trial Registration ClinicalTrials.gov NCT04827342; https://clinicaltrials.gov/ct2/show/NCT04827342