Background: One key aspect of a learning health system (LHS) is utilizing data generated during care delivery to inform clinical care. However, institutional guidelines that utilize observational data are rare and require months to create, making current processes impractical for more urgent scenarios such as those posed by the COVID-19 pandemic. There is a need to rapidly analyze institutional data to drive guideline creation where evidence from randomized controlled trials is unavailable.
Objectives: This article provides background on the current state of observational data use in institutional guideline creation and details our institution's experience in creating a novel workflow, in order to (1) demonstrate the value of such a workflow, (2) present a real-world example, and (3) discuss difficulties encountered and future directions.
Methods: Using a multidisciplinary team of database specialists, clinicians, and informaticists, we created a workflow for identifying a clinical need, translating it into a queryable format in our clinical data warehouse, creating data summaries, and feeding this information back into clinical guideline creation.
Results: Clinical questions posed by the hospital medicine division were answered in a rapid time frame and informed the creation of institutional guidelines for the care of patients with COVID-19. Setting up the workflow, answering the questions, and producing data summaries required around 300 hours of effort and US$300,000.
Conclusion: A key component of an LHS is the ability to learn from data generated during care delivery. Such examples are rare in the literature; we demonstrate one and propose how an ideal multidisciplinary team can be formed and deployed.
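As a concrete illustration of the Methods step described above (translating a clinical question into a queryable form against a clinical data warehouse and returning a data summary), the following is a minimal sketch using an in-memory SQLite table. The table and column names (covid_encounters, admit_spo2, received_treatment, icu_transfer) and the clinical question are illustrative assumptions, not the institution's actual schema, question, or code.

```python
import sqlite3
import pandas as pd

conn = sqlite3.connect(":memory:")

# Stand-in for a clinical data warehouse table of COVID-19 admissions
# (synthetic rows; hypothetical column names).
pd.DataFrame(
    {
        "patient_id": [1, 2, 3, 4, 5, 6],
        "admit_spo2": [91, 88, 95, 84, 93, 89],    # oxygen saturation on admission
        "received_treatment": [1, 0, 1, 1, 0, 1],  # hypothetical therapy of interest
        "icu_transfer": [0, 1, 0, 1, 0, 0],
    }
).to_sql("covid_encounters", conn, index=False)

# A clinical question ("How often do hypoxic patients escalate to the ICU,
# with and without the treatment of interest?") translated into a query.
query = """
    SELECT received_treatment,
           COUNT(*)                AS n_patients,
           AVG(icu_transfer) * 100 AS pct_icu_transfer
    FROM covid_encounters
    WHERE admit_spo2 < 94
    GROUP BY received_treatment
"""
summary = pd.read_sql(query, conn)  # data summary fed back to the guideline team
print(summary)
```

In practice the same pattern would run against the warehouse's SQL engine, with the resulting summary handed to the clinicians drafting the guideline.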
Objectives: To analyze current practices in shared decision-making (SDM) in primary care and perform a needs assessment for the role of information technology (IT) interventions.
Materials and Methods: A mixed-methods study was conducted in three phases: (1) ethnographic observation of clinical encounters, (2) patient interviews, and (3) physician interviews. SDM was measured using the validated OPTION scale. Semistructured interviews followed an interview guide, developed by our multidisciplinary team and informed by the Traditional Decision Conflict Scale and the Shared Decision Making Questionnaire. Field notes were independently coded and analyzed by two reviewers in Dedoose.
Results: Twenty-four patient encounters were observed in three diverse practices, with an average OPTION score of 57.2 (0–100 scale; 95% confidence interval [CI], 51.8–62.6). Twenty-two patient interviews and eight physician interviews were conducted until thematic saturation was achieved. Cohen's kappa for coder agreement was 0.42. Patient domains were establishing trust, influence of others, flexibility, frustrations, values, and preferences. Physician domains included frustrations, technology (concerns, existing use, and desires), and decision making (current methods used, challenges, and patients' understanding).
Discussion: Given the low level of SDM observed, multiple opportunities exist for technology to enhance SDM, based on the specific OPTION items that received lower scores: (1) checking the patient's preferred information format, (2) asking the patient's preferred level of involvement in decision making, and (3) providing an opportunity to defer a decision. Interview data showed that patients and physicians value information exchange and are open to technologies that enhance communication of care options.
Conclusion: Future primary care IT platforms should prioritize the three quantitative gaps identified to improve physician-patient communication and relationships. Additionally, SDM tools should standardize common workflow steps across decisions and address barriers to adopting effective SDM tools into routine primary care.
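For readers unfamiliar with the two quantitative measures reported above, the sketch below shows a common way to compute a mean OPTION score with a 95% confidence interval and Cohen's kappa for inter-coder agreement. The scores and codes are synthetic placeholders, not study data, and this is not the authors' analysis code.

```python
import numpy as np
from scipy import stats
from sklearn.metrics import cohen_kappa_score

# Synthetic OPTION scores (0-100 scale), one per observed encounter.
option_scores = np.array([48, 62, 55, 70, 51, 60, 58, 53, 66, 49])

mean = option_scores.mean()
sem = stats.sem(option_scores)  # standard error of the mean
ci_low, ci_high = stats.t.interval(
    0.95, df=len(option_scores) - 1, loc=mean, scale=sem
)
print(f"Mean OPTION score {mean:.1f} (95% CI {ci_low:.1f}-{ci_high:.1f})")

# Synthetic codes applied to the same field-note excerpts by two reviewers.
coder_1 = ["trust", "values", "frustrations", "trust", "flexibility"]
coder_2 = ["trust", "values", "trust", "trust", "flexibility"]
print(f"Cohen's kappa: {cohen_kappa_score(coder_1, coder_2):.2f}")
```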
Multiple reporting guidelines for artificial intelligence (AI) models in healthcare recommend that models be audited for reliability and fairness. However, there is a gap in operational guidance for performing reliability and fairness audits in practice. Following guideline recommendations, we conducted a reliability audit of two models based on model performance and calibration, as well as a fairness audit based on summary statistics, subgroup performance, and subgroup calibration. We assessed the Epic End-of-Life (EOL) Index model and an internally developed Stanford Hospital Medicine (HM) Advance Care Planning (ACP) model in three practice settings: Primary Care, Inpatient Oncology, and Hospital Medicine, using clinicians' answers to the surprise question ("Would you be surprised if [patient X] passed away in [Y years]?") as a surrogate outcome. For performance, the models had a positive predictive value (PPV) at or above 0.76 in all settings. In Hospital Medicine and Inpatient Oncology, the Stanford HM ACP model had higher sensitivity (0.69 and 0.89, respectively) than the EOL model (0.20 and 0.27) and better calibration (O/E 1.5 and 1.7) than the EOL model (O/E 2.5 and 3.0). The Epic EOL model flagged fewer patients (11% and 21%, respectively) than the Stanford HM ACP model (38% and 75%). There were no differences in performance or calibration by sex. Both models had lower sensitivity in Hispanic/Latino male patients with Race listed as "Other." Ten clinicians were surveyed after a presentation summarizing the audit: 10/10 reported that the summary statistics, overall performance, and subgroup performance would affect their decision to use the model to guide care, and 9/10 said the same for overall and subgroup calibration. The most commonly identified barriers to routinely conducting such reliability and fairness audits were poor demographic data quality and lack of data access. This audit required 115 person-hours across 8–10 months. Our recommendations for performing reliability and fairness audits include verifying data validity, analyzing model performance on intersectional subgroups, and collecting clinician-patient linkages as needed for label generation by clinicians. Those responsible for AI models should require such audits before model deployment and mediate between model auditors and impacted stakeholders.
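The sketch below illustrates, with synthetic data and assumed column names (flagged, risk_score, outcome, subgroup), how audit metrics like those reported above (PPV, sensitivity, the observed/expected calibration ratio, and the percentage of patients flagged) can be computed overall and by demographic subgroup. It is not the authors' audit code, and the O/E definition used here (observed event rate divided by mean predicted risk) is one common convention.

```python
import pandas as pd

# Synthetic audit table: one row per patient.
df = pd.DataFrame(
    {
        "flagged": [1, 1, 0, 1, 0, 1, 0, 1],              # model flag (thresholded score)
        "risk_score": [0.8, 0.7, 0.2, 0.9, 0.1, 0.6, 0.3, 0.75],
        "outcome": [1, 1, 0, 1, 0, 0, 0, 1],               # surrogate label (surprise question)
        "subgroup": ["A", "A", "A", "B", "B", "B", "A", "B"],
    }
)

def audit_metrics(g: pd.DataFrame) -> pd.Series:
    tp = ((g.flagged == 1) & (g.outcome == 1)).sum()
    fp = ((g.flagged == 1) & (g.outcome == 0)).sum()
    fn = ((g.flagged == 0) & (g.outcome == 1)).sum()
    return pd.Series(
        {
            "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
            "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
            # Observed/expected ratio: observed event rate over mean predicted risk.
            "o_e_ratio": g.outcome.mean() / g.risk_score.mean(),
            "pct_flagged": g.flagged.mean() * 100,
        }
    )

print("Overall:\n", audit_metrics(df), "\n")
print("By subgroup:\n", df.groupby("subgroup").apply(audit_metrics))
```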