Explainability methods for machine learning systems for multimodal medical datasets

Storås, Andrea M.; Strümke, Inga; Riegler, Michael; Halvorsen, Pål

doi:10.1145/3524273.3533925

Cited by 3 publications

(1 citation statement)

References 30 publications

“… Including sociome datasets is often a burdensome data problem, both in finding and integrating disparate datasets, where clinical patient data have to be integrated with other data sources to characterize a patient’s life outside of their clinical interactions. We refer to the entirety of these non-clinical or social factors as a patient’s “sociome.” Due to the diversity of data sources and file types that sociome research has to consider, key bottlenecks in scaling such research to large patient populations include data integration [2], data harmonization [3], uneven data quality [4], and statistical modeling of multimodal datasets [5]. Consequently, studies often focus on one factor, a composite index, or a set of highly related factors [6], where potentially crucial nuances and interactions among factors can be lost. …”

Section: Introduction (mentioning)

confidence: 99%

Sociome Data Commons: A scalable and sustainable platform for investigating the full social context and determinants of health

Tilmon, Nyenhuis, Solomonides et al. 2023. J. Clin. Trans. Sci.

Background/Objective: Non-clinical aspects of life, such as social, environmental, behavioral, psychological, and economic factors, what we call the sociome, play significant roles in shaping patient health and health outcomes. This paper introduces the Sociome Data Commons (SDC), a new research platform that enables large-scale data analysis for investigating such factors. Methods: This platform focuses on “hyper-local” data, i.e., at the neighborhood or point level, a geospatial scale of data not adequately considered in existing tools and projects. We enumerate key insights gained regarding data quality standards, data governance, and organizational structure for long-term project sustainability. A pilot use case investigating sociome factors associated with asthma exacerbations in children residing on the South Side of Chicago used machine learning and six SDC datasets. Results: The pilot use case reveals one dominant spatial cluster for asthma exacerbations and important roles of housing conditions and cost, proximity to Superfund pollution sites, urban flooding, violent crime, lack of insurance, and a poverty index. Conclusion: The SDC has been purposefully designed to support and encourage extension of the platform into new data sets as well as the continued development, refinement, and adoption of standards for dataset quality, dataset inclusion, metadata annotation, and data access/governance. The asthma pilot has served as the first driver use case and demonstrates promise for future investigation into the sociome and clinical outcomes. Additional projects will be selected, in part for their ability to exercise and grow the capacity of the SDC to meet its ambitious goals.
