2020
DOI: 10.48550/arxiv.2010.01149
Preprint

Evaluating Progress on Machine Learning for Longitudinal Electronic Healthcare Data

David Bellamy,
Leo Celi,
Andrew L. Beam

Abstract: The Large Scale Visual Recognition Challenge based on the well-known ImageNet dataset catalyzed an intense flurry of progress in computer vision. Benchmark tasks have propelled other sub-fields of machine learning forward at an equally impressive pace, but in healthcare it has primarily been image processing tasks, such as in dermatology and radiology, that have experienced similar benchmark-driven progress. In the present study, we performed a comprehensive review of benchmarks in medical machine learning for…

Cited by 6 publications (7 citation statements)
References 7 publications
“…To facilitate autoregressive model training, we introduce causal masking on the SGU weight matrix, W, to prevent future time-steps from leaking information. This is efficiently done by zeroing all upper-triangular elements of the matrix before the multiplication in (6).…”
Section: B Components Of the Sansformer Model (mentioning)
confidence: 99%
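
The causal masking described in the statement above can be illustrated with a small sketch. It is not the citing authors' implementation: equation (6) is not reproduced here, so the sketch assumes the multiplication has the form W Z, with rows of W indexing output time steps and rows of Z indexing input time steps. It also assumes the diagonal is kept, so each step can see itself. The function names `causal_mask` and `mix_time_steps` and the toy values are hypothetical.

```python
def causal_mask(weights):
    """Return a copy of a square matrix with every entry above the diagonal set to 0."""
    n = len(weights)
    # Assumption: the diagonal is kept, so step i may attend to steps 0..i.
    return [[weights[i][j] if j <= i else 0.0 for j in range(n)] for i in range(n)]


def mix_time_steps(weights, sequence):
    """Compute weights @ sequence for a sequence of shape (time, features)."""
    n_steps = len(sequence)
    n_features = len(sequence[0])
    return [
        [sum(row[t] * sequence[t][f] for t in range(n_steps)) for f in range(n_features)]
        for row in weights
    ]


# Hypothetical toy data: three time steps, one feature per step.
W = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]
Z = [[1.0], [10.0], [100.0]]

masked_W = causal_mask(W)
out = mix_time_steps(masked_W, Z)

# Changing only the last (future) time step must leave earlier outputs unchanged.
Z_future_changed = [[1.0], [10.0], [-5.0]]
out_changed = mix_time_steps(masked_W, Z_future_changed)
```

With the mask applied, the output at step i depends only on inputs at steps 0 through i, so `out` and `out_changed` agree on the first two steps and differ only on the last. In an array library the same mask is usually obtained with a lower-triangular helper rather than nested loops.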
“…In instances where EHRs encompass natural language input, such as discharge notes written by healthcare professionals, Transformers have achieved impressive results [69], [30]. Nonetheless, the unique structure and characteristics of EHR data, especially sequences of clinical codes, present challenges that can sometimes hinder Transformers from consistently outperforming simpler models with carefully engineered features [1], [6], [32]. While Transformers excel in many areas, their performance with EHR can be nuanced, often requiring extensive pretraining on large datasets.…”
Section: Introduction (mentioning)
confidence: 99%
“…3C right). This may be due to the fact that specific gene mutations often have an important impact on targeted therapy [27], but mutation information is difficult to reveal from gene expression data; while chemotherapy drugs were widely reported to be related to gene expression [31,32] so their IC50 is easier to predict. Gene expression-based results had an overall lower PCC, and we did not observe a significant difference between drug types.…”
Section: Scfoundation Improves Cancer Drug Response Prediction (mentioning)
confidence: 99%
“…When under-powered baselines are chosen, the algorithm results may seem more favorable, creating an illusion of progress. This is illustrated in healthcare data [Bellamy et al, 2020], semi-supervised learning [Oliver et al, 2018], recommendation systems [Dacrema et al, 2019], metric learning [Musgrave et al, 2020] and deep learning more generally [Marcus, 2018] (and references therein).…”
Section: Incorrectly Chosen Baselines (mentioning)
confidence: 99%