2011
DOI: 10.1038/sj.bjc.6606078

Comparison of methods for handling missing data on immunohistochemical markers in survival analysis of breast cancer

Abstract: Background: Tissue micro-arrays (TMAs) are increasingly used to generate data of the molecular phenotype of tumours in clinical epidemiology studies, such as studies of disease prognosis. However, TMA data are particularly prone to missingness. A variety of methods to deal with missing data are available. However, the validity of the various approaches is dependent on the structure of the missing data and there are few empirical studies dealing with missing data from molecular pathology. The purpose of this stu…

Cited by 64 publications (57 citation statements)
References 24 publications
“…Consequently, the observed subtype-specific incidence rates would underestimate the true rates, and underestimation would be greater for women born before 1929, because their tumour subtype was more likely to be unknown. To compensate for this, we performed multiple imputations to predict the molecular subtype of these tumours (24,26), assuming samples were missing at random (27). In analyses of prognosis, we distinguished between women diagnosed before 1995 and women diagnosed in 1995 or later, to approximate the gradual implementation of adjuvant treatment (including effective chemotherapy, anti-hormonal treatment and trastuzumab) in Norway (28). For each subtype, we calculated cumulative incidence of death from breast cancer at 5 and 15 years after diagnosis, treating deaths from other causes as competing events.…”
Section: Statistical Analyses (mentioning)
confidence: 99%
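
The cumulative incidence calculation described in the statement above, with deaths from other causes treated as competing events, can be sketched with an Aalen-Johansen type estimator. This is a minimal illustration and not the citing authors' code: the function name `cumulative_incidence`, the event coding (0 = censored, 1 = breast cancer death, 2 = death from another cause), and the toy follow-up data are all assumptions made for the example.

```python
def cumulative_incidence(times, events, cause, horizon):
    """Aalen-Johansen style estimate of the cumulative incidence of `cause`
    up to and including `horizon`.

    events: 0 = censored; any other integer = a cause of death (assumed coding).
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0  # probability of being free of any event just before time t
    cif = 0.0   # running cumulative incidence for the cause of interest
    i = 0
    while i < len(data) and data[i][0] <= horizon:
        t = data[i][0]
        same_time = []
        # Collect everyone whose follow-up ends at exactly time t.
        while i < len(data) and data[i][0] == t:
            same_time.append(data[i][1])
            i += 1
        d_cause = sum(1 for e in same_time if e == cause)
        d_any = sum(1 for e in same_time if e != 0)
        # Only those still event-free just before t can fail from `cause` at t.
        cif += surv * d_cause / at_risk
        surv *= 1.0 - d_any / at_risk
        at_risk -= len(same_time)
    return cif


# Hypothetical follow-up times in years, with the assumed event coding.
years = [2, 4, 4, 7, 9, 12, 15, 16]
status = [1, 2, 1, 0, 1, 2, 0, 1]

cif_5 = cumulative_incidence(years, status, cause=1, horizon=5)
cif_15 = cumulative_incidence(years, status, cause=1, horizon=15)
print(f"5-year: {cif_5:.3f}, 15-year: {cif_15:.3f}")
```

Censoring competing deaths and taking one minus a Kaplan-Meier estimate would generally overstate the cumulative incidence, which is why the competing events lower the event-free probability here rather than being treated as censored.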
“…However, several other studies have compared handling missing data methods for other types of variables across diverse data sources including the NIS, with the consensus being that differences in the performance of each method became more pronounced as the level of missingness increased and that complete case analysis generally performed the worst in terms of RMSE and bias [8,15,18-21]. Langkamp et al [15] compared four methods (complete case analysis, reweighting techniques, hot deck method, and multiple imputation with 'ICE' (Imputation by Chained Equations)) to address missing data in three dichotomous variables in the National Maternal and Infant Health Survey linked with the Longitudinal Follow-Up Live Birth Survey.…”
Section: Results (mentioning)
confidence: 99%
“…Possible solutions to address this would be to assess whether missing information on biomarkers of interest is random or nonrandom by comparing characteristics or survival of patients who had information against those with missing information [2], or restricting study to the part of the cohort where biomarkers were routinely determined [5]. Investigators may also use statistical methods, such as imputation that have been found to yield more precise [10] and valid [11,12] results compared with complete case analysis.…”
Section: Discussion (mentioning)
confidence: 97%
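
The first suggestion in the statement above, comparing patients with and without biomarker information, can be illustrated with a simple tabulation of the missingness proportion across levels of an observed characteristic. This is a sketch under stated assumptions: the helper `missingness_by_group`, the field names `era` and `er_status`, and the records themselves are hypothetical and not taken from any of the cited studies.

```python
def missingness_by_group(records, marker, covariate):
    """Proportion of records with a missing `marker`, per level of `covariate`.

    A value of None is taken to mean the marker was not measured (assumption).
    """
    totals, missing = {}, {}
    for r in records:
        level = r[covariate]
        totals[level] = totals.get(level, 0) + 1
        if r[marker] is None:
            missing[level] = missing.get(level, 0) + 1
    return {level: missing.get(level, 0) / n for level, n in totals.items()}


# Hypothetical cohort: diagnosis era and an immunohistochemical marker.
cohort = [
    {"era": "before 1995", "er_status": None},
    {"era": "before 1995", "er_status": "positive"},
    {"era": "before 1995", "er_status": None},
    {"era": "1995 or later", "er_status": "negative"},
    {"era": "1995 or later", "er_status": "positive"},
    {"era": "1995 or later", "er_status": None},
]

for era, share in missingness_by_group(cohort, "er_status", "era").items():
    print(f"{era}: {share:.0%} missing")
```

Clearly different proportions across levels would indicate that missingness depends on an observed characteristic, so the data are unlikely to be missing completely at random. A check of this kind cannot rule out dependence on unobserved values.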