2021
DOI: 10.1111/1756-185x.14203

Handling missing data in a rheumatoid arthritis registry using random forest approach

Abstract: In any clinical research, missing values or experimental values remain a problem in correctly analyzing results and in obtaining accurate outcomes. These missing values often lead to misinterpretation and biased results, which could ultimately affect the overall conclusion of an investigation. [1][2][3][4] The application of statistical analyses in experiments with missing values poses serious problems, as the missing values are often automatically ignored by the statistical algorithms. The results obtained b…

Citations: cited by 19 publications (8 citation statements)
References: 47 publications
“…Missing data were imputed by the missForest (sequential random forest) multiple imputation method using the “missForest” package of R software (version 4.1.2). This method has been shown to produce the lowest imputation error for continuous and categorical variables [25].…”
Section: Methods (mentioning)
confidence: 99%
“…We used the missForest R package (hyperparameters: maxiter = 10, ntree = 1000, verbose = TRUE) to interpolate variables with a few missing data. The method is based on using the missing data problem as a model prediction, and each variable in turn is predicted using a fitted Random Forest (RF) regression model to predict the missing data for the dependent variable, and the study confirms that this method outperforms non-algorithm-based interpolation methods [29, 30].…”
Section: Methods (mentioning)
confidence: 93%
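
The citation statement above describes imputation in which each variable with missing values is predicted in turn by a random forest fitted on the other variables, repeated over several iterations. A minimal sketch of that idea follows, written in Python rather than R. It is not the missForest package and is not the cited authors' code: it assumes scikit-learn's `IterativeImputer` with a `RandomForestRegressor` as the per-variable model, handles numeric columns only, and uses a synthetic dataset. The function name `impute_with_random_forest`, the 10% missingness rate, and the tree count of 50 (chosen for speed, where the statement reports ntree = 1000) are illustrative assumptions.

```python
import numpy as np
# IterativeImputer is still flagged experimental, so this enabling import is required.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import IterativeImputer


def impute_with_random_forest(X, max_iter=10, n_trees=50, seed=0):
    """Fill NaNs by predicting each column from the others with a random forest.

    Hypothetical helper for illustration; a missForest-style analogue,
    not the R missForest implementation.
    """
    forest = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
    imputer = IterativeImputer(estimator=forest, max_iter=max_iter, random_state=seed)
    return imputer.fit_transform(X)


# Synthetic numeric data (an assumption, not registry data):
# the third column depends on the first two, so it is predictable.
rng = np.random.default_rng(0)
X_true = rng.normal(size=(200, 3))
X_true[:, 2] = X_true[:, 0] + 0.5 * X_true[:, 1]

# Knock out roughly 10% of the entries at random.
X_missing = X_true.copy()
mask = rng.random(X_missing.shape) < 0.1
X_missing[mask] = np.nan

X_imputed = impute_with_random_forest(X_missing)
```

Observed entries pass through unchanged; only the masked entries are replaced by model predictions. scikit-learn may emit a convergence warning when the stopping tolerance is not reached within `max_iter` rounds, which does not prevent the imputed matrix from being returned.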
“…Although the use of multiple imputation to address missing data in medical research and clinical trials has gained traction (42,43), these methods have yet to become standard practice in epidemiologic studies in pediatric rheumatology. When looking at prior CARRA Registry research studies, we found that while many reported missing data, most used complete case analysis methods (7,8,44), and only a few CARRA Registry studies conducted sensitivity analyses or used multiple imputation to address missing data and to assess its impact on study results (11,20).…”
Section: Discussion (mentioning)
confidence: 99%