2017
DOI: 10.1186/s12911-017-0429-1

Automatic identification of variables in epidemiological datasets using logic regression

Abstract: Background: For an individual participant data (IPD) meta-analysis, multiple datasets must be transformed into a consistent format, e.g. using uniform variable names. When large numbers of datasets have to be processed, this can be a time-consuming and error-prone task. Automated or semi-automated identification of variables can help to reduce the workload and improve the data quality. For semi-automation, high sensitivity in the recognition of matching variables is particularly important, because it allows creatin…

citations
Cited by 2 publications
(2 citation statements)
references
References 19 publications
“…However, none of these studies implemented our data transform method to boost linear or polynomial regression models. Since the last decade, there have been several attempts in the existing peer-reviewed literature to implement linear models as well as other machine-learning methods in combination with the data transform function, including logistic regression, regression trees and Fourier transform, logistic regression with Log10 transformation, logistic regression with Ln transformation, multiple linear regression with log10 transformation, cycling regression model with Fourier transform, proportional hazards Cox regression model, time-series analytics regression with Fourier transform, logistic regression with square root and log10 transformation, and proportional hazards model in combination with logistic regression (Lorenz et al, 2017; Menotti, Puddu, & Lanti, 2002; Shaban-Nejad, Michalowski, & Buckeridge, 2018).…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…Logic regression estimates a decision tree constructed using Boolean combinations of binary predictors. Logic regression has been utilized extensively in genetic association studies for identifying high-dimensional interactions [30,31], and has recently been extended to other exposures [32][33][34]. We show how logic regression provides a data-driven approach to construct a daily extreme heat exposure indicator that is binary (i.e., presence versus absence of the exposure) and can capture impacts of sustained extreme exposure over several days (i.e., heat waves).…”
Section: Introduction (citation type: mentioning)
confidence: 99%
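
The idea in the last statement, a binary exposure indicator built from a Boolean combination of binary predictors, can be illustrated with a minimal sketch. Everything here is hypothetical: the function names, the three lagged "hot day" predictors, and the expression `hot_today AND (hot_lag1 OR hot_lag2)` are assumptions chosen for illustration and do not come from the cited papers. Logic regression itself searches over many such expressions and fits them within a regression model; this sketch only evaluates one fixed expression.

```python
from typing import List


def heat_wave_indicator(hot_today: bool, hot_lag1: bool, hot_lag2: bool) -> bool:
    """Evaluate one fixed, hypothetical logic tree over three binary predictors.

    Expression: hot_today AND (hot_lag1 OR hot_lag2).
    A logic regression fit would select an expression of this kind from data;
    here it is hard-coded purely for illustration.
    """
    return hot_today and (hot_lag1 or hot_lag2)


def daily_indicator(hot_days: List[bool]) -> List[bool]:
    """Apply the logic tree to each day, treating missing lags as False."""
    result = []
    for t, hot in enumerate(hot_days):
        lag1 = hot_days[t - 1] if t >= 1 else False
        lag2 = hot_days[t - 2] if t >= 2 else False
        result.append(heat_wave_indicator(hot, lag1, lag2))
    return result


if __name__ == "__main__":
    # Hypothetical week of binary "extreme heat" flags.
    week = [True, True, False, True, False, False, True]
    print(daily_indicator(week))
```

The resulting indicator is binary (presence versus absence of the exposure) and depends on several consecutive days, which is the property the statement describes for sustained exposure.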