1999
DOI: 10.28945/599

Data Quality in Linear Regression Models: Effect of Errors in Test Data and Errors in Training Data on Predictive Accuracy

Cited by 8 publications (6 citation statements)
References 41 publications
“…Similarly, the GPM and SMAP input datasets exhibit their own sources of spatiotemporal uncertainties. However, errors in the training data are demonstrated to favor the generalization of regression models and improve their corrective performance against an output target [61]. This is particularly expected from the more robust ANN architecture which is capable of resolving nonlinear uncertainties compared to GWR [105].…”
Section: Discussion (mentioning)
confidence: 99%
“…Data pre-processing (Section 3) involves further steps to reduce the impact of remaining data quality issues on the training and model performance. Uncertainties from the aforementioned standard quality control steps remain, but may favor the generalization of model correction performance during the training stage [61]. On the other hand, pronounced errors may exist over the northeastern highlands due to terrain blockage and merging uncertainties.…”
Section: Radar-based Rainfall Estimates (mentioning)
confidence: 99%
“…The authors in [17] define data quality as "the degree of fulfilment of all those requirements defined for data, which is needed for a specific purpose". On the other hand, according to the authors of [18], the data errors may affect the predictive accuracy of linear regression models in two ways. First, the training data used to build the model may contain errors.…”
Section: Data Quality (mentioning)
confidence: 99%
“…Recently, DQ has become more visible as corporations learned from their costly experience. It is well accepted nowadays that for an Enterprise Resource Planning (ERP) or Data Warehouse project to be successful, firms must attend to DQ [7][8][9][10][11][12]. By the same token, in achieving TIA to fight terrorism successfully, DQ must be considered to avoid garbage-in-garbage-out.…”
Section: Research Challenge (mentioning)
confidence: 99%
“…The state department of health (CB4) and the HMO (CB5) are the consumers of the two information products in that order. For each of the three products, the set of data items used to generate each is different and is shown by the component data items CD11, CD12, and CD13.…”
Section: Quality Criteria (mentioning)
confidence: 99%