2022
DOI: 10.1007/s10618-022-00894-5
Forecast evaluation for data scientists: common pitfalls and best practices

Abstract: Recent trends in the Machine Learning (ML) and in particular Deep Learning (DL) domains have demonstrated that with the availability of massive amounts of time series, ML and DL techniques are competitive in time series forecasting. Nevertheless, the different forms of non-stationarities associated with time series challenge the capabilities of data-driven ML models. Furthermore, due to the domain of forecasting being fostered mainly by statisticians and econometricians over the years, the concepts related to …

Cited by 51 publications (40 citation statements)
References 74 publications
“…For every full-year run, 86,520 errors (the forecast horizon multiplied by the total number of predictions) were analyzed, together with 515 average errors (one for each prediction made), and a single overall average error. Which error metric to use for different datasets can be derived from Hewamalage et al [37]. The Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE) were found to be the two most commonly used for the STLF of an electrical load in Nti et al [38]; the former was used in this study and is presented in MW.…”
Section: Full-year Run
confidence: 97%
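The two metrics named in this excerpt can be sketched as follows; the load values are hypothetical, and note that RMSE is reported in the unit of the series (here MW) while MAPE is unit-free:

```python
import numpy as np

def rmse(actual, forecast):
    """Root Mean Square Error: penalizes large errors quadratically."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

def mape(actual, forecast):
    """Mean Absolute Percentage Error: unit-free, undefined at zero actuals."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100)

load_mw  = [100.0, 120.0, 80.0, 110.0]   # hypothetical hourly load in MW
forecast = [ 98.0, 125.0, 78.0, 104.0]

print(rmse(load_mw, forecast))  # ≈ 4.15 MW
print(mape(load_mw, forecast))  # ≈ 3.53 %
```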
“…The model was trained, and predictions were made according to the inputs; at some points, it was re-trained. Historical Forecasts uses a rolling-window approach for the rolling-origin evaluation of the forecast [37]. Each prediction, error, and error metric is saved for further analysis.…”
Section: Model Creation
confidence: 99%
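A minimal sketch of the rolling-origin evaluation described here, with a naive last-value forecaster standing in for the actual model (the function name and series are assumptions for illustration):

```python
import numpy as np

def rolling_origin_eval(series, window, horizon, fit_predict):
    """Rolling-origin evaluation: slide a fixed-length training window over
    the series, forecast `horizon` steps from each origin, collect errors."""
    series = np.asarray(series, float)
    errors = []
    for origin in range(window, len(series) - horizon + 1):
        train = series[origin - window:origin]      # rolling training window
        forecast = fit_predict(train, horizon)      # re-fit at each origin
        actual = series[origin:origin + horizon]
        errors.append(actual - forecast)
    return np.array(errors)                         # shape: (n_origins, horizon)

# Naive forecaster: repeat the last observed value over the horizon.
naive = lambda train, h: np.repeat(train[-1], h)

errs = rolling_origin_eval(np.arange(20.0), window=5, horizon=3,
                           fit_predict=naive)
print(errs.shape)  # (13, 3): 13 origins, 3-step horizon
```

Averaging per row gives one average error per prediction origin, and averaging the whole array gives the single overall average error mentioned in the first excerpt.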
“…With model recalibration, the model specification remains constant, meaning that the forecasting model does not fundamentally change. However, unlike the updating of the forecasting model, recalibrating the forecasting model results in all parameters being re-estimated based on the updated window (Hewamalage et al, 2023; Petropoulos et al, 2022; Tashman, 2000). As a result, the forecasting model is ensured to have the best possible fit with the new data (Hewamalage et al, 2023; Tashman, 2000).…”
Section: Out-of-sample Procedures
confidence: 99%
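This recalibration scheme can be sketched with a linear-trend model as a stand-in: the model form stays fixed, but its parameters are re-estimated on every updated window before forecasting (the series and function name are hypothetical):

```python
import numpy as np

def recalibrated_forecasts(series, start):
    """One-step forecasts where the fixed model form (a linear trend here)
    has its parameters re-estimated on each updated window."""
    series = np.asarray(series, float)
    preds = []
    for origin in range(start, len(series)):
        t = np.arange(origin)
        slope, intercept = np.polyfit(t, series[:origin], 1)  # re-estimated
        preds.append(intercept + slope * origin)              # one-step ahead
    return np.array(preds)

y = np.arange(10.0) * 2.0 + 1.0          # hypothetical perfectly linear series
preds = recalibrated_forecasts(y, start=3)
```

On this toy linear series the recalibrated trend recovers each next value exactly; on real data, recalibration simply keeps the parameter estimates aligned with the most recent window.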
“…Certainly, having just one metric is not acceptable for the purposes of our task, given all of the different properties or key regularities we would like our models to capture. The choice of the right error metrics is critical, even more so in this domain, where problematic time series characteristics such as non-normality or non-stationarity are often present in the data (Hewamalage et al., 2022). These characteristics can cause some error measures to break down, which can result in spurious conclusions about model performance.…”
Section: Lessons On Performance Measurements and Baselines
confidence: 99%
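The kind of breakdown described here is easy to demonstrate with a percentage metric on a series that approaches zero. In this hypothetical comparison, both series carry the same absolute error on every point, yet MAPE explodes on the one with a near-zero observation:

```python
import numpy as np

def mape(actual, forecast):
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.mean(np.abs((actual - forecast) / actual)) * 100)

def mae(actual, forecast):
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.mean(np.abs(actual - forecast)))

actual_a = np.array([100.0, 102.0, 98.0])   # stays well away from zero
actual_b = np.array([  0.1, 102.0, 98.0])   # one near-zero observation

# Identical absolute error of 1.0 at every point on both series.
print(mae(actual_a, actual_a + 1.0))   # 1.0
print(mae(actual_b, actual_b + 1.0))   # 1.0
print(mape(actual_a, actual_a + 1.0))  # ≈ 1 %
print(mape(actual_b, actual_b + 1.0))  # ≈ 334 % — dominated by the 0.1 point
```

MAE judges the two situations identically, while MAPE yields a spurious two-orders-of-magnitude difference, which is the sort of metric-driven misjudgment of model performance the excerpt warns about.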