Time Series Forecasting of Univariate Agrometeorological Data: A Comparative Performance Evaluation via One-Step and Multi-Step Ahead Forecasting Strategies
Abstract: High-frequency monitoring of agrometeorological parameters is quintessential in the domain of Precision Agriculture (PA), where timeliness of collected observations and the ability to generate ahead-of-time predictions can substantially impact the crop yield. In this context, state-of-the-art internet-of-things (IoT)-based sensing platforms are often employed to generate, pre-process and assimilate real-time data from heterogeneous sensors and streaming data sources. Simultaneously, Time-Series Forecasting Alg…
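The title's distinction between one-step and multi-step ahead strategies comes down to how forecasts beyond the first horizon are generated. The minimal sketch below assumes a generic pre-trained one-step model with a scikit-learn style predict method and an illustrative window length (both assumptions, not details from the paper); it contrasts a single one-step ahead forecast with a recursive multi-step ahead forecast that feeds each prediction back in as input.

```python
import numpy as np

def one_step_forecast(model, history, window=24):
    """Single one-step ahead forecast from the last `window` observations."""
    x = np.asarray(history[-window:], dtype=float).reshape(1, -1)
    return float(model.predict(x)[0])

def recursive_multi_step_forecast(model, history, horizon, window=24):
    """Multi-step ahead forecast built by rolling the one-step model forward
    and feeding each prediction back in as if it had been observed."""
    buf = list(history)
    preds = []
    for _ in range(horizon):
        y_hat = one_step_forecast(model, buf, window)
        preds.append(y_hat)
        buf.append(y_hat)  # recursion: the forecast becomes the next lagged input
    return preds
```

Recursive forecasting reuses a single one-step model but lets prediction errors compound over the horizon, which is one reason one-step and multi-step results are typically reported separately.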
“…Sahoo et al [19] also demonstrated the superiority of LSTM over RNN on univariate daily discharge data from the Basantapur gauging station in India's Mahanadi River basin. Suradhaniwar et al [7] demonstrated that SARIMA and SVR models outperform NN, LSTM, and RNN models when hourly averaged univariate time series data is used. Even though the best model generalization is complex, case-based analysis is the most effective method for determining which model best fits a given situation [60].…”
Section: Results
confidence: 99%
“…Univariate forecasting models are straightforward to train using sparse data and provide ease of inference when evaluating forecast performance. Due to the complexity of agrometeorological data, it is simpler and more efficient to forecast the variables individually [7]. On the other hand, multivariate models are designed with multiple variables such as precipitation, temperature, evaporation, and other variables as input and a streamflow variable as output [6].…”
Section: Introduction
confidence: 99%
“…However, because of the nonlinearity present in streamflow time series, the forecasting performance of these models is usually debatable [3,10]. Under one-step and multistep-ahead forecast scenarios, Suradhaniwar et al [7] compared the performance of Machine Learning (ML) and Deep Learning (DL) based time series forecasting algorithms. They also evaluated recursive one-step forward forecasts using walk-forward validation.…”
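The "recursive one-step forward forecasts using walk-forward validation" quoted above can be read as re-fitting the model each time a new test observation arrives and always predicting only the next step. A minimal sketch, assuming a model factory with scikit-learn style fit/predict and an illustrative lag window (both hypothetical, not taken from the cited study):

```python
import numpy as np

def walk_forward_validation(series, n_test, make_model, window=24):
    """One-step walk-forward validation: fit on everything seen so far,
    predict the next value, then reveal the true observation."""
    series = list(series)
    history = series[:-n_test]
    y_true, y_pred = [], []
    for t in range(len(series) - n_test, len(series)):
        model = make_model()  # fresh (hypothetical) model each step
        X = np.array([history[i - window:i] for i in range(window, len(history))])
        y = np.array(history[window:])
        model.fit(X, y)
        x_next = np.asarray(history[-window:], dtype=float).reshape(1, -1)
        y_pred.append(float(model.predict(x_next)[0]))
        y_true.append(series[t])
        history.append(series[t])  # walk forward with the true value
    return np.array(y_true), np.array(y_pred)
```

With a factory such as `lambda: LinearRegression()`, for example, this reproduces the generic pattern of recursive one-step evaluation over a held-out tail of the series.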
Hydrological forecasting is one of the key research areas in hydrology. Innovative forecasting tools will reform water resources management systems, flood early warning mechanisms, and agricultural and hydropower management schemes. Hence, in this study, we compared Stacked Long Short-Term Memory (S-LSTM), Bidirectional Long Short-Term Memory (Bi-LSTM), and Gated Recurrent Unit (GRU) networks with the classical Multilayer Perceptron (MLP) network for one-step daily streamflow forecasting. The analysis used daily time series data collected from the Borkena (Awash river basin) and Gummera (Abay river basin) streamflow stations. All data sets passed through rigorous quality control, and null values were filled using linear interpolation. Partial autocorrelation was applied to select the appropriate time lags for input series generation. The data were then split into training and testing sets using an 80:20 ratio. Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and the coefficient of determination (R2) were used to evaluate the performance of the proposed models. Finally, the findings are summarized under three themes: model variability, lag time variability, and time series characteristics. Time series characteristics (climatic variability) had a greater impact on streamflow forecasting performance than the number of lagged input time steps or the choice of deep learning architecture. Accordingly, forecasts for the Borkena catchment were more accurate than those for the Gummera catchment, with RMSE, MAE, MAPE, and R2 values ranging between (0.81 to 1.53, 0.29 to 0.96, 0.16 to 1.72, 0.96 to 0.99) and (17.43 to 17.99, 7.76 to 10.54, 0.16 to 1.03, 0.89 to 0.90) for the two catchments, respectively. Although performance depends on the lag time, MLP and GRU perform nearly equally and both outperform S-LSTM and Bi-LSTM.
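The preprocessing chain described in this abstract (linear interpolation of null values, PACF-based lag selection, a chronological 80:20 split, and RMSE/MAE/MAPE/R2 scoring) can be sketched roughly as follows; the 95% significance band, helper names, and default parameters are assumptions for illustration, not values taken from the study.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import pacf
from sklearn.metrics import (mean_squared_error, mean_absolute_error,
                             mean_absolute_percentage_error, r2_score)

def fill_gaps(series):
    """Fill null readings by linear interpolation, as described in the study."""
    return pd.Series(series).interpolate(method="linear").to_numpy()

def select_lags(series, max_lag=30):
    """Keep lags whose partial autocorrelation lies outside an ~95% band."""
    vals = pacf(series, nlags=max_lag)
    band = 1.96 / np.sqrt(len(series))
    return [k for k in range(1, max_lag + 1) if abs(vals[k]) > band]

def chronological_split(series, train_frac=0.8):
    """80:20 split that preserves temporal order (no shuffling)."""
    cut = int(len(series) * train_frac)
    return series[:cut], series[cut:]

def report(y_true, y_pred):
    """Score a forecast with the four metrics used in the study."""
    return {
        "RMSE": float(np.sqrt(mean_squared_error(y_true, y_pred))),
        "MAE": float(mean_absolute_error(y_true, y_pred)),
        "MAPE": float(mean_absolute_percentage_error(y_true, y_pred)),
        "R2": float(r2_score(y_true, y_pred)),
    }
```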
“…It also differs from classical approaches based on sliding windows and thresholds of statistical moments [3].…”
[Table fragment spilled from the source: candidate configurations [1,4,16,32], [1,2,4,8,16,32], [1,3,6,12,24], [1,2,6,12,24], [1,2,4,8,16], [1,4,16], [1,2,4,8], [1,4,8]; Blocks 1, 2; Dropout 0…]
Section: OSTS Methods
confidence: 99%
“…Time series are explored in several types of applications but have not been the subject of study until today. Most of these applications are focused on forecasting [32,33] and feature extraction [5] approaches, as can be seen in the following studies.…”
A wide range of applications based on sequential data, known as time series, have become increasingly popular in recent years, particularly those based on the Internet of Things (IoT). Several machine learning algorithms exploit the patterns extracted from sequential data to support multiple tasks. However, such data can suffer from unreliable readings, which lead to low-accuracy models because of the low-quality training sets available. Detecting the change points between highly representative segments is an important ally in finding and treating biased subsequences. By constructing a framework based on the Augmented Dickey-Fuller (ADF) test for data stationarity, two proposals for automatically segmenting subsequences in a time series were developed. The former, called Change Detector segmentation, relies on change detection methods from data stream mining. The latter, called ADF-based segmentation, is built on a new change detector derived from the ADF test alone. Experiments over real-life IoT databases and benchmarks showed the improvement provided by both proposals for prediction tasks with traditional Autoregressive Integrated Moving Average (ARIMA) and Deep Learning (Long Short-Term Memory and Temporal Convolutional Network) methods. Results obtained with the Long Short-Term Memory predictive model reduced the relative prediction error from 1 to 0.67 compared to the time series without segmentation.
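As a rough illustration of the general idea behind ADF-based segmentation (not the authors' exact detector), the sketch below slides a window over a series and opens a new segment whenever the Augmented Dickey-Fuller test fails to reject non-stationarity; the window size, step, and p-value threshold are assumptions.

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

def adf_segment(series, window=200, step=50, p_thresh=0.05):
    """Slide a window over the series and start a new segment whenever the
    ADF test can no longer reject non-stationarity (illustrative heuristic)."""
    series = np.asarray(series, dtype=float)
    boundaries = [0]
    for start in range(0, len(series) - window, step):
        p_value = adfuller(series[start:start + window])[1]
        if p_value > p_thresh and start > boundaries[-1] + window:
            boundaries.append(start)  # likely change point: open a new segment
    boundaries.append(len(series))
    return [series[a:b] for a, b in zip(boundaries[:-1], boundaries[1:])]
```

Each returned segment can then be used as a more homogeneous training set for a downstream predictor, which is the role segmentation plays in the prediction experiments described above.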
Time series research in academic and industrial fields has attracted wide attention. However, the frequency information contained in time series still lacks effective modeling. Studies have found that time series forecasting relies on different frequency patterns: short-term forecasting relies more on high-frequency components, while long-term forecasting focuses more on low-frequency components. To better describe this multifrequency mode, a dual-stage multifeature adaptive frequency domain prediction model (DMAFD) is proposed in this paper. DMAFD contains two stages. First, it adopts the XGBoost algorithm to obtain a feature vector by analyzing feature importance. Second, frequency feature extraction of the time series and frequency-aware modeling of the target sequence are integrated to build an end-to-end prediction network based on the dependence of the time series on frequency mode. The innovation lies in the fact that the prediction network can automatically focus on multifrequency components according to the dynamic evolution of the input sequence. Extensive experiments on four real data sets from different fields show that DMAFD obtains higher accuracy and smaller lags in time-step analysis than state-of-the-art algorithms.
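The two-stage structure described above can only be illustrated loosely, since DMAFD itself is an end-to-end network that is not reproduced here: stage one ranks candidate inputs by XGBoost feature importance, and stage two separates low- and high-frequency components of a series, mirroring the multifrequency view. The function names, cutoff frequency, and FFT-mask approach are assumptions made for illustration.

```python
import numpy as np
from xgboost import XGBRegressor

def rank_features(X, y, top_k=8):
    """Stage 1 (illustrative): rank candidate inputs by XGBoost feature importance."""
    model = XGBRegressor(n_estimators=200, max_depth=4).fit(X, y)
    order = np.argsort(model.feature_importances_)[::-1]
    return order[:top_k]

def frequency_components(series, cutoff=0.1):
    """Stage 2 (illustrative): split a series into low- and high-frequency parts
    with an FFT mask, echoing the short-term/long-term frequency dependence."""
    series = np.asarray(series, dtype=float)
    spectrum = np.fft.rfft(series)
    freqs = np.fft.rfftfreq(len(series))
    low = np.fft.irfft(np.where(freqs <= cutoff, spectrum, 0), n=len(series))
    high = np.fft.irfft(np.where(freqs > cutoff, spectrum, 0), n=len(series))
    return low, high
```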