The objective of this work is to introduce a forecasting method for UV-Vis spectrometry time series that combines principal component analysis (PCA) and the discrete Fourier transform (DFT), and to compare its results with those obtained using the DFT alone. Three time series from three study sites were used: (i) Salitre wastewater treatment plant (WWTP) in Bogotá; (ii) Gibraltar pumping station in Bogotá; and (iii) San Fernando WWTP in Itagüí (in the southern part of Medellín). Each time series contained the same number of samples (1051). In general terms, the results are hard to generalize, as they seem to depend strongly on the dynamics of each water system; however, some trends can be outlined: (i) for the UV range, DFT and PCA/DFT forecasting accuracy were almost the same; (ii) for the visible range, the proposed PCA/DFT forecasting procedure gives systematically lower forecasting errors and variability than the DFT procedure; and (iii) for short forecasting times, the proposed PCA/DFT procedure is more suitable than the DFT procedure, according to the processing times obtained.
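The combination described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the number of retained components, and the rule of keeping the largest-amplitude harmonics are all assumptions made for illustration. The spectra matrix (rows are time steps, columns are wavelengths) is projected onto a few principal components, each score series is extrapolated with its strongest DFT harmonics, and the forecast scores are mapped back to absorbance.

```python
import numpy as np

def dft_forecast(series, horizon, n_harmonics):
    """Extrapolate a 1-D series using its n_harmonics largest DFT terms."""
    n = len(series)
    mean = series.mean()
    coeffs = np.fft.rfft(series - mean)
    freqs = np.fft.rfftfreq(n)                 # cycles per sample
    strongest = np.argsort(np.abs(coeffs))[::-1][:n_harmonics]
    t = np.arange(n, n + horizon)              # future time indices
    forecast = np.full(horizon, mean, dtype=float)
    for k in strongest:
        # Interior rfft bins represent a conjugate pair, hence weight 2.
        weight = 1.0 if k == 0 or (n % 2 == 0 and k == n // 2) else 2.0
        angle = 2.0 * np.pi * freqs[k] * t
        forecast += weight / n * (coeffs[k].real * np.cos(angle)
                                  - coeffs[k].imag * np.sin(angle))
    return forecast

def pca_dft_forecast(spectra, horizon, n_components, n_harmonics):
    """Forecast spectra (time x wavelength) via PCA scores and the DFT.

    n_components and n_harmonics are hypothetical tuning parameters.
    """
    mean = spectra.mean(axis=0)
    u, s, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    future_scores = np.column_stack([
        dft_forecast(scores[:, j], horizon, n_harmonics)
        for j in range(n_components)
    ])
    # Map the forecast scores back to the wavelength space.
    return future_scores @ vt[:n_components] + mean
```

Because a truncated DFT is periodic in the window length, this kind of extrapolation repeats the smoothed history; it is only a reasonable stand-in when the monitored signal is itself close to periodic over the training window.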
This work proposes a methodology for forecasting online water quality data provided by UV-Vis spectrometry. To this end, principal component analysis (PCA), used to reduce the dimensionality of the data set, was combined with artificial neural networks (ANNs) for forecasting. The results were compared with those obtained using the discrete Fourier transform (DFT). The proposed methodology was applied to four absorbance time series data sets comprising a total of 5705 UV-Vis spectra. Absolute percentage errors obtained with the proposed PCA/ANN methodology vary between 10% and 13% for all four study sites. In general terms, the results are hard to generalize, as they appear to depend strongly on the specific dynamics of the water system; however, some trends can be outlined. The PCA/ANN methodology gives better results than the PCA/DFT forecasting procedure over specific spectral ranges under the following conditions: (i) for Salitre wastewater treatment plant (WWTP) (first hour) and Graz West R05 (first 18 min), from the last part of the UV range through the whole visible range; (ii) for Gibraltar pumping station (first 6 min), for the whole UV-Vis absorbance spectrum; and (iii) for San Fernando WWTP (first 24 min), from the whole UV range to the middle part of the visible range.
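As a rough illustration of the data preparation that such a PCA/ANN pipeline implies, the sketch below reduces the spectra to a few principal-component scores and arranges them as lagged input/target pairs for a one-step-ahead forecaster. The neural network itself is not shown; the 99% variance threshold, the lag count, and the function names are hypothetical choices, not values taken from the paper.

```python
import numpy as np

def pca_scores(spectra, variance_kept=0.99):
    """Project spectra (time x wavelength) onto the fewest principal
    components whose cumulative explained variance reaches variance_kept."""
    mean = spectra.mean(axis=0)
    u, s, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    n_components = int(np.searchsorted(explained, variance_kept) + 1)
    scores = u[:, :n_components] * s[:n_components]
    return scores, vt[:n_components], mean

def lagged_pairs(scores, n_lags):
    """Build (inputs, targets) so that the previous n_lags score vectors
    predict the next score vector."""
    n_steps = scores.shape[0]
    inputs = np.stack([scores[i:i + n_lags].ravel()
                       for i in range(n_steps - n_lags)])
    targets = scores[n_lags:]
    return inputs, targets
```

The `inputs` and `targets` arrays can then be passed to any regression model; predicted scores are mapped back to absorbance with `predicted @ components + mean`.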
Context: UV-Vis absorbance data collected with online optical sensors for water quality monitoring may contain outliers and/or missing values. Pre-processing to correct these anomalies is therefore required to improve the analysis of monitoring data. The aim of this study is to propose a method to detect outliers and to fill in the gaps in time series. Method: Outliers are detected using a Winsorising procedure, and the Discrete Fourier Transform (DFT) and the Inverse Fast Fourier Transform (IFFT) are applied to complete the time series. Together, these tools were used to analyse a case study comprising three sites in Colombia, (i) Bogotá D.C. Salitre-WWTP (Waste Water Treatment Plant), influent; (ii) Bogotá D.C. Gibraltar Pumping Station (GPS); and (iii) Itagüí, San Fernando-WWTP, influent (Medellín metropolitan area), analysed via UV-Vis (Ultraviolet and Visible) spectra. Results: Outlier detection with the proposed method gave promising results when window parameter values are small and self-similar, even though the three time series differed in size and behaviour. The DFT made it possible to fill gaps of different lengths. To assess the validity of the proposed method, continuous subsets (sections) of the absorbance time series without outliers or missing values were removed from the original time series and then reconstructed, yielding an average error of 12 % across the three test time series. Conclusions: Applying the DFT and the IFFT with the 10 % most important harmonics of the useful values can be advantageous for later use in different applications, specifically for time series of water quality and quantity in urban sewer systems. One potential application would be the analysis of dry weather affecting rainy seasons, achieved by detecting values that correspond to unusual behaviour in a time series. Additionally, the results hint at the potential of the method for correcting other hydrologic time series.
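A minimal sketch of the two steps named in this abstract follows, under stated assumptions. The abstract does not give the windowing scheme or the exact filling algorithm, so this version winsorises globally at hypothetical percentile limits and fills gaps with a simple iterative scheme: start from linear interpolation, keep the strongest 10 % of the harmonics, invert the transform, and overwrite only the missing samples. All names and defaults are illustrative.

```python
import numpy as np

def winsorise(x, lower_pct=5.0, upper_pct=95.0):
    """Clip values outside the given percentiles (NaNs are left as NaN).

    The percentile limits are assumed, not taken from the paper.
    """
    low, high = np.nanpercentile(x, [lower_pct, upper_pct])
    return np.clip(x, low, high)

def fill_gaps_dft(x, keep_fraction=0.10, n_iter=20):
    """Fill NaN gaps using the strongest keep_fraction of DFT harmonics."""
    x = np.asarray(x, dtype=float)
    missing = np.isnan(x)
    idx = np.arange(len(x))
    filled = x.copy()
    # Initial guess for the gaps: linear interpolation between neighbours.
    filled[missing] = np.interp(idx[missing], idx[~missing], x[~missing])
    n_bins = len(x) // 2 + 1                     # number of rfft bins
    n_keep = max(1, int(np.ceil(keep_fraction * n_bins)))
    for _ in range(n_iter):
        coeffs = np.fft.rfft(filled)
        weakest = np.argsort(np.abs(coeffs))[:-n_keep]
        coeffs[weakest] = 0.0                    # discard minor harmonics
        smooth = np.fft.irfft(coeffs, n=len(x))  # inverse transform
        filled[missing] = smooth[missing]        # observed samples untouched
    return filled
```

Observed samples are never modified; only the gap positions receive values from the truncated reconstruction, which is why the scheme tolerates gaps of different lengths.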
Time series from weather stations are a source of information for flood studies. Studying the time series of previous winters reveals the behavior of the variables, and the results feed analysis and simulation models that use variables such as flow and level in a study area. One of the most common problems is the acquisition and transmission of data from weather stations, which produces atypical values and lost data and causes difficulties in the simulation process. Consequently, a numerical strategy is needed to solve this problem. The data source for this study is a real database in which these problems occur for different weather variables. This study compares three methods of time series analysis to evaluate a multivariable process offline. We applied a method based on the discrete Fourier transform (DFT) and contrasted it with the average and with linear regression without uncertainty parameters as ways of completing missing data. The proposed methodology entails computing statistical values, detecting outliers, and applying the DFT. The DFT allows the time series to be completed, thanks to its ability to handle gaps of various sizes and replace missing values. In sum, the DFT led to low error percentages for all the time series (1% on average). This percentage reflects what the shape or pattern of the time series would likely have been in the absence of misleading outliers and missing data.
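The two baseline methods mentioned here, and one way of scoring any gap-filling method on a section that is deliberately removed, can be sketched as follows. This is an illustrative reading of the comparison, not the authors' code: the function names are hypothetical, and the error measure (mean absolute percentage error on the held-out section) is an assumption.

```python
import numpy as np

def fill_with_mean(x):
    """Replace NaNs with the mean of the observed values."""
    filled = x.copy()
    filled[np.isnan(x)] = np.nanmean(x)
    return filled

def fill_with_linear_regression(x):
    """Replace NaNs with a straight line fitted to the observed values."""
    idx = np.arange(len(x))
    valid = ~np.isnan(x)
    slope, intercept = np.polyfit(idx[valid], x[valid], 1)
    filled = x.copy()
    filled[~valid] = slope * idx[~valid] + intercept
    return filled

def held_out_error(x, start, stop, fill):
    """Remove x[start:stop], refill it with `fill`, and return the mean
    absolute percentage error on that section (x must be nonzero there)."""
    gappy = np.asarray(x, dtype=float).copy()
    gappy[start:stop] = np.nan
    filled = fill(gappy)
    truth = np.asarray(x, dtype=float)[start:stop]
    return 100.0 * np.mean(np.abs((filled[start:stop] - truth) / truth))
```

Any other filling function with the same signature, including a DFT-based one, can be passed as `fill`, so the three methods are scored on identical removed sections.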