Benzene is a pollutant that is very harmful to human health, so models are needed to predict its concentration and its relationship with other air pollutants. Data collected by eight stations in Madrid (Spain) over nine years were analyzed using regression-based machine learning models, namely multivariate linear regression (MLR), multivariate adaptive regression splines (MARS), multilayer perceptron neural network (MLP) and support vector machines (SVM), and time series models, namely autoregressive integrated moving-average (ARIMA) and vector autoregressive moving-average (VARMA) models. Benzene concentrations were predicted from the concentrations of four environmental pollutants: nitrogen dioxide (NO2), nitrogen oxides (NOx), particulate matter (PM10) and toluene (C7H8), and the performance measures of the proposed models were studied. In general, the regression-based machine learning models were more effective at predicting than the time series models.
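As a rough illustration of the simplest of these models, the sketch below fits a multivariate linear regression of benzene on the four predictor pollutants and reports two common performance measures. It is only a sketch under stated assumptions: the data are synthetic, and the coefficients, value ranges, noise level and train/test split are invented for the example and are not taken from the Madrid stations.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500

# ASSUMPTION: synthetic predictors standing in for NO2, NOx, PM10 and toluene.
# The ranges and coefficients below are invented, not measured values.
X = rng.uniform(low=[5, 10, 5, 0.5], high=[80, 200, 60, 10], size=(n, 4))
true_coef = np.array([0.004, 0.002, 0.003, 0.15])
true_intercept = 0.1
benzene = true_intercept + X @ true_coef + rng.normal(0.0, 0.05, size=n)

# Simple hold-out split: first 400 rows for fitting, last 100 for evaluation.
X_train, X_test = X[:400], X[400:]
y_train, y_test = benzene[:400], benzene[400:]

# Ordinary least squares with an explicit intercept column.
A_train = np.column_stack([np.ones(len(X_train)), X_train])
coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

# Predict on the held-out rows and compute RMSE and R^2.
A_test = np.column_stack([np.ones(len(X_test)), X_test])
pred = A_test @ coef
rmse = float(np.sqrt(np.mean((pred - y_test) ** 2)))
r2 = float(1.0 - np.sum((y_test - pred) ** 2) / np.sum((y_test - y_test.mean()) ** 2))

print("intercept and coefficients:", np.round(coef, 4))
print("RMSE:", round(rmse, 4), "R2:", round(r2, 4))
```

The same hold-out measures could be computed for any of the other models, which is what makes a like-for-like comparison between regression-based and time series approaches possible.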
Data obtained from air quality monitoring stations, which are used in studies based on data mining techniques, suffer from the problem of missing values. This paper describes research on missing data imputation. It analyses which of the most common methods best imputes values in the available data set. The comparison uses an algorithm that randomly replaces every known value in the dataset once with an imputed value, forming several subsets, and compares the imputed values with the actual known values. Data from seven stations in the Silesian region (Poland) were analyzed for hourly concentrations of four pollutants over five years: nitrogen dioxide (NO2), nitrogen oxides (NOx), particles of 10 μm or less (PM10) and sulphur dioxide (SO2). Imputations were performed using linear imputation (LI), predictive mean matching (PMM), random forest (RF), k-nearest neighbours (k-NN) and imputation by Kalman smoothing on structural time series (Kalman), and the performance of each method was evaluated. Once the comparison method was validated, it was determined that, in general, Kalman structural smoothing and linear imputation best fitted the imputed values to the data pattern. Each imputation method was observed to behave in an analogous way across the different stations. The variables with the best results are NO2 and SO2. The UMI method is the worst imputer for missing values in the data sets.
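The comparison procedure described above, in which every known value is hidden once, imputed, and compared with its true value, can be sketched as follows. This is a minimal sketch under assumptions, not the paper's implementation: the function names (`linear_impute`, `mean_impute`, `masked_rmse`), the number of subsets, the RMSE metric and the synthetic hourly series are all hypothetical choices made for illustration, and the mean-fill baseline is included only as a simple contrast.

```python
import numpy as np

def linear_impute(values):
    """Fill NaNs by linear interpolation between the nearest known points."""
    values = np.asarray(values, dtype=float)
    idx = np.arange(values.size)
    known = ~np.isnan(values)
    return np.interp(idx, idx[known], values[known])

def mean_impute(values):
    """Fill NaNs with the mean of the known values (simple baseline)."""
    values = np.asarray(values, dtype=float)
    filled = values.copy()
    filled[np.isnan(filled)] = np.nanmean(values)
    return filled

def masked_rmse(series, impute, n_subsets=5, seed=0):
    """Hide each known value exactly once, impute it, and return the RMSE."""
    series = np.asarray(series, dtype=float)
    rng = np.random.default_rng(seed)
    known_idx = np.flatnonzero(~np.isnan(series))
    shuffled = rng.permutation(known_idx)
    sq_errors = []
    # Each subset of known positions is masked in turn, so every known
    # value is replaced by an imputed value exactly once overall.
    for subset in np.array_split(shuffled, n_subsets):
        masked = series.copy()
        masked[subset] = np.nan
        imputed = impute(masked)
        sq_errors.append((imputed[subset] - series[subset]) ** 2)
    return float(np.sqrt(np.mean(np.concatenate(sq_errors))))

# ASSUMPTION: synthetic hourly series with a daily cycle and some real gaps.
rng = np.random.default_rng(1)
t = np.arange(480)
series = 40 + 15 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 1, t.size)
series[rng.choice(t.size, size=30, replace=False)] = np.nan

print("linear RMSE:", round(masked_rmse(series, linear_impute), 3))
print("mean   RMSE:", round(masked_rmse(series, mean_impute), 3))
```

Because the error is computed only at positions whose true value is known, any other imputer with the same signature can be passed to `masked_rmse` and ranked on the same footing.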
Summary
Study of the Termination of Cervical Nerves Innervating the Rhomboideus, Serratus Ventralis and Trapezius Muscles. Part II: Equus and Ruminantia
A systematic anatomic study of the origin, course and termination of the cervical nerves innervating the Rhomboideus, Serratus ventralis and Trapezius muscles was carried out in the cow, the horse and the sheep. The organization of this innervation is discussed with regard to the mesodermal origin of these muscles and the morphogenetic processes demonstrated through innervation.
Air pollution affects human health and is a major problem worldwide, including in coastal cities with industrial seaports. Among these, the city of Gijón (northern Spain) stands out as one of the 20 Spanish cities with the worst air quality. The study aims to identify outliers in air quality observations near the El Musel seaport, resulting from the emissions of six pollutants over an eight-year period (2014–2021). It compares methods based on the functional data analysis (FDA) approach with vector methods to determine the optimal approach for detecting outliers and supporting air quality control. Our approach involves analyzing air pollutant observations as a set of curves rather than vectors. In the FDA approach, curves are constructed to provide the best fit to isolated data points, resulting in a collection of continuous functions. These functions capture the behavior of the data in a continuous domain. Two FDA methodologies were used here: the functional bagplot and the high-density region (HDR) boxplot. Outlier detection using the FDA approach was found to be more powerful than the vector methods, and the functional bagplot method detected more outliers than the HDR boxplot.