Abstract: In clinical medicine, data mining techniques such as classification and prediction can be applied to multidimensional time series data to uncover patterns of disease progression. However, in multidimensional time series mining, an excessive number of dimensions makes probability density estimates inaccurate and increases computational complexity. In addition, redundant information and irrelevant features can lead to high computational cost and over-fitting. Together, these two factors can degrade classification performance. To reduce computational complexity and eliminate redundant and irrelevant features, we improved a multidimensional time series feature selection method to achieve dimension reduction. The improved method selects features by combining feature extraction based on mutual information, estimated with the Kozachenko-Leonenko (K-L) information entropy estimation method, with a feature selection algorithm based on class separability. We performed experiments on an electroencephalogram (EEG) dataset for verification and on a non-small cell lung cancer (NSCLC) clinical dataset for application. The results show that, compared with CLeVer, Corona, and AGV, the improved method effectively reduces the dimensionality of multidimensional time series for clinical data.
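The abstract names the Kozachenko-Leonenko entropy estimator but does not give its formulation, so the following is only a minimal sketch of the standard k-nearest-neighbor form for one-dimensional samples, not the authors' implementation. The function names, the choice `k=3`, and the restriction to 1-D data with distinct values are assumptions made for illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def kl_entropy_1d(samples, k=3):
    """Kozachenko-Leonenko entropy estimate (in nats) for 1-D samples.

    Uses H ~= psi(N) - psi(k) + log(V_1) + (1/N) * sum(log r_i), where r_i is
    the distance from sample i to its k-th nearest neighbor and V_1 = 2 is the
    length of the unit ball [-1, 1]. Assumes distinct sample values (hypothetical
    simplification; ties would need jitter or special handling).
    """
    n = len(samples)
    if n <= k:
        raise ValueError("need more samples than neighbors")
    xs = sorted(samples)
    log_sum = 0.0
    for i, x in enumerate(xs):
        # In sorted 1-D data the k nearest neighbors lie within k positions.
        lo, hi = max(0, i - k), min(n, i + k + 1)
        dists = sorted(abs(x - xs[j]) for j in range(lo, hi) if j != i)
        r = dists[k - 1]
        if r == 0.0:
            raise ValueError("duplicate samples are not handled in this sketch")
        log_sum += math.log(r)
    return digamma_int(n) - digamma_int(k) + math.log(2.0) + log_sum / n
```

For samples drawn from a standard normal distribution, the estimate should approach the analytic entropy 0.5 * ln(2 * pi * e) as the sample size grows. A mutual information estimate between two variables can then be formed as H(X) + H(Y) - H(X, Y), which would require extending the sketch to more than one dimension.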
Freshwater resources face numerous crises and pressures resulting from both artificial and natural processes, so predicting water quality is crucial for water environment protection departments. This paper proposes a hybrid optimized algorithm that combines particle swarm optimization (PSO) and a genetic algorithm (GA) with a BP neural network to predict water quality time series, and it performs well on data from Beihai Lake in Beijing. The data sets consist of six water quality parameters: Hydrogen Ion Concentration (pH), Chlorophyll-a (CHLA), Hydrogenated Amine (NH4H), Dissolved Oxygen (DO), Biochemical Oxygen Demand (BOD), and electrical conductivity (EC). The performance of the model was assessed through the absolute percentage error (APE_max), the mean absolute percentage error (MAPE), the root mean square error (RMSE), and the coefficient of determination (R²). The results show that the BP neural network optimized by PSO and GA predicts the water quality parameters with reasonable accuracy, suggesting that the model is a valuable tool for lake water quality estimation. The hybrid optimized BP model also has a higher prediction capacity and better robustness than the traditional BP neural network, the PSO-optimized BP neural network, and the GA-optimized BP neural network.
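The four evaluation metrics listed above can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the authors' code: APE_max is interpreted here as the largest absolute percentage error over all samples, the function names are hypothetical, and the observed values are assumed to be non-zero so that percentage errors are defined.

```python
import math


def ape_max(y_true, y_pred):
    """Largest absolute percentage error, in percent (assumed meaning of APE_max)."""
    return max(abs((t - p) / t) for t, p in zip(y_true, y_pred)) * 100.0


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    errors = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return sum(errors) / len(errors) * 100.0


def rmse(y_true, y_pred):
    """Root mean square error, in the units of the measured parameter."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Lower APE_max, MAPE, and RMSE indicate a better fit, while R² closer to 1 indicates that the predictions explain more of the variance in the observations.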
The quality of freshwater resources is threatened by numerous pollutants, and water quality prediction is an important tool for controlling and reducing water pollution. The strong big-data processing ability of deep learning makes it possible to improve prediction accuracy. This paper proposes a method for predicting water quality based on the deep belief network (DBN) model. First, the particle swarm optimization (PSO) algorithm is used to optimize the network parameters of the deep belief network, which extracts feature vectors of water quality time series data at multiple scales. Then, a least squares support vector regression (LSSVR) machine is used as the top prediction layer of the model, yielding a new water quality prediction model referred to as PSO-DBN-LSSVR. The developed model is evaluated in terms of the mean absolute error (MAE), the mean absolute percentage error (MAPE), the root mean square error (RMSE), and the coefficient of determination (R²). The results show that the proposed model predicts water quality parameters accurately and is more robust than the traditional back propagation (BP) neural network, LSSVR, the DBN neural network, and the DBN-LSSVR combined model.
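The abstract does not specify how PSO is configured, so the following is only a generic sketch of global-best particle swarm optimization on a toy objective. The function name `pso_minimize`, the coefficients (`w=0.7`, `c1=c2=1.5`), the swarm size, and the search bounds are all illustrative assumptions; in a setting like the one described, the objective would be something like the validation error of a network as a function of its parameters.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=30, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box with global-best PSO (illustrative sketch)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Toy objective standing in for a model's validation error."""
    return sum(v * v for v in x)
```

Calling `pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))` returns the best position found and its objective value, which should be close to the minimum at the origin.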