Rapidly growing Global Positioning System (GPS) data plays an important role in trajectory data and its applications (e.g., GPS-enabled smart devices). To use K-means to mine better origins and destinations (OD) from GPS data and to overcome its shortcomings, including slow convergence, sensitivity to initial seed selection, and a tendency to get stuck in a local optimum, this paper proposes a novel niche genetic algorithm (NGA) with density and noise for K-means clustering (NoiseClust). In NoiseClust, an improved noise method and K-means++ are used to produce the initial population and capture higher quality seeds, which makes it possible to determine the proper number of clusters automatically and to handle genes of different sizes and shapes. A density-based method is presented to divide the population into niches, with the aim of maintaining population diversity. Adaptive probabilities of crossover and mutation are also employed to prevent convergence to a local optimum. Finally, the centers (the best chromosome) are obtained and then fed into K-means as initial seeds to generate even higher quality clustering results by allowing the initial seeds to readjust as needed. Experimental results on four taxi GPS data sets demonstrate that NoiseClust achieves high performance and effectiveness and can readily mine city conditions from the data.
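The abstract does not give the formula behind the adaptive crossover and mutation probabilities. As a rough illustration only, the sketch below uses one widely known adaptive scheme (in the style of Srinivas and Patnaik), which lowers both probabilities for fitter individuals and keeps them high for below-average ones. The function name, the constants `k1` to `k4`, and the assumption that higher fitness is better are all hypothetical choices for this example, not details taken from the paper.

```python
def adaptive_probabilities(fitness, parent_fitness, f_max, f_avg,
                           k1=1.0, k2=0.5, k3=1.0, k4=0.5):
    """Return (p_crossover, p_mutation) for one individual.

    fitness        -- fitness of the individual considered for mutation
    parent_fitness -- larger fitness of the two parents considered for crossover
    f_max, f_avg   -- best and mean fitness of the current population
    k1..k4         -- illustrative constants (assumed, not from the paper)
    """
    spread = f_max - f_avg
    if spread <= 0:
        # All individuals share one fitness value: fall back to the
        # high fixed rates so that diversity can be reintroduced.
        return k3, k4

    # Crossover: above-average parents are disrupted less.
    if parent_fitness >= f_avg:
        p_c = k1 * (f_max - parent_fitness) / spread
    else:
        p_c = k3

    # Mutation: above-average individuals are perturbed less.
    if fitness >= f_avg:
        p_m = k2 * (f_max - fitness) / spread
    else:
        p_m = k4

    return p_c, p_m


# Usage: a population whose best fitness is 6.0 and mean fitness is 3.0.
p_c, p_m = adaptive_probabilities(4.5, 4.5, f_max=6.0, f_avg=3.0)
print(p_c, p_m)
```

Under this scheme the best individual receives zero probability for both operators, so it survives unchanged, while below-average individuals are recombined and mutated at the full rates.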
Rapidly growing GPS (Global Positioning System) trajectories hide much valuable information relevant to areas such as city road planning, urban travel demand, and population migration. To mine this hidden information and obtain better clustering results, a trajectory regression clustering method (an unsupervised trajectory clustering method) is proposed to reduce the loss of local trajectory information and to avoid getting stuck in a local optimum. In this method, we first define a new concept of trajectory clustering and construct a novel angle-based method for partitioning line segments; second, a Lagrange-based method and Hausdorff-based K-means++ are integrated into fuzzy C-means (FCM) clustering to maintain the stability and robustness of the clustering process; finally, a least squares regression model is employed to achieve regression clustering of the trajectories. In our experiments, the performance and effectiveness of the method are validated on real-world taxi GPS data. Compared with partition-based clustering algorithms (K-means, K-median, and FCM), the presented method is more effective and generates more reasonable trajectories.
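The Hausdorff-based seeding mentioned above relies on a distance between whole trajectories rather than single points. The sketch below shows the standard symmetric Hausdorff distance between two point sequences. It assumes the points are already in planar coordinates so that Euclidean distance is meaningful; raw latitude/longitude pairs would need a projection or a great-circle distance instead. How the paper plugs this distance into K-means++ is not stated in the abstract and is not reproduced here.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point of a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sequences."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Usage: two short trajectories given as (x, y) tuples (made-up values).
track_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
track_b = [(0.0, 1.0), (1.0, 1.0), (2.0, 3.0)]
print(hausdorff(track_a, track_b))
```

The distance is driven by the single worst-matched point, which makes it sensitive to local deviations between trajectories; this brute-force version compares every pair of points and is therefore quadratic in trajectory length.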
K-means clustering is an important and popular technique in data mining. Unfortunately, for any given dataset (not knowledge-base), it is very difficult for a user to estimate the proper number of clusters in advance, and K-means tends to become trapped in a local optimum when the initial seeds are chosen randomly. Genetic algorithms (GAs) are commonly used to determine the number of clusters automatically and to find an optimal solution that serves either as the initial seeds of K-means clustering or as the clustering result itself. However, they typically choose the genes of chromosomes randomly, which leads to poor clustering results, whereas a carefully selected initial population can improve the final clustering results. Hence, some GA-based techniques carefully select a high-quality initial population, but at high complexity. This paper proposes an adaptive GA (AGA) with an improved initial population for K-means clustering (SeedClust). In SeedClust, an improved density estimation method and an improved K-means++ are presented to capture higher quality initial seeds and generate the initial population with low complexity, and adaptive crossover and mutation probabilities are designed to prevent premature convergence and to maintain population diversity, respectively. Together, these can automatically determine the proper number of clusters and capture an improved initial solution. Finally, the best chromosomes (centers) are obtained and then fed into K-means as initial seeds to generate even higher quality clustering results by allowing the initial seeds to readjust as needed. Experimental results on low-dimensional taxi GPS (Global Positioning System) data sets demonstrate that SeedClust achieves higher performance and effectiveness.
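The final step described above, feeding chosen centers into K-means as initial seeds and letting them readjust, can be sketched as follows. This example uses the standard K-means++ seeding rule as a stand-in for the seed source; it is not the improved K-means++ or the genetic search of SeedClust, whose details the abstract does not give. The function names are hypothetical, and the seeding step assumes the data contain at least `k` distinct points.

```python
import math
import random


def kmeans_pp_seeds(points, k, rng):
    """Standard K-means++: pick seeds with probability proportional to D^2."""
    seeds = [rng.choice(points)]
    while len(seeds) < k:
        # Squared distance from each point to its nearest chosen seed.
        d2 = [min(math.dist(p, s) ** 2 for s in seeds) for p in points]
        seeds.append(rng.choices(points, weights=d2, k=1)[0])
    return seeds


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            for p in points]


def kmeans(points, seeds, max_iter=100):
    """Lloyd's algorithm started from the given seeds."""
    centers = [tuple(s) for s in seeds]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for i, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == i]
            # Keep the old center if a cluster ends up empty.
            new_centers.append(
                tuple(sum(c) / len(members) for c in zip(*members))
                if members else old)
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)


# Usage: made-up 2-D points forming two groups.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
seeds = kmeans_pp_seeds(data, k=2, rng=random.Random(0))
centers, labels = kmeans(data, seeds)
print(centers, labels)
```

Because K-means only refines the seeds it is given, any other seed source, such as the best chromosome of a genetic algorithm, can be passed to `kmeans` in place of `kmeans_pp_seeds`.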
In biometric recognition, the use of the electroencephalogram (EEG) has many advantages, such as resistance to counterfeiting and theft. Compared with traditional biometrics, EEG biometric recognition is safer and more concealed. Generally, EEG-based biometric recognition performs person identification (PI) using EEG signals collected during motor imagery and visual evoked tasks. The aim of this paper is to improve the performance of PI based on EEG from different affective states using a convolutional neural dense connection network with a channel attention mechanism (CADCNN net). The channel attention mechanism (CA) handles the channel information from the EEG, while the convolutional neural dense connection network (DCNN net) extracts the unique biological characteristics used for PI. The proposed method is evaluated on the state-of-the-art affective data set HEADIT. The results indicate that CADCNN net can perform PI across different affective states and reach a mean correct recognition rate of up to 95%-96%, significantly outperforming a random forest (RF) and a multilayer perceptron (MLP). We also compared our method with state-of-the-art EEG classifiers and models of EEG biometrics. The results show that further extraction of the feature matrix is more robust than direct use of the feature matrix. Moreover, the CADCNN net can effectively and efficiently capture discriminative traits, thus generalizing better over diverse human states.
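The abstract does not describe the internal layout of the CA block. As an illustration only, the sketch below assumes a squeeze-and-excitation style design, a common form of channel attention: each channel is summarized by its mean, the summaries pass through a small two-layer gate, and every channel is rescaled by its gate value. The weight matrices `w1` and `w2` stand in for learned parameters and are hypothetical; the sketch is not the authors' architecture.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def channel_attention(features, w1, w2):
    """Rescale each channel by a gate computed from all channels.

    features -- list of channels, each a non-empty list of samples
    w1       -- hidden x channels weight matrix (assumed learned)
    w2       -- channels x hidden weight matrix (assumed learned)
    """
    # Squeeze: one scalar summary per channel (mean over samples).
    squeezed = [sum(ch) / len(ch) for ch in features]
    # Excitation: ReLU bottleneck followed by a sigmoid gate per channel.
    hidden = [max(0.0, h) for h in matvec(w1, squeezed)]
    gates = [sigmoid(g) for g in matvec(w2, hidden)]
    # Scale: multiply every sample of a channel by that channel's gate.
    scaled = [[g * x for x in ch] for g, ch in zip(gates, features)]
    return scaled, gates


# Usage: three channels with two samples each and made-up weights.
eeg = [[1.0, 3.0], [2.0, 2.0], [0.0, 0.0]]
w1 = [[1.0, 0.0, 0.0]]        # one hidden unit
w2 = [[1.0], [0.0], [-1.0]]   # one gate per channel
scaled, gates = channel_attention(eeg, w1, w2)
print(gates)
```

Gate values lie between 0 and 1, so the block can only attenuate or preserve a channel; in a trained network the weights would be learned jointly with the rest of the model.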