K-means is one of the most popular clustering algorithms and is widely used in various fields. Nevertheless, it has shortcomings: it is extremely sensitive to the selection of the initial center points and to special points such as noise or outliers. Therefore, this paper proposes initial center point selection optimization and phased assignment optimization to improve the k-means algorithm. Experimental results on 15 real-world and 10 synthetic datasets show that the improved k-means outperforms its main competitor, k-means++, and that under the same settings, namely the default parameters, its clustering performance is better than that of Affinity Propagation, Mean Shift, and DBSCAN. The proposed algorithm was applied to airline seat selection data to group air passengers. The clustering results, together with an absolute deviation rate analysis, yielded customer groups and identified a suitable audience for recommending seat selection services.