Abstract. With the rapid advance of Internet technology, Internet users have to deal with novel data every day. For most of them, some of the most useful knowledge to be extracted from the web concerns how the information expressed by dynamically updated data changes over time. Unfortunately, traditional algorithms often cluster novel data without considering the existing clustering model, so they must cluster the input data all over again whenever the data are updated. Hence, they are time-consuming and inefficient. To cluster dynamic data efficiently, this paper proposes a novel Self-Adaptive Clustering algorithm (abbreviated as SAC). SAC is derived from the Self-Organizing Map algorithm (abbreviated as SOM), but it does not need to make any assumption about the neuron topology beforehand. Moreover, when the input data are updated, its topology is remodeled accordingly. Experimental results demonstrate that SAC can automatically tune its topology as the input data are updated.

Keywords: Self-adaptive algorithm, Competitive learning, Minimum spanning tree, Self-organizing map.
Introduction

Due to the rapid advance of Internet technology, data from the web are unstable and are updated dynamically over time (such data are referred to as dynamic data in this paper). This forces Internet users to face novel data anywhere and at any time. In general, clustering dynamic data makes it easy to learn what information has appeared, what information has disappeared, and what information has persisted. This kind of knowledge is essential to people who need to make decisions by observing dynamic data. As indicated by [1], many methods have been proposed to cluster dynamic data.

Dhillon et al. [2] propose a dynamic clustering algorithm to help analyze how information changes. Unfortunately, this algorithm is time-consuming and impractical, since it needs to run several times. Ghaseminezhad and Karami [3] improve this algorithm by employing a SOM structure, which first forms an initial neuron topology and then dynamically tunes that topology whenever the input data are updated.