The self-organizing map is a prominent unsupervised neural network model which lends itself to the analysis of high-dimensional input data and data mining applications. However, the high execution times required to train the map limit its application in many high-performance data analysis domains. In this paper we discuss the parSOM implementation, a software-based parallel implementation of the self-organizing map, and its optimization for the analysis of high-dimensional input data using distributed memory systems and clusters. The original parSOM algorithm scales very well in a parallel execution environment with low communication latencies and exploits parallelism to cope with memory latencies. However, it suffers from poor scalability on distributed memory computers. We present optimizations to further decouple the subprocesses, simplify the communication model and improve the portability of the system. 0-7695-0680-1/00 $10.00 © 2000 IEEE
A large number of applications have shown that the self-organizing map is a prominent unsupervised neural network model for high-dimensional data analysis. However, the high execution times required to train the map limit its use in many application domains where very large datasets are encountered, interactive response times are required, or both. In order to provide interactive response times during data analysis, we developed parSOM, a software-based parallel implementation of the self-organizing map. Parallel execution reduces the training time to a large degree, with an even higher speedup obtained by using the resulting cache effects. We demonstrate the scalability of the parSOM system and the speedup obtained on different architectures using an example from high-dimensional text data classification.
During the last few years, On-Line Analytical Processing (OLAP) has emerged as a valuable tool for the analysis, navigation and reporting of hierarchically organized data from data warehouses. Still, it remains a challenging task to implement and deploy an OLAP system, since no standardized architecture exists that describes the common components and functionality of OLAP systems. Additionally, the formal models in use disregard the need for easily implemented and clearly defined interfaces between these components. This paper presents a model for OLAP engines which permits the development of modular systems based on a simple data representation using sets and vectors. The functional units of the query processor are implemented in CORBA as independent modules with firm interfaces, and they exchange data and messages across a software bus.