A self-organizing map (SOM) is a self-organized projection of high-dimensional data onto a typically 2-dimensional (2-D) feature map, wherein vector similarity is implicitly translated into topological closeness in the 2-D projection. However, when there are more neurons than input patterns, it can be challenging to interpret the results, due to diffuse cluster boundaries and limitations of current methods for displaying interneuron distances. In this brief, we introduce a new cluster reinforcement (CR) phase for sparsely-matched SOMs. The CR phase amplifies within-cluster similarity in an unsupervised, data-driven manner. Discontinuities in the resulting map correspond to between-cluster distances and are stored in a boundary (B) matrix. We describe a new hierarchical visualization of cluster boundaries displayed directly on feature maps, which requires no further clustering beyond what was implicitly accomplished during self-organization in SOM training. We use a synthetic benchmark problem and previously published microbial community profile data to demonstrate the benefits of the proposed methods.
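For readers unfamiliar with the underlying technique, the following is a minimal sketch of standard SOM training and of the distances between grid-adjacent neurons. It is not the cluster reinforcement phase or the B matrix from the abstract, whose details are not given here. The function names (`train_som`, `best_matching_unit`, `neighbor_distances`), the linear decay schedules, and the Gaussian neighborhood are all assumptions chosen for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position of the neuron whose weight vector is closest to x."""
    return min(weights, key=lambda pos: math.dist(weights[pos], x))


def train_som(data, rows, cols, epochs=100, lr0=0.5, seed=0):
    """Train a rows x cols SOM on a list of equal-length numeric vectors.

    Assumed schedules (illustrative only): learning rate decays linearly to 0,
    neighborhood radius decays linearly but is floored at 0.5 grid units.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # One randomly initialized weight vector per grid position.
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    total = epochs * len(data)
    step = 0
    for _ in range(epochs):
        # Present the inputs in a fresh random order each epoch.
        for x in rng.sample(data, len(data)):
            frac = step / total
            lr = lr0 * (1.0 - frac)
            sigma = max(sigma0 * (1.0 - frac), 0.5)
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                # Gaussian neighborhood measured on the 2-D grid, not in input space.
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


def neighbor_distances(weights, rows, cols):
    """Euclidean distance between the weights of each pair of grid-adjacent neurons."""
    dists = {}
    for r in range(rows):
        for c in range(cols):
            for nr, nc in ((r + 1, c), (r, c + 1)):
                if nr < rows and nc < cols:
                    dists[((r, c), (nr, nc))] = math.dist(
                        weights[(r, c)], weights[(nr, nc)])
    return dists
```

Training a 4 x 4 map on a handful of inputs gives the sparsely matched case the abstract refers to, with more neurons than input patterns. Large values in `neighbor_distances` then hint at where cluster boundaries may lie.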
We propose NM landscapes as a new class of tunably rugged benchmark problems. NM landscapes are well defined on alphabets of any arity, including both discrete and real-valued alphabets, include epistasis in a natural and transparent manner, are proven to have known value and location of the global maximum and, with some additional constraints, are proven to also have a known global minimum. Empirical studies are used to illustrate that, when coefficients are selected from a recommended distribution, the ruggedness of NM landscapes is smoothly tunable and correlates with several measures of search difficulty. We discuss why these properties make NM landscapes preferable to both NK landscapes and Walsh polynomials as benchmark landscape models with tunable epistasis.
In organized healthcare quality improvement collaboratives (QICs), teams of practitioners from different hospitals exchange information on clinical practices with the aim of improving health outcomes at their own institutions. However, what works in one hospital may not work in others with different local contexts because of nonlinear interactions among various demographics, treatments, and practices. In previous studies of collaborations where the goal is collective problem solving, teams of diverse individuals have been shown to outperform teams of similar individuals. However, when the purpose of collaboration is knowledge diffusion in complex environments, it is not clear whether team diversity will help or hinder effective learning. In this paper, we first use an agent-based model of QICs to show that teams comprising similar individuals outperform those with more diverse individuals under nearly all conditions, and that this advantage increases with the complexity of the landscape and the level of noise in assessing performance. Examination of data from a network of real hospitals provides encouraging evidence of a high degree of similarity in clinical practices, especially within teams of hospitals engaging in QICs. However, our model also suggests that groups of similar hospitals could benefit from larger teams and more open sharing of details on clinical outcomes than is currently the norm. To facilitate this, we propose a secure virtual collaboration system that would allow hospitals to efficiently identify potentially better practices in use at other institutions similar to theirs without any institution having to sacrifice the privacy of its own data. Our results may also have implications for other types of data-driven diffusive learning, such as in personalized medicine and evolutionary search in noisy, complex combinatorial optimization problems.
We introduce a new method for exploratory analysis of large data sets with time-varying features, where the aim is to automatically discover novel relationships between features (over some time period) that are predictive of any of a number of time-varying outcomes (over some other time period). Using a genetic algorithm, we co-evolve (i) a subset of predictive features, (ii) which attribute will be predicted, (iii) the time period over which to assess the predictive features, and (iv) the time period over which to assess the predicted attribute. After validating the method on 15 synthetic test problems, we used the approach for exploratory analysis of a large healthcare network data set. We discovered a strong association, with 100% sensitivity, between hospital participation in multi-institutional quality improvement collaboratives during or before 2002 and changes in the risk-adjusted rates of mortality and morbidity observed after a 1-2 year lag. The results provide indirect evidence that these quality improvement collaboratives may have had the desired effect of improving health care practices at participating hospitals. The proposed approach is a potentially powerful and general tool for exploratory analysis of a wide range of time-series data sets.
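One way to picture the four co-evolved components (i) to (iv) is as a single genome record. The sketch below is a hypothetical encoding, not the representation used in the paper: the `Genome` class, the inclusive `(start, end)` windows, the `max_features` cap at initialization, and the one-component-per-call mutation are all assumptions made for illustration. Fitness evaluation, selection, and crossover are omitted.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Genome:
    features: frozenset    # (i) indices of the candidate predictive features
    target: int            # (ii) index of the attribute to be predicted
    feature_window: tuple  # (iii) inclusive (start, end) time steps for the features
    target_window: tuple   # (iv) inclusive (start, end) time steps for the target


def random_window(rng, n_steps):
    """Pick an inclusive window with 0 <= start <= end < n_steps."""
    start = rng.randrange(n_steps)
    end = rng.randrange(start, n_steps)
    return (start, end)


def random_genome(rng, n_features, n_targets, n_steps, max_features=3):
    """Sample a genome with between 1 and max_features predictive features."""
    k = rng.randint(1, min(max_features, n_features))
    return Genome(
        features=frozenset(rng.sample(range(n_features), k)),
        target=rng.randrange(n_targets),
        feature_window=random_window(rng, n_steps),
        target_window=random_window(rng, n_steps),
    )


def mutate(rng, genome, n_features, n_targets, n_steps):
    """Return a copy of genome with one randomly chosen component resampled."""
    part = rng.randrange(4)
    if part == 0:
        # Toggle one feature in or out of the subset.
        toggled = genome.features ^ {rng.randrange(n_features)}
        if not toggled:
            return genome  # never leave the feature subset empty
        return replace(genome, features=frozenset(toggled))
    if part == 1:
        return replace(genome, target=rng.randrange(n_targets))
    if part == 2:
        return replace(genome, feature_window=random_window(rng, n_steps))
    return replace(genome, target_window=random_window(rng, n_steps))
```

Keeping all four components in one immutable record means a single mutation can change what is predicted, from which features, or over which time periods, which is the sense in which they are searched jointly.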