The t-distributed Stochastic Neighbor Embedding (t-SNE) method is one of the leading techniques for data visualization and clustering. This method finds lower-dimensional embeddings of data points while minimizing distortions in distances between neighboring data points. By construction, t-SNE discards information about the large-scale structure of the data. We show that adding a global cost function to the t-SNE cost function makes it possible to cluster the data while preserving global inter-cluster data structure. We test the new "global t-SNE" (g-SNE) method on one synthetic data set and two real data sets, on flowers and on human brain cells, all of which have significant and meaningful global structure. In all cases, g-SNE outperforms t-SNE in preserving the global structure. The weight parameter λ of the global cost function determines the balance between preserving local distances and preserving global distances. For the human brain atlas data set, we show how the choice of λ trades off local against global structure in the embedding. Using g-SNE with the optimized λ may therefore yield biological insights into how data is organized on multiple scales.
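The idea of adding a λ-weighted global term to the t-SNE cost can be illustrated with a minimal NumPy sketch. The abstract does not give the form of the global cost, so everything specific here is an assumption: the global affinities are taken to be proportional to 1 + d² (so that distant pairs dominate), the local affinities use a single fixed Gaussian bandwidth instead of t-SNE's perplexity calibration, and all function names are hypothetical.

```python
import numpy as np

def pairwise_sq_dists(Z):
    """Squared Euclidean distances between all rows of Z."""
    sq = np.sum(Z * Z, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T)
    np.fill_diagonal(D, 0.0)
    return np.maximum(D, 0.0)  # clip tiny negatives from round-off

def normalize_offdiag(W):
    """Zero the diagonal and scale so all pair weights sum to 1."""
    W = W.copy()
    np.fill_diagonal(W, 0.0)
    return W / W.sum()

def kl(P, Q, eps=1e-12):
    """Kullback-Leibler divergence KL(P || Q) over pairs with P > 0."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / np.maximum(Q[mask], eps))))

def gaussian_affinities(X, sigma):
    """Simplified local affinities: one fixed bandwidth (assumption),
    not the per-point perplexity search used by t-SNE."""
    return normalize_offdiag(np.exp(-pairwise_sq_dists(X) / (2.0 * sigma**2)))

def local_term(P_local, Y):
    """Standard t-SNE cost: Student-t affinities in the embedding."""
    Q_local = normalize_offdiag(1.0 / (1.0 + pairwise_sq_dists(Y)))
    return kl(P_local, Q_local)

def global_term(X, Y):
    """Assumed global cost: affinities grow with distance (1 + d^2),
    so far-apart pairs carry the most weight."""
    P_global = normalize_offdiag(1.0 + pairwise_sq_dists(X))
    Q_global = normalize_offdiag(1.0 + pairwise_sq_dists(Y))
    return kl(P_global, Q_global)

def gsne_cost(P_local, X, Y, lam):
    """Combined cost: local t-SNE term plus lam times the global term."""
    return local_term(P_local, Y) + lam * global_term(X, Y)
```

With `lam = 0` the cost reduces to the ordinary t-SNE term; larger values put more weight on matching large distances. An optimizer (not shown) would minimize `gsne_cost` with respect to `Y`.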
Summary: Neuronal activity can be modeled as a nonlinear dynamical system to yield measures of neuronal state and dysfunction. The electrical recordings of stem cell-derived neurons from individuals with autism spectrum disorder (ASD) and controls were analyzed using minimum embedding dimension (MED) analysis to characterize their dynamical complexity. MED analysis revealed a significant reduction in dynamical complexity in ASD neurons during differentiation, which was correlated with bursting and spike interval measures. MED was associated with clinical endpoints, such as nonverbal intelligence, and was correlated with 53 differentially expressed genes, among which ASD risk genes related to neurodevelopment, cell morphology, and cell migration were overrepresented. Spatiotemporal analysis also showed a prenatal temporal enrichment in cortical and deep brain structures. Together, we present dynamical analysis as a paradigm that can be used to distinguish disease-associated cellular electrophysiological and transcriptional signatures, while taking into account patient variability in neuropsychiatric disorders.