Compressed Diffusion

Gigante, Scott; Stanley, Jay S.; Vu, Ngan; Dijk, David van; Moon, Kevin R.; Wolf, Guy; Krishnaswamy, Smita

doi:10.1109/sampta45681.2019.9030994

Cited by 7 publications

(6 citation statements)

References 6 publications

“…However, since storing the entries of the powered diffusion operator in memory is also an issue, we employ the use of landmarks earlier in the process. It has also been shown that "compressing" the process of diffusion through landmarks in the fashion described here performs better than simply applying Nystrom extension (which includes landmark MDS [66]) to diffusion maps [68].…”

Section: Scalability of PHATE (mentioning)

confidence: 99%

Visualizing structure and transitions in high-dimensional biological data

et al. 2019

Self Cite

Section: A13 Scalability of PHATE (mentioning)

confidence: 99%

Visualizing Structure and Transitions for Biological Data Exploration

Dijk, Wang, et al. 2017

Preprint

Self Cite

In the era of 'Big Data' there is a pressing need for tools that provide human-interpretable visualizations of emergent patterns in high-throughput, high-dimensional data. Further, to enable insightful data exploration, such visualizations should faithfully capture and emphasize emergent structures and patterns without enforcing prior assumptions on the shape or form of the data. In this paper, we present PHATE (Potential of Heat-diffusion for Affinity-based Transition Embedding), an unsupervised low-dimensional embedding for visualization of data that is aimed at solving these issues. Unlike previous methods that are commonly used for visualization, such as PCA and tSNE, PHATE is able to capture and highlight both local and global structure in the data. In particular, in addition to clustering patterns, PHATE also uncovers and emphasizes progression and transitions (when they exist) in the data, which are often missed in other visualization-capable methods. Such patterns are especially important in biological data that contain, for example, single-cell phenotypes at different phases of differentiation, patients at different stages of disease progression, and gut microbial compositions that vary gradually between individuals, even of the same enterotype. The embedding provided by PHATE is based on a novel informational distance that captures long-range nonlinear relations in the data by computing energy potentials of data-adaptive diffusion processes.
We demonstrate the effectiveness of the produced visualization in revealing insights on a wide variety of biomedical data, including single-cell RNA-sequencing, mass cytometry, gut microbiome sequencing, human SNP data, Hi-C data, as well as non-biomedical data, such as Facebook network and facial image data. In order to validate the capability of PHATE to enable exploratory analysis, we generate a new dataset of 31,000 single cells from a human embryoid body differentiation system. Here, PHATE provides a comprehensive picture of the differentiation process, while visualizing major and minor branching trajectories in the data. We validate that all known cell types are recapitulated in the PHATE embedding in proper organization. Furthermore, the global picture of the system offered by PHATE allows us to connect parts of the developmental progression and characterize novel regulators associated with developmental lineages.

bioRxiv preprint (not peer-reviewed), first posted online Mar. 24, 2017. doi: http://dx.doi.org/10.1101/120378

“…On this initial coarse-graining we compute the diffusion potential coordinates by employing landmarking as developed in [19]. Landmarking refers to the idea that instead of computing diffusion probabilities between every pair of points, we can compute diffusion probabilities from points to a well-chosen set of central “landmarks” that maintain the geometry of the data.…”

Section: Results (mentioning)

confidence: 99%
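
The landmarking idea in the statement above can be illustrated with a minimal sketch. This is not the cited authors' implementation: the function name `landmark_diffusion`, the Gaussian kernel with bandwidth `sigma`, and the choice of randomly sampled points as landmarks are all assumptions made for illustration. The full kernel is built densely here only to keep the example short; the point is that powering is done on a small landmark-to-landmark operator rather than on the full point-to-point one.

```python
import numpy as np


def landmark_diffusion(X, n_landmarks=10, sigma=1.0, t=3, seed=0):
    """Hypothetical sketch: diffusion probabilities from points to landmarks.

    Returns an (n_points, n_landmarks) row-stochastic matrix.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Gaussian affinities between all pairs of points (dense for clarity only).
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-sq_dists / (2.0 * sigma ** 2))
    P = K / K.sum(axis=1, keepdims=True)  # one-step diffusion operator

    # Assumption: landmarks are random data points, and each point is
    # assigned to its nearest landmark (distinct points assumed).
    centers = rng.choice(n, size=n_landmarks, replace=False)
    labels = np.argmin(sq_dists[:, centers], axis=1)
    membership = np.zeros((n, n_landmarks))
    membership[np.arange(n), labels] = 1.0

    # Point -> landmark transitions: sum one-step probabilities per group.
    P_nm = P @ membership
    # Landmark -> point transitions: pooled affinities, row-normalized.
    P_mn = membership.T @ K
    P_mn /= P_mn.sum(axis=1, keepdims=True)

    # Small (m x m) landmark operator; only this matrix is powered.
    P_mm = P_mn @ P_nm
    return P_nm @ np.linalg.matrix_power(P_mm, t)
```

For example, `landmark_diffusion(np.random.default_rng(1).normal(size=(50, 2)), n_landmarks=5)` returns a 50 x 5 matrix whose rows each sum to 1, so only a 5 x 5 matrix is ever raised to a power instead of a 50 x 50 one.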

“…We have shown in [13] that this leads to high quality approximations of the diffusion operator which lead to near-identical visualizations with PHATE. In addition, we examined in [19] that this leads to low error approximations of diffusion operators in general. We use this fast approach to compute a low error diffusion potential system for our coarse graining process.…”

Section: Methods (mentioning)

confidence: 99%

Multiscale PHATE Exploration of SARS-CoV-2 Data Reveals Multimodal Signatures of Disease

Kuchroo, Huang, Wong, et al. 2020

Preprint

Self Cite

Summary: The biomedical community is producing increasingly high-dimensional datasets, integrated from hundreds of patient samples, which current computational techniques struggle to explore. To uncover biological meaning from these complex datasets, we present an approach called Multiscale PHATE, which learns abstracted biological features from data that can be directly predictive of disease. Built on a continuous coarse graining process called diffusion condensation, Multiscale PHATE creates a tree of data granularities that can be cut at coarse levels for high-level summarizations of data, as well as at fine levels for detailed representations on subsets. We apply Multiscale PHATE to study the immune response to COVID-19 in 54 million cells from 168 hospitalized patients. Through our analysis of patient samples, we identify CD16hi CD66blo neutrophil and IFNγ+GranzymeB+ Th17 cell responses enriched in patients who die. Further, we show that population groupings Multiscale PHATE discovers can be directly fed into a classifier to predict disease outcome. We also use Multiscale PHATE-derived features to construct two different manifolds of patients, one from abstracted flow cytometry features and another directly on patient clinical features, both associating immune subsets and clinical markers with outcome.
