2020
DOI: 10.1101/2020.08.03.234187
Preprint

GigaSOM.jl: High-performance clustering and visualization of huge cytometry datasets

Abstract: Background: The amount of data generated in large clinical and phenotyping studies that use single-cell cytometry is constantly growing. Recent technological advances make it easy to generate datasets of hundreds of millions of single-cell data points with more than 40 parameters, originating from thousands of individual samples. Analyzing that amount of high-dimensional data places heavy demands on both the hardware and the software of high-performance computational resources. Current software tools often do not sca…

Cited by 3 publications (3 citation statements)
References 20 publications
“…The ability to effectively work with a simplified model of the data differentiates it from other dimensionality reduction methods; in turn it offers superior performance by reducing the amount of necessary computation as well as by opening parallelization potential, since the computations of the projections of many individual points are independent. In the setting of flow and mass cytometry data visualization, this provided speedup of several orders of magnitude against the other available methods [14,16].…”
Section: Landmark-directed Dimensionality Reduction
mentioning
confidence: 99%
“…EmbedSOM provided an order-of-magnitude speedup on datasets typical for the single-cell cytometry data visualization while retaining competitive quality of the results. The concept has proven useful for interactive and high-performance workflows in cytometry [16,14], and easily applies to many other types of datasets. Despite of that, the parallelization potential of the extremely data-regular design of EmbedSOM algorithm has remained mostly untapped.…”
mentioning
confidence: 99%
“…The potential stumbling block here is that the methods required for such clustering are also not native to ecology and evolution, and themselves suffer from being much too complex to interpret for a general biologist audience (see e.g. the GigaSOM method for clustering single-cell cytometry data; where SOM is a 'self organised map', a form of machine learning Kratochvíl et al 2020). Furthermore, it is also unclear how these mechanisms should undergo evolution -in , we suggested using both genetic algorithms and reinforcement learning acting on the simulated mechanisms, based on the similarity of simulated movement paths with real animal movements.…”
Section: The Role Of Models In Understanding the Evolution Of Movement
mentioning
confidence: 99%