2020
DOI: 10.1101/2020.09.03.269555
Preprint

Training a neural network to learn other dimensionality reduction removes data size restrictions in bioinformatics and provides a new route to exploring data representations

Abstract: High dimensionality omics and hyperspectral imaging datasets present difficult challenges for feature extraction and data mining due to huge numbers of features that cannot be simultaneously examined. The sample numbers and variables of these methods are constantly growing as new technologies are developed, and computational analysis needs to evolve to keep up with growing demand. Current state of the art algorithms can handle some routine datasets but struggle when datasets grow above a certain size. We prese…

Cited by 8 publications (6 citation statements)
References 45 publications
“…For this reason, people have started exploring supervised ML algorithms such as random forest and unsupervised ML algorithms to analyze MSI data. ToF-SIMS data has been successfully analyzed using various ANNs in the form of self-organizing maps (SOMs) or ANNs in combination with t-distributed stochastic neighbor embedding (t-SNE). The use of ML algorithms makes it possible to reveal chemical differences in MSI data with much greater ease and less human bias, but MSI in general is behind the curve when it comes to advanced data analysis compared to other fields.…”
Section: Mass Spectrometry Imaging
Citation type: mentioning
confidence: 99%
“…ToF-SIMS data has been successfully analyzed using various ANNs [237] in the form of self-organizing maps (SOMs) [238] or ANNs in combination with t-distributed stochastic neighbor embedding (t-SNE) [239]. The use of ML algorithms makes it possible to reveal chemical differences in MSI data with much greater ease and less human bias, but MSI in general is behind the curve when it comes to advanced data analysis compared to other fields. Just like the ToF-SIMS field has been able to learn from the electron microscopy community in terms of sample preparation, it will be necessary for MSI to learn from computer scientists and engineers who have already been applying these techniques for decades.…”
Section: Mass Spectrometry Imaging
Citation type: mentioning
confidence: 99%
“…That is, we know that biologically similar regions have similar chemical profiles and similar profiles will be grouped together in the embedded space to form dense regions. This has been observed in a number of studies that use dimensionality reduction of mass spectrometry data [11], [12], [23]-[25]. In the case where the data are homogeneous, clustering is not possible and a test for homogeneity can detect this automatically.…”
Section: Density Based Estimation Of Cluster Number
Citation type: mentioning
confidence: 98%
“…Methods such as t-distributed stochastic neighbour embedding (t-SNE) [5] are state-of-the-art techniques for data reduction and visualisation. However, the lack of a known mapping prohibits the application to unseen data [12]. Autoencoders avoid this issue by learning the encoding and decoding transformation during training of the model [13].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…The receptive field defines a convolutional kernel window in these CNN architectures to identify salient mass spectral patterns that depend on the selected size of the receptive field (Behrmann et al., 2018). Fully connected neural networks (FCNN) were applied on MSI data to perform non-linear dimensionality reduction (Thomas et al., 2016; Inglese et al., 2017; Dexter et al., 2020), and we recently applied FCNN-based architecture to capture spatial patterns and learn underlying m/z peaks of interest from large scale MSI data while bypassing conventional preprocessing (Abdelmoula et al., 2020).…”
Section: Introduction
Citation type: mentioning
confidence: 99%