A Review on Initialization Methods for Nonnegative Matrix Factorization: Towards Omics Data Experiments

Esposito, Flavia

doi:10.3390/math9091006

Cited by 33 publications

(10 citation statements)

References 79 publications


“…Reproducibility is yet another challenge for NMF. Alternating least squares requires an initialization, such as a random or non-negative double SVD model (NNDSVD) (Esposito, 2021). While NNDSVD is "robust", it differs fundamentally in nature from NMF, and any non-random initialization can trap updates into a local minimum even if random noise is added to the model and zeros are filled.…”

Section: Discussion (mentioning)

confidence: 99%

Fast and interpretable non-negative matrix factorization for atlas-scale single cell data

DeBruine

Melcher

Triche

2021

Preprint


Non-negative matrix factorization (NMF) is an intuitively appealing method to extract additive combinations of measurements from noisy or complex data. NMF is applied broadly to text and image processing, time-series analysis, and genomics, where recent technological advances permit sequencing experiments to measure the representation of tens of thousands of features in millions of single cells. In these experiments, a count of zero for a given feature in a given cell may indicate either the absence of that feature or an insufficient read coverage to detect that feature ("dropout"). Unlike spectral decompositions such as Singular Value Decomposition (SVD) or Principal Component Analysis (PCA), NMF is an ideal method for handling single-cell data with ambiguous zeros due to its strictly positive imputation of signal. While single-cell datasets contain many ambiguous zero counts, most analysis pipelines apply SVD or PCA on transformed counts because these implementations are fast and current NMF implementations are slow. We present an accessible NMF implementation that is much faster than PCA and rivals the runtimes of state-of-the-art SVD. NMF models learned with our implementation from raw count matrices yield intuitive summaries of complex biological processes, capturing coordinated gene activity and enrichment of sample metadata. Our NMF implementation, available in the RcppML (Rcpp Machine Learning library) R package, improves upon current NMF implementations by introducing a scaling diagonal to enable convex L1 regularization for feature engineering, reproducible factor scalings, and symmetric factorizations. RcppML NMF easily handles sparse datasets with millions of samples, making NMF an attractive replacement for PCA in the analysis of single-cell experiments.


“…The non‐negative constraint is particularly useful for facilitating the interpretation of latent factors within spectral data. The majority of NMF methods are iterative and converge to a local minima; however, the initialisation of the algorithm is important in determining the outputs, and random initialisation can influence the convergence and stability of the final solution [31]. Herein, we used a method shown to generate sparse initial factors [25], although other approaches are available [31].…”

Section: Results (mentioning)

confidence: 99%

Non‐negative matrix factorisation of Raman spectra finds common patterns relating to neuromuscular disease across differing equipment configurations, preclinical models and human tissue

Alix

Plesia

Schooling

et al. 2022

J Raman Spectroscopy


Raman spectroscopy shows promise as a biomarker for complex nerve and muscle (neuromuscular) diseases. To maximise its potential, several challenges remain. These include the sensitivity to different instrument configurations, translation across preclinical/human tissues and the development of multivariate analytics that can derive interpretable spectral outputs for disease identification. Nonnegative matrix factorisation (NMF) can extract features from high-dimensional data sets and the nonnegative constraint results in physically realistic outputs. In this study, we have undertaken NMF on Raman spectra of muscle obtained from different clinical and preclinical settings. First, we obtained and combined Raman spectra from human patients with mitochondrial disease and healthy volunteers, using both a commercial microscope and in-house fibre optic probe. NMF was applied across all data, and spectral patterns common to both equipment configurations were identified. Linear


“…Our implementation of NMF is a modification of the implementation found in Python's Scikit-Learn package (41). We manually initialize W and H, as opposed to the automated method found in the package, due to the wide variety of initialization methods possible for NMF (42,43), and to maintain consistent, precise, and easily reproduceable control over the initial conditions of our models. For the analyses found in this paper, we apply non-negative dual singular value decomposition (nndsvd), a consistent and efficient initialization method (44).…”

Section: Dimensionality Reduction (DR) Methods (mentioning)

confidence: 99%

Non-Negative Matrix Factorization for Analyzing State Dependent Neuronal Network Dynamics in Calcium Recordings

Carbonero

Noueihed

Kramer

et al. 2023

Preprint


Calcium imaging allows recording from hundreds of neurons in vivo with the ability to resolve single cell activity. Evaluating and analyzing neuronal responses, while also considering all dimensions of the data set to make specific conclusions, is extremely difficult. Often, descriptive statistics are used to analyze these forms of data. These analyses, however, remove variance by averaging the responses of single neurons across recording sessions, or across combinations of neurons to create single quantitative metrics, losing the temporal dynamics of neuronal activity, and their responses relative to each other. Dimensionality Reduction (DR) methods serve as a good foundation for these analyses because they reduce the dimensions of the data into components, while still maintaining the variance. Non-negative Matrix Factorization (NMF) is an especially promising DR analysis method for calcium imaging because of its mathematical constraints, which include positivity and linearity. We adapt NMF for our analyses and compare its performance to alternative dimensionality reduction methods on both artificial and in vivo data. We find that NMF is well-suited for analyzing calcium imaging recordings, accurately capturing the underlying dynamics of the data, and outperforming alternative methods in common use.
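The citation statement for this entry describes initializing W and H manually with NNDSVD and passing them to a scikit-learn based NMF. A minimal sketch of that idea follows; it is not the citing authors' code. The helper name `nndsvd_init`, the toy matrix, and the rank `k = 4` are assumptions made for illustration, and the helper is a plain NumPy rendering of the basic NNDSVD scheme (zeros are left in place and no noise is added).

```python
import numpy as np
from sklearn.decomposition import NMF


def nndsvd_init(X, k):
    """Hypothetical helper: basic NNDSVD initial factors for a nonnegative X."""
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    W = np.zeros((X.shape[0], k))
    H = np.zeros((k, X.shape[1]))
    # The leading singular vectors of a nonnegative matrix can be taken
    # nonnegative; abs() removes the arbitrary sign returned by the SVD.
    W[:, 0] = np.sqrt(S[0]) * np.abs(U[:, 0])
    H[0, :] = np.sqrt(S[0]) * np.abs(Vt[0, :])
    for j in range(1, k):
        x, y = U[:, j], Vt[j, :]
        # Split each singular vector into its positive and negative parts.
        xp, xn = np.maximum(x, 0), np.maximum(-x, 0)
        yp, yn = np.maximum(y, 0), np.maximum(-y, 0)
        xp_n, xn_n = np.linalg.norm(xp), np.linalg.norm(xn)
        yp_n, yn_n = np.linalg.norm(yp), np.linalg.norm(yn)
        # Keep whichever sign pattern carries more of the rank-one term.
        if xp_n * yp_n >= xn_n * yn_n:
            u, v, un, vn = xp, yp, xp_n, yp_n
        else:
            u, v, un, vn = xn, yn, xn_n, yn_n
        sigma = un * vn
        if sigma > 0:  # otherwise leave this component at zero
            W[:, j] = np.sqrt(S[j] * sigma) * u / un
            H[j, :] = np.sqrt(S[j] * sigma) * v / vn
    return W, H


# Toy nonnegative data (assumption: 30 samples, 20 features, rank 4).
rng = np.random.default_rng(0)
X = rng.random((30, 20))
k = 4

W0, H0 = nndsvd_init(X, k)
init_err = np.linalg.norm(X - W0 @ H0)

# init="custom" tells scikit-learn to start from the supplied W and H.
# Copies are passed so the initial factors stay available for inspection.
model = NMF(n_components=k, init="custom", solver="cd", max_iter=500)
W = model.fit_transform(X, W=W0.copy(), H=H0.copy())
H = model.components_
```

Because the initial factors come from a deterministic SVD of the data rather than from a random draw, repeated runs start from the same point, which is the reproducibility property the statement emphasizes. The exact zeros produced by basic NNDSVD suit the coordinate descent solver used here; multiplicative updates cannot move an entry away from zero, so variants that fill the zeros are usually preferred for that solver.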
