A novel computational framework for simultaneous integration of multiple types of genomic data to identify microRNA-gene regulatory modules

Zhang, Shihua; Li, Qingjiao; Liu, Juan; Zhou, Xianghong Jasmine

doi:10.1093/bioinformatics/btr206

Cited by 225 publications

(213 citation statements)

References 55 publications

Supporting

Mentioning

212

Contrasting

“…Another joint variable and rank selection method (29) uses $\ell_2$ group penalty on the rows of C. However, this algorithm along with another recently proposed method (30) can reduce dimension in predictor space but not response space and does not provide information on independent regulatory programs. A sparse network-regularized multiple nonnegative matrix factorization (SNMNMF), which incorporates the known interaction from literature as a prior information in the parameter estimations, was recently proposed for the inference of miRNA-gene regulation (31). However, SNMNMF identifies only the coexistence relationship between predictors and responses without the estimation of the relative strength or direction of regulation.…”

Section: Results (mentioning)

confidence: 99%

“…Here $\|\cdot\|_F$ is the Frobenius norm. Conditional on U, our estimation procedure with a hard-thresholding function is equivalent to an $\ell_0$ penalization (31), and hence without the orthogonal constraints, the number of nonzero entries in the final estimate of $V_v$ can be easily shown to be an unbiased estimator of the degrees of freedom of the $\ell_0$ penalization. Therefore, we estimate $df_v$ by $\widehat{df}_v$ = #(V…”

Section: T-SVD Model With the SVD Representation of the Coefficient (mentioning)

confidence: 99%
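The idea in the statement above, that hard thresholding zeroes small entries and the count of surviving nonzero entries serves as a degrees-of-freedom estimate, can be illustrated with a minimal sketch. The vector values, the threshold `lam`, and the helper name `hard_threshold` are illustrative assumptions and are not taken from the cited paper.

```python
import numpy as np

def hard_threshold(v, lam):
    """Set entries with absolute value <= lam to zero; keep the rest unchanged."""
    out = np.asarray(v, dtype=float).copy()
    out[np.abs(out) <= lam] = 0.0
    return out

# Hypothetical weight vector and threshold, chosen only for illustration.
v = np.array([0.9, -0.05, 0.0, 0.4, -0.6, 0.02])
v_thr = hard_threshold(v, lam=0.1)

# Number of nonzero entries after thresholding (the l0 "norm" of v_thr),
# used here as the degrees-of-freedom count described in the statement.
df_v = int(np.count_nonzero(v_thr))

# Frobenius norm of a small matrix, for reference: sqrt of the sum of squares.
A = np.array([[3.0, 0.0], [0.0, 4.0]])
fro = np.linalg.norm(A, ord="fro")
```

Hard thresholding keeps the surviving entries at their original magnitude, unlike soft thresholding, which also shrinks them toward zero.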

Learning regulatory programs by threshold SVD regression

Ma, Xiao, Wong. 2014. Proc. Natl. Acad. Sci. U.S.A.

We formulate a statistical model for the regulation of global gene expression by multiple regulatory programs and propose a thresholding singular value decomposition (T-SVD) regression method for learning such a model from data. Extensive simulations demonstrate that this method offers improved computational speed and higher sensitivity and specificity over competing approaches. The method is used to analyze microRNA (miRNA) and long noncoding RNA (lncRNA) data from The Cancer Genome Atlas (TCGA) consortium. The analysis yields previously unidentified insights into the combinatorial regulation of gene expression by noncoding RNAs, as well as findings that are supported by evidence from the literature.

regulatory program | SVD | sparse | multivariate | regression

The development of microarray and next-generation sequencing technologies has enabled rapid quantification of various genome-wide features (DNA sequences, gene expressions, noncoding RNA expressions, methylation, etc.) in a population of samples (1, 2). Large consortia have compiled genetic and molecular profiling data in an enormous number of tumors across hundreds of samples (3, 4). A common challenge arising from these large-scale genomic studies is the inference of regulatory relationships between different genome-wide measurements from the complex biological systems where the number of predictors and responses often far exceeds the sample size.

To formulate a statistical model for such regulatory relations, consider the situation depicted in Fig. 1 (see Fig. S1 for more detailed illustration of the model schema), where p regulators $x = (x_1, \ldots, x_p)$ regulate q responses $y = (y_1, \ldots, y_q)$ through r regulatory programs that are represented by hidden nodes, e.g., $h_1, \ldots, h_r$. The activity $h_j$ of the jth program depends on the regulators connected to hidden node j, and $h_j$ in turn affects the level of the responses that are connected to node j.

To express this model mathematically, we denote by $u_j$ and $v_j$ the unit vectors corresponding respectively to the input weights $\{a_{ij},\ i = 1, \ldots, p\}$ and the output weights $\{b_{jk},\ k = 1, \ldots, q\}$ of the jth program. Then the regulatory relations are represented as $h_j = \sigma(x u_j)$ and $y = \sum_{j=1}^{r} d_j h_j v_j'$, where x, y are regarded as row vectors, u, v are regarded as column vectors, and $\sigma(\cdot)$ is a sigmoidal function. The aforementioned is a standard single-layer neural network model that is widely used in predictive modeling but could be impossible to learn in biological studies where sample size n is much smaller than p or q. Thus, we first simplify the model by taking $\sigma$ to be the identity function. Then our model becomes $h_j = x u_j$, $y = \sum_{j=1}^{r} d_j h_j v_j'$. We make the biologically plausible assumption that only a small subset of regulators is contributing to any program and that each program regulates only a small subset of responses. Under this assumption, $u_j$ and $v_j$ are sparse vectors in $\mathbb{R}^p$ and $\mathbb{R}^q$, respectively. The magnitude of the output weight vector (denoted by $d_j$) repr...
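The simplified linear model in the excerpt above (identity activation, so h_j = x u_j and y = Σ_j d_j h_j v_j′) can be sketched numerically. This is a noise-free illustration only, assuming NumPy; the dimensions, the supports of the sparse vectors, the values of d, and the helper name `sparse_unit_vector` are all hypothetical and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q, r = 20, 8, 6, 2  # illustrative sizes: samples, regulators, responses, programs

def sparse_unit_vector(dim, support):
    """Unit-norm vector that is nonzero only on the given index set."""
    v = np.zeros(dim)
    v[support] = rng.normal(size=len(support))
    return v / np.linalg.norm(v)

# Sparse input weights u_j (columns of U) and output weights v_j (columns of V).
U = np.column_stack([sparse_unit_vector(p, [0, 1]),
                     sparse_unit_vector(p, [4, 5, 6])])
V = np.column_stack([sparse_unit_vector(q, [0, 2]),
                     sparse_unit_vector(q, [3, 5])])
d = np.array([3.0, 1.5])  # program strengths d_j (assumed values)

X = rng.normal(size=(n, p))  # each row is one sample's regulator profile x
H = X @ U                    # hidden program activities: h_j = x u_j
Y = (H * d) @ V.T            # responses: y = sum_j d_j h_j v_j'

# Equivalent coefficient matrix C = sum_j d_j u_j v_j', so that Y = X C.
C = (U * d) @ V.T
rank_C = np.linalg.matrix_rank(C)
```

Written this way, the p × q coefficient matrix C has rank at most r, and its nonzero entries are confined to the rows and columns selected by the sparse u_j and v_j.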

“…In order to evaluate the performance of MBCFM and fairly compare it with the other two existing methods of Mirsynergy and SNMNMF in miRNA regulatory modules detection, we apply these three methods to the ovarian cancer dataset processed by Zhang et al [10]. The miRNA and mRNA expression profiles for 385 samples were downloaded from TCGA data portal (http://cancergenome.nih.gov/), each measuring 559 miRNAs and 12456 mRNAs, respectively.…”

Section: Methods (mentioning)

confidence: 99%

“…We believe that the mRNAs in R-pair structure have more functional consistency than the mRNAs that are not. In fact, there are several studies showing that miRNA tends to target highly connected mRNAs or proteins in PPI networks, and that the R-pair structure plays important roles in cell function [9][10][11]13]. So, the R-pair structure can be regarded as the core of a CMFM.…”

Section: Definition 1 R-pair (mentioning)

confidence: 99%

“…For example, SNMNMF (Sparse Network-regularized Multiple Non-negative Matrix Factorization) proposed by Zhang et al [10] described a factorized matrix framework to identify composite miRNA functional modules by integrating miRNA and gene expression profiles, the protein-protein interactions (PPIs) and transcription factor binding sites. Another method called Mirsynergy, which was proposed by Li et al [11], detected miRNA regulatory modules based on synergistic scores between two miRNAs.…”

Section: Introduction (mentioning)

confidence: 99%

Detecting Composite Functional Module in miRNA Regulation and mRNA Interaction Network

Yang, Chu. 2017. Algorithms.

Abstract: The detection of composite miRNA functional module (CMFM) is of tremendous significance and helps in understanding the organization, regulation and execution of cell processes in cancer, but how to identify functional CMFMs is still a computational challenge. In this paper we propose a novel module detection method called MBCFM (detecting Composite Function Modules based on Maximal Biclique enumeration), specifically designed to bicluster miRNAs and target messenger RNAs (mRNAs) on the basis of multiple biological interaction information and topical network features. In this method, we employ algorithm MICA to enumerate all maximal bicliques and further extract R-pairs from the miRNA-mRNA regulatory network. Compared with two existing methods, Mirsynergy and SNMNMF on ovarian cancer dataset, the proposed method of MBCFM is not only able to extract cohesiveness-preserved CMFMs but also has high efficiency in running time. More importantly, MBCFM can be applied to detect other cancer-associated miRNA functional modules.
