Network-based analysis of omics data: the LEAN method

Gwinner, Frederik; Boulday, Gwénola; Vandiedonck, Claire; Arnould, Minh; Cardoso, Cécile; Nikolayeva, Iryna; Guitart-Pla, Oriol; Denis, Cécile V.; Olivier, Christophe; Beghain, Johann; Tournier‐Lasserve, Elisabeth; Schwikowski, Benno

doi:10.1093/bioinformatics/btw676

Cited by 23 publications

(19 citation statements)

References 34 publications

“…To ensure comparability between our method and LEAN, we use the same network and expression data for inputs to LEAN that we used for GeneSurrounder. Again, we consider each pair of the three studies and calculate the correlation between our results and the correlation between results of LEAN [19] (which is available as an R package on CRAN). The results are given in Table 4.…”
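The cross-study comparison described in this excerpt can be sketched as follows. This is a minimal illustration, not the cited authors' code: the excerpt does not say which correlation coefficient was used, so Pearson correlation is an assumption here, and the study names, gene names, and scores are hypothetical toy values.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(scores_by_study):
    """Correlate per-gene scores for every pair of studies.

    scores_by_study maps a study name to a {gene: score} dict.
    Only genes scored in both studies of a pair are compared.
    """
    result = {}
    for s1, s2 in combinations(sorted(scores_by_study), 2):
        shared = sorted(set(scores_by_study[s1]) & set(scores_by_study[s2]))
        x = [scores_by_study[s1][g] for g in shared]
        y = [scores_by_study[s2][g] for g in shared]
        result[(s1, s2)] = pearson(x, y)
    return result


# Hypothetical per-gene scores from one method applied to three studies.
toy_scores = {
    "study1": {"g1": 1.0, "g2": 2.0, "g3": 3.0},
    "study2": {"g1": 2.0, "g2": 4.0, "g3": 6.0},
    "study3": {"g1": 3.0, "g2": 2.0, "g3": 1.0},
}
correlations = pairwise_correlations(toy_scores)
for pair in sorted(correlations):
    print(pair, round(correlations[pair], 3))
```

Running the same function on the per-gene output of two different methods gives one set of pairwise correlations per method, which is the kind of side-by-side comparison the excerpt describes.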

Section: Results (mentioning)

confidence: 99%

“…Our analysis technique addresses these shortcomings by using the shortest direct distance on a global network and not requiring any prior biological knowledge. LEAN [19] considers interactions on a global interaction network and is closest to our method in this respect, but restricts its focus to nearest neighbors on the network and does not determine whether a putative disease gene is the source of change on the network.…”

Section: Discussion (mentioning)

confidence: 99%

“…Most recently, LEAN [19] was developed to use direct interactions on a global interaction network and find disease genes by scoring the differential expression of “local subnetworks.” LEAN scores each gene for disease-association according to the enrichment of its immediate neighbors. Thus, LEAN’s algorithm restricts its focus to a local subnetwork that only considers nearest neighbors.…”

Section: Introduction (mentioning)

confidence: 99%

GeneSurrounder: network-based identification of disease genes in expression data

Shah, Braun (2019). BMC Bioinformatics.

Background: A key challenge of identifying disease-associated genes is analyzing transcriptomic data in the context of regulatory networks that control cellular processes in order to capture multi-gene interactions and yield mechanistically interpretable results. One existing category of analysis techniques identifies groups of related genes using interaction networks, but these gene sets often comprise tens or hundreds of genes, making experimental follow-up challenging. A more recent category of methods identifies precise gene targets while incorporating systems-level information, but these techniques do not determine whether a gene is a driving source of changes in its network, an important characteristic when looking for potential drug targets.

Results: We introduce GeneSurrounder, an analysis method that integrates expression data and network information in a novel procedure to detect genes that are sources of dysregulation on the network. The key idea of our method is to score genes based on the evidence that they influence the dysregulation of their neighbors on the network in a manner that impacts cell function. Applying GeneSurrounder to real expression data, we show that our method is able to identify biologically relevant genes, integrate pathway and expression data, and yield more reproducible results across multiple studies of the same phenotype than competing methods.

Conclusions: Together these findings suggest that GeneSurrounder provides a new avenue for identifying individual genes that can be targeted therapeutically. The key innovation of GeneSurrounder is the combination of pathway network information with gene expression data to determine the degree to which a gene is a source of dysregulation on the network. By prioritizing genes in this way, our method provides insights into disease mechanisms and suggests diagnostic and therapeutic targets. Our method can be used to help biologists select among tens or hundreds of genes for further validation. The implementation in R is available at github.com/sahildshah1/gene-surrounder.

Electronic supplementary material: The online version of this article (10.1186/s12859-019-2829-y) contains supplementary material, which is available to authorized users.

“…LEAN searches altered “star” subnetworks, that is, subnetworks composed of one central node and all its interactors [13]. By imposing this restriction, LEAN can exhaustively test all such subnetworks (one per node).…”
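To make the "star" restriction concrete, the sketch below enumerates one candidate subnetwork per node from an undirected edge list. It only illustrates the enumeration step described in this excerpt. It does not reproduce LEAN's scoring or its R package interface, and the function name and toy network are hypothetical.

```python
from collections import defaultdict


def star_subnetworks(edges):
    """Map each node to its star: the node itself plus all direct interactors."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)
    return {center: {center} | nbrs for center, nbrs in neighbors.items()}


# Hypothetical toy interaction network (not from any cited dataset).
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
stars = star_subnetworks(toy_edges)
for center in sorted(stars):
    print(center, sorted(stars[center]))
```

Because each node contributes exactly one star, the number of candidates equals the number of nodes, which is what makes exhaustive testing feasible.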

Section: Methods (mentioning)

confidence: 99%

Biological networks and GWAS: comparing and combining network methods to understand the genetics of familial breast cancer susceptibility in the GENESIS study

Climente-González, Lonjou, Lesueur, et al. (2020). Preprint.

Systems biology provides a comprehensive approach to biomarker discovery and biological hypothesis building. Indeed, it allows to jointly consider the statistical association between gene variation and a phenotype, and the biological context of each gene, represented as a network. In this work, we study six network methods which identify subnetworks with high association scores to a phenotype. Specifically, we examine their utility to discover new biomarkers for breast cancer susceptibility by interrogating a genome-wide association study (GWAS) focused on French women with a family history of breast cancer and tested negative for pathogenic variants in BRCA1 and BRCA2. We perform an in-depth benchmarking of the methods with regards to size of the solution subnetwork, their utility as biomarkers, and the stability and the runtime of the methods. By trading statistical stringency for biological meaningfulness, most network methods give more compelling results than standard SNP- and gene-level analyses, recovering causal subnetworks tightly related to cancer susceptibility. For instance, we show a general alteration of the neighborhood of COPS5, a gene related to multiple hallmarks of cancer. Importantly, we find a significantly large overlap between the genes in the solution networks and the genes significantly associated in the largest GWAS on susceptibility to breast cancer. Yet, network methods are notably unstable, producing different results when the input data changes slightly. To account for that, we produce a stable consensus subnetwork, formed by the most consistently selected genes. The stable consensus is composed of 68 genes, enriched in known breast cancer susceptibility genes (BLM, CASP8, CASP10, DNAJC1, FGFR2, MRPS30, and SLC4A7, Fisher's exact test P-value = 3 × 10⁻⁴) and occupying more central positions in the network than average. The network seems organized around CUL3, encoding an ubiquitin ligase related protein that regulates the protein levels of several genes involved in cancer progression. In conclusion, this article shows the pertinence of network-based analyses to tackle known issues with GWAS, namely lack of statistical power and of interpretable solutions. Project-agnostic implementations of each of the network methods are available at https://github.com/hclimente/gwas-tools to facilitate their application to other GWAS datasets.

Networks: Gene network. The statistical frameworks of the different network methods are compatible with any type of network (protein interactions, gene coexpression, regulatory, etc.). Yet, we used protein-protein interaction networks (PPIN) for all of them except SConES, as they are interpretable, well characterized, and they were designed to run efficiently on networks of their size. We built our PPIN from both binary and co-complex interactions stored in the HINT database (release April 2019) [17]. Unless otherwise specified, we used only interactions coming from high-throughput experiments, leaving out targeted studies that might bias the topology of the networ...

“…An enrichment analysis based on the network of a pathway, rather than simply the gene set of the pathway, takes into consideration the interactions between the genes in the pathway. We use PathwayCommons network databases [8] and the local enrichment analysis (LEAN) method of Gwinner et al [9] for network analysis of target lists, and then assess the resulting output for biological pathway enrichment using a hypergeometric test. The results of the analysis are presented in the Metamatched database and are also included in this paper’s supplementary material as an R [10] archive file (S3 File).…”
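The hypergeometric test mentioned in this excerpt can be sketched in general form as below. This is the standard over-representation calculation, not the cited authors' pipeline. The function name, argument names, and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, pathway_size, selected, overlap):
    """Upper-tail hypergeometric p-value P(X >= overlap).

    population:   number of genes in the background set
    pathway_size: background genes annotated to the pathway
    selected:     genes in the list being tested
    overlap:      selected genes that are also in the pathway
    """
    upper = min(pathway_size, selected)
    total = comb(population, selected)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background genes, 5 in the pathway,
# 4 genes selected, 2 of which fall in the pathway.
p_value = hypergeom_enrichment_p(20, 5, 4, 2)
print(p_value)
```

A small p-value indicates that the overlap is larger than expected if the selected genes were drawn at random from the background. When many pathways are tested, a multiple-testing correction would normally follow.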

Section: Introduction (mentioning)

confidence: 99%

A meta-analysis of multiple matched aCGH/expression cancer datasets reveals regulatory relationships and pathway enrichment of potential oncogenes

Newton, Wernisch (2019). PLoS ONE.

The copy numbers of genes in cancer samples are often highly disrupted and form a natural amplification/deletion experiment encompassing multiple genes. Matched array comparative genomics and transcriptomics datasets from such samples can be used to predict inter-chromosomal gene regulatory relationships. Previously we published the database METAMATCHED, comprising the results from such an analysis of a large number of publicly available cancer datasets. Here we investigate genes in the database which are unusual in that their copy number exhibits consistent heterogeneous disruption in a high proportion of the cancer datasets. We assess the potential relevance of these genes to the pathology of the cancer samples, in light of their predicted regulatory relationships and enriched biological pathways. A network-based method was used to identify enriched pathways from the genes’ inferred targets. The analysis predicts both known and new regulator-target interactions and pathway memberships. We examine examples in detail, in particular the gene POGZ, which is disrupted in many of the cancer datasets and has an unusually large number of predicted targets, from which the network analysis predicts membership of cancer-related pathways. The results suggest close involvement in known cancer pathways of genes exhibiting consistent heterogeneous copy number disruption. Further experimental work would clarify their relevance to tumor biology. The results of the analysis presented in the database METAMATCHED, and included here as an R archive file, constitute a large number of predicted regulatory relationships and pathway memberships which we anticipate will be useful in informing such experiments.
