2014
DOI: 10.1186/1471-2105-15-315

A random set scoring model for prioritization of disease candidate genes using protein complexes and data-mining of GeneRIF, OMIM and PubMed records

Abstract: Background: Prioritizing genetic variants is a challenge because disease susceptibility loci are often located in genes of unknown function or the relationship with the corresponding phenotype is unclear. A global data-mining exercise on the biomedical literature can establish the phenotypic profile of genes with respect to their connection to disease phenotypes. The importance of protein-protein interaction networks in the genetic heterogeneity of common diseases or complex traits is becoming increasingly recog…

Cited by 10 publications (3 citation statements)
References: 53 publications
“…Finally, LM-PCR libraries were quantified, size-selected by gel extraction, checked by capillary electrophoresis (2100 Bioanalyzer Instrument, Agilent Technologies), and tagged with sample-specific dual indexes and Illumina sequencing adapters (Nextera XT Index Kit, Illumina) before being sequenced to saturation on the MiSeq instrument (Illumina). The resulting genomic sequences were demultiplexed by sample-specific tag and bioinformatically processed, trimmed by Skewer [68] and mapped on the human genome (February 2009, GRCh37/hg19) by Bowtie2 software [69]. During the mapping step, a quality score was assigned to each insertion site based on the total number of trimmed reads collapsed in that genomic position (read count) and the mean read quality.…”
Section: Methods (mentioning)
confidence: 99%
“…Literature in the biomedical domain, as a significant addition to experimental data, has been broadly used by researchers for the inference of gene regulatory network [6], analysis of the relationship between drugs, genes and diseases, and other biomedical research purposes. For example, researchers inferred disease-disease associations [7] from PubMed abstracts and biological pathways and used large-scale knowledge-bases such as the Online Mendelian Inheritance in Man (OMIM) to find the disease-causing genes [8, 9].…”
Section: Introduction (mentioning)
confidence: 99%
“…Recently, some computational approaches have been proposed to prioritize candidate disease genes from Protein-Protein Interaction (PPI) network [5][6][7]. The PPI network is one of the most important biological networks which has been widely used to predict protein functions [8][9][10], detect protein complexes [11,12], identify essential proteins or genes [13,14], and discover network motifs [15].…”
Section: Introduction (mentioning)
confidence: 99%