Multi-Approach Bioinformatics Analysis of Curated Omics Data Provides a Gene Expression Panorama for Multiple Cancer Types

Feltes, Bruno César; Poloni, Joice de Faria; Nunes, Itamar José Guimarães; Faria, Sara Socorro; Dorn, Márcio

doi:10.3389/fgene.2020.586602

Cited by 21 publications

(15 citation statements)

References 74 publications

“…Furthermore, low METTL7A expression and high METTL7B expression are associated with worse overall and progressive survival in various tumors, including LUAD (Figure 3). These data are in congruence with recent reports on the clinical significance of METTL7A and 7B in human cancer 12,[48][49][50] . METTL7A is reported as an integral membrane protein anchored into the endoplasmic reticulum membrane and has roles in lipid droplet formation 51,52 .…”

Section: Discussion (supporting)

confidence: 91%

Bioinformatic analysis of methyltransferase-like protein family reveals clinical and functional outcomes in human cancer

Campeanu

Jiang

Liu

et al. 2021

Preprint

Human methyltransferase-like (METTL) proteins transfer methyl groups to nucleic acids, proteins, lipids, and other small molecules, subsequently playing important roles in various cellular processes. In this study, we performed integrated genomic, transcriptomic, proteomic, and clinicopathological analyses of 34 METTLs in a large cohort of primary tumor and cell line data. We identified a subset of METTL genes, notably METTL1, METTL7B, and NTMT1, with high frequencies of genomic amplification and/or up-regulation at both the mRNA and protein levels in a spectrum of human cancers. Higher METTL1 expression was associated with high-grade tumors and poor disease prognosis. Loss-of-function analysis in tumor cell lines indicated the biological importance of METTL1, an m7G methyltransferase, in cancer cell growth and survival. Furthermore, functional annotation and pathway analysis of METTL1-associated proteins revealed that, in addition to the METTL1 cofactor WDR4, RNA regulators and DNA packaging complexes may be functionally interconnected with METTL1 in human cancer. Finally, we generated a crystal structure model of the METTL1-WDR4 heterodimeric complex that provides the basis for further development of novel inhibitors. Our results provide a framework for further study of the functional consequences of METTL alterations in human cancer and for development of small inhibitors that target cancer-promoting METTLs.

“…Luo et al hypothesized that, since TEX11 is an X-linked gene, its differential expression may be a genetic cause that could explain the higher incidence of CRC in males. Methyltransferase-like protein 7A (METTL7A) belongs to the human methyltransferase-like protein family, and the low METTL7A expression has been associated to cancer aggressiveness and progression in various tumors, including CRC [50, 51, 52, 53]. Vascular endothelial growth factor A (VEGFA) and its receptors have been identified as major mediators of angiogenesis, which is crucial for tumor invasiveness [54].…”

Section: Discussion (mentioning)

confidence: 99%

A Gene-Based Machine Learning Classifier Associated to the Colorectal Adenoma—Carcinoma Sequence

et al. 2021

Colorectal cancer (CRC) carcinogenesis is generally the result of the sequential mutation and deletion of various genes; this is known as the normal mucosa–adenoma–carcinoma sequence. The aim of this study was to develop a predictor-classifier during the “adenoma-carcinoma” sequence using microarray gene expression profiles of primary CRC, adenoma, and normal colon epithelial tissues. Four gene expression profiles from the Gene Expression Omnibus database, containing 465 samples (105 normal, 155 adenoma, and 205 CRC), were preprocessed to identify differentially expressed genes (DEGs) between adenoma tissue and primary CRC. The feature selection procedure, using the sequential Boruta algorithm and Stepwise Regression, determined 56 highly important genes. K-Means methods showed that, using the selected 56 DEGs, the three groups were clearly separate. The classification was performed with machine learning algorithms such as Linear Model (LM), Random Forest (RF), k-Nearest Neighbors (k-NN), and Artificial Neural Network (ANN). The best classification method in terms of accuracy (88.06 ± 0.70) and AUC (92.04 ± 0.47) was k-NN. To confirm the relevance of the predictive models, we applied the four models on a validation cohort: the k-NN model remained the best model in terms of performance, with 91.11% accuracy. Among the 56 DEGs, we identified 17 genes with an ascending or descending trend through the normal mucosa–adenoma–carcinoma sequence. Moreover, using the survival information of the TCGA database, we selected six DEGs related to patient prognosis (SCARA5, PKIB, CWH43, TEX11, METTL7A, and VEGFA). The six-gene-based classifier described in the current study could be used as a potential biomarker for the early diagnosis of CRC.
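The abstract above reports k-NN as the best-performing classifier for separating normal, adenoma, and CRC samples. As a rough illustration of how k-nearest-neighbour voting and accuracy scoring work, here is a minimal standard-library sketch. The two-feature data points, the function names `knn_predict` and `accuracy`, and the choice of `k=3` are all assumptions made for illustration; they are not taken from the study, which used 56 selected genes and its own software stack.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest training points."""
    # Rank training samples by Euclidean distance to the query point.
    ranked = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Invented toy "expression" values for two hypothetical genes (not real data).
train_X = [
    (1.0, 1.0), (1.2, 0.9), (0.8, 1.1),   # normal
    (3.0, 3.0), (3.2, 2.9), (2.8, 3.1),   # adenoma
    (5.0, 1.0), (5.2, 0.9), (4.8, 1.1),   # CRC
]
train_y = ["normal"] * 3 + ["adenoma"] * 3 + ["CRC"] * 3

test_X = [(1.1, 1.0), (3.1, 3.0), (5.1, 1.0)]
test_y = ["normal", "adenoma", "CRC"]

preds = [knn_predict(train_X, train_y, x, k=3) for x in test_X]
print(preds, accuracy(test_y, preds))
```

In practice the value of `k` and the distance metric would be tuned by cross-validation, and the features would usually be scaled first; the sketch omits both steps.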

“…Class imbalance is common in many real-world applications and affects the quality and reliability of ML approaches (Leevy et al, 2018; Johnson & Khoshgoftaar, 2019; López et al, 2013). Most importantly, class imbalance is the reality of almost all biological datasets, as we demonstrated in previous works after the manual curation of more than 30.000 cancer datasets (Feltes et al, 2019; Feltes, Poloni & Dorn, 2021; Feltes et al, 2020). Imbalanced data refers to classification problems where we have an unequal number of instances for different classes.…”

Section: Introduction (mentioning)

confidence: 89%

“…Among several ML applications in real-world situations, classification tasks stand up as one of the most relevant applications, ranging from classification of types of plants and animals to the identification of different diseases prognoses, such as cancer (Feltes et al, 2019; Feltes, Poloni & Dorn, 2021; Feltes et al, 2020;, H1N1 Flu (Chaurasia & Dixit, 2021), Dengue (Zhao et al, 2020), and COVID-19 (Table 5). The use of these algorithms in the context of hemogram data from COVID-19 patients is summarized in Table 5.…”

Section: Machine Learning Approaches (mentioning)

confidence: 99%

Comparison of machine learning techniques to handle imbalanced COVID-19 CBC datasets

Dorn

Grisci

Narloch

et al. 2021

PeerJ Computer Science

Self Cite

The Coronavirus pandemic caused by the novel SARS-CoV-2 has significantly impacted human health and the economy, especially in countries struggling with financial resources for medical testing and treatment, such as Brazil’s case, the third most affected country by the pandemic. In this scenario, machine learning techniques have been heavily employed to analyze different types of medical data, and aid decision making, offering a low-cost alternative. Due to the urgency to fight the pandemic, a massive amount of works are applying machine learning approaches to clinical data, including complete blood count (CBC) tests, which are among the most widely available medical tests. In this work, we review the most employed machine learning classifiers for CBC data, together with popular sampling methods to deal with the class imbalance. Additionally, we describe and critically analyze three publicly available Brazilian COVID-19 CBC datasets and evaluate the performance of eight classifiers and five sampling techniques on the selected datasets. Our work provides a panorama of which classifier and sampling methods provide the best results for different relevant metrics and discuss their impact on future analyses. The metrics and algorithms are introduced in a way to aid newcomers to the field. Finally, the panorama discussed here can significantly benefit the comparison of the results of new ML algorithms.
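The abstract above mentions sampling methods for dealing with class imbalance but does not name the five techniques evaluated. As one simple illustration of the general idea, the sketch below shows random oversampling, which duplicates randomly chosen minority-class samples until every class matches the size of the largest one. Choosing random oversampling here is an assumption, as are the function name `random_oversample` and the toy data; this is not presented as the authors' method.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate minority-class samples at random until all classes are the same size.

    Illustrative helper (hypothetical name); returns new lists and leaves inputs untouched.
    """
    rng = random.Random(seed)          # fixed seed keeps the sketch reproducible
    counts = Counter(y)
    target = max(counts.values())      # size of the largest class
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):    # no-op for the majority class
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


# Invented toy dataset: 8 negative and 2 positive samples (not real CBC data).
X = [[i] for i in range(10)]
y = ["neg"] * 8 + ["pos"] * 2

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

Oversampling should be applied only to the training split, after the train/test division, so that duplicated samples do not leak into the evaluation data.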
