A Qualitative Modeling Approach for Whole Genome Prediction Using High-Throughput Toxicogenomics Data and Pathway-Based Validation

Haider, Saad; Black, Michael B.; Parks, Bethany; Foley, Briana; Wetmore, Barbara A.; Andersen, Melvin E.; Clewell, Rebecca A.; Mansouri, Kamel; McMullen, Patrick D.

doi:10.3389/fphar.2018.01072

Cited by 10 publications

(4 citation statements)

References 35 publications

“…It worth nothing that about 33.7% of genes are shared between both signatures. Even though some differences can be realized between L1000 and S1500, they are both strong candidates of gene expression modeling and prediction (Haider et al 2018).…”

Section: Manuscript To Be Reviewed (mentioning)

confidence: 99%

Peer Review #1 of "T1000: a reduced gene set prioritized for toxicogenomic studies (v0.2)"

2019

There is growing interest within regulatory agencies and toxicological research communities to develop, test, and apply new approaches, such as toxicogenomics, to more efficiently evaluate chemical hazards. Given the complexity of analyzing thousands of genes simultaneously, there is a need to identify reduced gene sets. Though several gene sets have been defined for toxicological applications, few of these were purposefully derived using toxicogenomics data. Here, we developed and applied a systematic approach to identify 1000 genes (called Toxicogenomics-1000 or T1000) highly responsive to chemical exposures. First, a co-expression network of 11,210 genes was built by leveraging microarray data from the Open TG-GATEs program. This network was then reweighted based on prior knowledge of their biological (KEGG, MSigDB) and toxicological (CTD) relevance. Finally, weighted correlation network analysis was applied to identify 258 gene clusters. T1000 was defined by selecting genes from each cluster that were most associated with outcome measures. For model evaluation, we compared the performance of T1000 to that of other gene sets (L1000, S1500, Genes selected by Limma, and random set) using two external datasets based on the rat model. Additionally, a smaller (T384) and a larger version (T1500) of T1000 were used for dose-response modeling to test the effect of gene set size. Our findings demonstrated that the T1000 gene set is predictive of apical outcomes across a range of conditions (e.g., in vitro and in vivo, dose-response, multiple species, tissues, and chemicals), and generally performs as well, or better than other gene sets available.

“…Nevertheless, these methods rely on arbitrary definitions of the pathways and functional sets, which might differ depending on the selected database, and tend to hide functional processes spanning several pathways [11, 12]. Many studies [13–16] aim to take advantage of published gene expression data available in databases such as DrugMatrix [17], Connectivity Map [18, 19], ToxicoDB [20] and Open TG-GATEs [21] (https://toxico.nibiohn.go.jp) to improve chemical toxicity assessment. For instance, Heusinkveld et al [22] implemented an approach based on the comparison of Open TG-GATEs top 50 DEG signatures ranked according to their t-statistic.…”

Section: Introduction (mentioning)

confidence: 99%

A strategy to detect metabolic changes induced by exposure to chemicals from large sets of condition-specific metabolic models computed with enumeration techniques

Fresnais, Perin, Riu et al. 2024

BMC Bioinformatics

Background: The growing abundance of in vitro omics data, coupled with the necessity to reduce animal testing in the safety assessment of chemical compounds and even eliminate it in the evaluation of cosmetics, highlights the need for adequate computational methodologies. Data from omics technologies allow the exploration of a wide range of biological processes, therefore providing a better understanding of mechanisms of action (MoA) related to chemical exposure in biological systems. However, the analysis of these large datasets remains difficult due to the complexity of modulations spanning multiple biological processes.

Results: To address this, we propose a strategy to reduce information overload by computing, based on transcriptomics data, a comprehensive metabolic sub-network reflecting the metabolic impact of a chemical. The proposed strategy integrates transcriptomic data to a genome scale metabolic network through enumeration of condition-specific metabolic models hence translating transcriptomics data into reaction activity probabilities. Based on these results, a graph algorithm is applied to retrieve user readable sub-networks reflecting the possible metabolic MoA (mMoA) of chemicals. This strategy has been implemented as a three-step workflow. The first step consists in building cell condition-specific models reflecting the metabolic impact of each exposure condition while taking into account the diversity of possible optimal solutions with a partial enumeration algorithm. In a second step, we address the challenge of analyzing thousands of enumerated condition-specific networks by computing differentially activated reactions (DARs) between the two sets of enumerated possible condition-specific models. Finally, in the third step, DARs are grouped into clusters of functionally interconnected metabolic reactions, representing possible mMoA, using the distance-based clustering and subnetwork extraction method. The first part of the workflow was exemplified on eight molecules selected for their known human hepatotoxic outcomes associated with specific MoAs well described in the literature and for which we retrieved primary human hepatocytes transcriptomic data in Open TG-GATEs. Then, we further applied this strategy to more precisely model and visualize associated mMoA for two of these eight molecules (amiodarone and valproic acid). The approach proved to go beyond gene-based analysis by identifying mMoA when few genes are significantly differentially expressed (2 differentially expressed genes (DEGs) for amiodarone), bringing additional information from the network topology, or when very large number of genes were differentially expressed (5709 DEGs for valproic acid). In both cases, the results of our strategy well fitted evidence from the literature regarding known MoA. Beyond these confirmations, the workflow highlighted potential other unexplored mMoA.

Conclusion: The proposed strategy allows toxicology experts to decipher which part of cellular metabolism is expected to be affected by the exposition to a given chemical. The approach originality resides in the combination of different metabolic modelling approaches (constraint based and graph modelling). The application to two model molecules shows the strong potential of the approach for interpretation and visual mining of complex omics in vitro data. The presented strategy is freely available as a python module (https://pypi.org/project/manamodeller/) and jupyter notebooks (https://github.com/LouisonF/MANA).

“…Nevertheless, these methods rely on arbitrary definitions of the pathways and functional sets, which might differ depending on the selected database, and tend to hide functional processes spanning several pathways [11, 12]. Many studies [13–16] aim to take advantage of published gene expression data available in databases such as DrugMatrix [17], Connectivity Map [18, 19], ToxicoDB [20] and Open TG-GATEs [21] (https://toxico.nibiohn.go.jp) to improve chemical toxicity assessment. For instance, Heusinkveld et al [22] implemented an approach based on the comparison of Open TG-GATEs top 50 DEG signatures ranked according to their t-statistic.…”

Section: Introduction (mentioning)

confidence: 99%

A strategy to detect metabolic changes induced by exposure to chemicals from large sets of condition-specific metabolic models computed with enumeration techniques

Fresnais, Périn, Riu et al. 2023

Preprint

The growing abundance of in vitro omics data, coupled with the necessity to reduce animal testing in the safety assessment of chemical compounds and even eliminate it in the evaluation of cosmetics, highlights the need for abundant computational methodologies. Data from omics technologies allow the exploration of a wide range of biological processes, therefore providing a better understanding of mechanisms of action (MoA) related to chemical exposure in biological systems. However, the analysis of these large datasets remains difficult due to the complexity of modulations spanning multiple biological processes. To address this, we propose a new computational workflow that combines knowledge on endogenous metabolism from a genome scale metabolic network (GSMN) and in vitro transcriptomics data with the aim of better identifying the metabolic MoA (mMoA) of chemicals. Our workflow proceeds in three main steps. The first step consists of building cell condition-specific models representing the metabolic impact of each exposure condition while taking into account the diversity of possible optimal solutions with a partial enumeration algorithm. In a second step, based on these enumerations, two conditions can be compared by extracting differentially activated reactions (DARs) between the two sets of enumerated possible condition-specific models. Finally, in the third step, DARs are grouped into clusters of functionally interconnected metabolic reactions using the distance-based clustering and subnetwork extraction method. The first part of the workflow was exemplified on eight molecules selected for their known human hepatotoxic outcomes associated with specific MoAs well described in the literature and for which we retrieved primary human hepatocytes (PHH) transcriptomic data in Open TG-GATEs. Then, we applied this new workflow to model and visualize associated mMoA for two of these eight molecules (amiodarone and valproic acid). 
Despite large disparities in transcriptomic effects for these two chemicals, i.e., two differentially expressed genes (DEGs) for amiodarone vs 5709 DEGs for valproic acid, our results well fitted evidence from the literature regarding known MoA. Beyond these confirmations, the workflow highlighted potential other unexplored mMoA.
