Large-scale data resources such as the National Cancer Institute (NCI) Cancer Research Data Commons (CRDC) and the Genotype-Tissue Expression (GTEx) portal have the potential to simplify the analysis of cancer data by providing data that can be used as standards or controls. However, comparisons with data processed using different methodologies, or even different versions of software, parameters and supporting datasets, can lead to artefactual results. Reproducing the exact workflows from text-based standard operating procedures (SOPs) is problematic because the documentation can be incomplete or out of date, especially for complex workflows involving many executables and scripts. We extend the Biodepot-workflow-builder (Bwb) platform to distribute the computational methodology together with integrated data access to the NCI Genomic Data Commons (GDC). We have converted the GDC DNA sequencing (DNA-Seq) and GDC mRNA-Seq SOPs into reproducible, self-installing, containerized graphical workflows that users can apply to their custom datasets. Secure access to CRDC data is provided using the Data Commons Framework Services (DCFS) Gen3 protocol. Users can run the analysis on a laptop or desktop, or use their preferred cloud provider to access the computational and network resources available on the cloud. We demonstrate the impact of non-uniform analysis of control and treatment data on the inference of differentially expressed genes. Most importantly, we also provide a dynamic and practical solution for uniform and reproducible reprocessing of omics data, allowing cancer researchers to take full advantage of multiple data resources such as the CRDC and GTEx.
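The effect of non-uniform processing on differential expression can be illustrated with a toy sketch. It is not the Bwb or GDC workflow: the gene names, expression values, per-gene bias factors and the `quantify`, `log2_fold_changes` and `called_de` helpers are all hypothetical, chosen only to show how a pipeline mismatch alone can produce apparent differential expression when the underlying biology is identical.

```python
import math

# Hypothetical "true" expression, identical in control and treatment samples.
true_expr = {"GENE_A": 100.0, "GENE_B": 250.0, "GENE_C": 40.0, "GENE_D": 900.0}

# Hypothetical per-gene quantification bias of two pipeline versions
# (for example, different annotation releases or aligner parameters).
pipeline_v1 = {"GENE_A": 1.0, "GENE_B": 1.0, "GENE_C": 1.0, "GENE_D": 1.0}
pipeline_v2 = {"GENE_A": 1.0, "GENE_B": 0.4, "GENE_C": 2.5, "GENE_D": 1.1}


def quantify(expr, bias):
    """Return the expression values as reported by a pipeline with the given bias."""
    return {gene: expr[gene] * bias[gene] for gene in expr}


def log2_fold_changes(control, treatment):
    """Per-gene log2(treatment / control)."""
    return {gene: math.log2(treatment[gene] / control[gene]) for gene in control}


def called_de(lfc, threshold=1.0):
    """Genes whose absolute log2 fold change meets the threshold."""
    return sorted(gene for gene, value in lfc.items() if abs(value) >= threshold)


# Uniform processing: both groups go through the same pipeline version.
uniform = called_de(
    log2_fold_changes(quantify(true_expr, pipeline_v2), quantify(true_expr, pipeline_v2))
)

# Non-uniform processing: control through v1, treatment through v2.
mixed = called_de(
    log2_fold_changes(quantify(true_expr, pipeline_v1), quantify(true_expr, pipeline_v2))
)

print("uniform:", uniform)
print("mixed:", mixed)
```

In this sketch the uniformly processed comparison reports no differentially expressed genes, while the mixed comparison flags two genes purely because of the assumed pipeline bias. Real data also carries biological and technical noise, which this example deliberately leaves out.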