Ahmed Ibrahim Samir Khalil scite author profile

Background: Detection of DNA copy number alterations (CNAs) is critical to understanding genetic diversity, genome evolution and pathological conditions such as cancer. Cancer genomes are plagued with widespread multi-level structural aberrations of chromosomes, which make it challenging to discover CNAs of different length scales and distinct biological origins and functions. Although several computational tools are available to identify CNAs using the read depth (RD) signal, they fail to distinguish between large-scale and focal alterations because they model the RD signal of cancer genomes inaccurately. Additionally, the RD signal is affected by overdispersion-driven biases at low coverage, which significantly inflate false detection of CNA regions. Results: We have developed the CNAtra framework to hierarchically discover and classify 'large-scale' and 'focal' copy number gain/loss from a single whole-genome sequencing (WGS) sample. CNAtra first utilizes a multimodal-based distribution to estimate the copy number (CN) reference from the complex RD profile of the cancer genome. We implemented a Savitzky-Golay smoothing filter and Modified Varri segmentation to capture the change points of the RD signal. We then developed a CN state-driven merging algorithm to identify the large segments with distinct copy numbers. Next, we identified focal alterations in each large segment using coverage-based thresholding to mitigate the adverse effects of signal variations. Using cancer cell lines and patient datasets, we confirmed CNAtra's ability to detect and distinguish the segmental aneuploidies and focal alterations. We used realistic simulated data, in which we artificially introduced CNAs into the original cancer profiles, to benchmark the performance of CNAtra against other single-sample detection tools. We found that CNAtra is superior in terms of precision, recall and f-measure. CNAtra shows the highest sensitivity, 93% and 97%, for detecting large-scale and focal alterations respectively. Visual inspection of CNAs revealed that CNAtra is the most robust detection tool for low-coverage cancer data.
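The smoothing and change-point step described in this abstract can be illustrated with a minimal sketch. This is not the CNAtra implementation (which is in MATLAB): the function names `savgol5` and `change_points`, the 5-bin window and the `min_jump` threshold are all assumptions made for illustration, and the change-point rule is a simple windowed mean comparison standing in for Modified Varri segmentation.

```python
from statistics import mean

# Standard 5-point quadratic Savitzky-Golay smoothing coefficients.
SG5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol5(signal):
    """Smooth a read-depth signal with a 5-point quadratic Savitzky-Golay
    filter. The two bins at each edge are left unchanged."""
    out = list(signal)
    for i in range(2, len(signal) - 2):
        out[i] = sum(c * signal[i + k - 2] for k, c in enumerate(SG5))
    return out


def change_points(signal, window=5, min_jump=0.5):
    """Hypothetical stand-in for a segmentation method: score each bin by the
    absolute difference between the mean of the next `window` bins and the
    mean of the previous `window` bins, then keep local maxima of that score
    that reach `min_jump`."""
    n = len(signal)
    scores = [0.0] * n
    for i in range(window, n - window + 1):
        left = mean(signal[i - window:i])
        right = mean(signal[i:i + window])
        scores[i] = abs(right - left)
    points = []
    for i in range(window, n - window + 1):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        if scores[i] >= min_jump and scores[i] == max(scores[lo:hi]):
            points.append(i)
    return points


# Toy read-depth profile: 20 bins at copy number 2, then 20 bins at copy number 3.
rd = [2.0] * 20 + [3.0] * 20
breaks = change_points(savgol5(rd))
```

On this toy profile the single copy-number step at bin 20 is the only change point reported. Real RD signals are noisy and overdispersed, so the window and threshold would need tuning to the coverage.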


Identification and utilization of copy number information for correcting Hi-C contact map of cancer cell lines

Khalil, Muzaki, Chattopadhyay et al. 2020, BMC Bioinformatics


Background: Hi-C and its variant techniques have been developed to capture the spatial organization of chromatin. Normalization of Hi-C contact maps is essential for accurate modeling and interpretation of high-throughput chromatin conformation capture (3C) experiments. Hi-C correction tools were originally developed to normalize systematic biases of karyotypically normal cell lines. However, a vast majority of available Hi-C datasets are derived from cancer cell lines that carry multi-level DNA copy number variations (CNVs). CNV regions display over- or under-representation of interaction frequencies compared to CN-neutral regions. Therefore, it is necessary to remove CNV-driven bias from chromatin interaction data of cancer cell lines to generate a euploid-equivalent contact map. Results: We developed the HiCNAtra framework to compute high-resolution CNV profiles from Hi-C or 3C-seq data of cancer cell lines and to correct chromatin contact maps for systematic biases, including CNV-associated bias. First, we introduce a novel ‘entire-fragment’ counting method for better estimation of the read depth (RD) signal from Hi-C reads that recapitulates the whole-genome sequencing (WGS)-derived coverage signal. Second, HiCNAtra employs a multimodal-based hierarchical CNV calling approach, which outperformed the OneD and HiNT tools, to accurately identify CNVs of cancer cell lines. Third, incorporating CNV information with other systematic biases, HiCNAtra simultaneously estimates the contribution of each bias and explicitly corrects the interaction matrix using Poisson regression. HiCNAtra normalization abolishes CNV-induced artifacts from the contact map, generating a heatmap with a homogeneous signal. When benchmarked against the OneD, CAIC, and ICE methods using the MCF7 cancer cell line, the HiCNAtra-corrected heatmap achieves the least 1D signal variation without deforming the inherent chromatin interaction signal. Additionally, HiCNAtra-corrected contact frequencies have minimum correlations with each of the systematic bias sources compared to OneD’s explicit method. Visual inspection of CNV profiles and contact maps of cancer cell lines reveals that HiCNAtra is the most robust Hi-C correction tool for ameliorating CNV-induced bias. Conclusions: HiCNAtra is a Hi-C-based computational tool that provides an analytical and visualization framework for DNA copy number profiling and chromatin contact map correction of karyotypically abnormal cell lines. HiCNAtra is open-source software implemented in MATLAB and is available at https://github.com/AISKhalil/HiCNAtra.
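The idea of explicitly correcting contact counts with Poisson regression can be sketched as follows. This is a simplified illustration, not HiCNAtra's model: it assumes a single covariate (the log of the copy-number product of the two interacting bins, scaled so that a diploid pair gives zero), whereas the abstract describes estimating several bias sources jointly. The names `cn_covariate`, `fit_poisson` and `remove_cn_bias` are hypothetical.

```python
import math


def cn_covariate(cn_i, cn_j):
    """Assumed covariate: log copy-number product relative to a diploid pair."""
    return math.log(cn_i * cn_j / 4.0)


def fit_poisson(xs, ys, iterations=50):
    """Fit log(mu) = b0 + b1 * x by Newton's method on the Poisson
    log-likelihood. Requires at least two distinct values in xs."""
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0
    for _ in range(iterations):
        mus = [math.exp(b0 + b1 * x) for x in xs]
        # Gradient of the log-likelihood with respect to (b0, b1).
        g0 = sum(y - m for y, m in zip(ys, mus))
        g1 = sum(x * (y - m) for x, y, m in zip(xs, ys, mus))
        # Entries of the negated Hessian (a symmetric 2x2 matrix).
        h00 = sum(mus)
        h01 = sum(x * m for x, m in zip(xs, mus))
        h11 = sum(x * x * m for x, m in zip(xs, mus))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 linear system in closed form.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def remove_cn_bias(count, cn_i, cn_j, b1):
    """Divide out the fitted copy-number effect from one contact count."""
    return count / math.exp(b1 * cn_covariate(cn_i, cn_j))


# Toy data: contact counts that grow with the copy number of both bins.
pairs = [(2, 2), (2, 3), (3, 3), (2, 4), (4, 4)]
xs = [cn_covariate(a, b) for a, b in pairs]
ys = [10.0 * math.exp(0.8 * x) for x in xs]
b0, b1 = fit_poisson(xs, ys)
corrected = [remove_cn_bias(y, a, b, b1) for y, (a, b) in zip(ys, pairs)]
```

In this noise-free toy case the fit recovers the coefficient used to generate the data, and every corrected count comes back to the diploid baseline, which is the sense in which the corrected map is euploid-equivalent.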


Hierarchical Discovery of Large-scale and Focal Copy Number Alterations in Low-coverage Cancer Genomes

Khalil, Khyriem, Chattopadhyay et al. 2019, Preprint


Motivation: Detection of copy number alterations (CNAs) is critical to understanding genetic diversity, genome evolution and pathological conditions such as cancer. Cancer genomes are plagued with widespread multi-level structural aberrations of chromosomes, which make it challenging to discover CNAs of different length scales with distinct biological origin and function. Although several tools are available to identify CNAs using read depth (RD) of coverage, they fail to distinguish between large-scale and focal alterations because they model the RD signal of cancer genomes inaccurately. These tools are also affected by RD signal variations, pronounced in low-coverage data, which significantly inflate false detection of change points and lead to inaccurate CNA calling. Results: We have developed CNAtra to hierarchically discover and classify ‘large-scale’ and ‘focal’ copy number gain/loss from whole-genome sequencing (WGS) data. CNAtra provides an analytical and visualization framework for CNV profiling using a single sequencing sample. CNAtra first utilizes a multimodal distribution to estimate the copy number (CN) reference from the complex RD profile of the cancer genome. We utilized a Savitzky-Golay filter and Modified Varri segmentation to capture the change points. We then developed a CN state-driven merging algorithm to identify the large segments with distinct copy number. Next, focal alterations were identified in each large segment using coverage-based thresholding to mitigate the adverse effects of signal variations. We tested CNAtra calls using experimentally verified segmental aneuploidies and focal alterations, which confirmed CNAtra’s ability to detect and distinguish the two alteration phenomena. We used realistic simulated data, in which we artificially spiked CNAs into the original cancer profiles, to benchmark the performance of CNAtra against other detection tools. We found that CNAtra is superior in terms of precision, recall, and f-measure. CNAtra shows the highest sensitivity, 93% and 97%, for detecting focal and large-scale alterations respectively. Visual inspection of CNAs showed that CNAtra is the most robust detection tool for low-coverage cancer data. Availability and implementation: CNAtra is open-source software implemented in MATLAB, and is available at https://github.com/AISKhalil/CNAtra


Identification and Utilization of Copy Number Information for Correcting Hi-C Contact Map of Cancer Cell Line

Khalil, Muzaki, Chattopadhyay et al. 2019, Preprint


Motivation: Hi-C and its variant techniques have been developed to capture the spatial organization of chromatin. Normalization of Hi-C contact maps is essential for accurate modeling and interpretation of genome-wide chromatin conformation. Most Hi-C correction methods were originally developed for normal cell lines and mainly target systematic biases. In contrast, cancer genomes carry multi-level copy number variations (CNVs). Copy number influences interaction frequency between genomic loci. Therefore, CNV-driven bias needs to be corrected to generate euploid-equivalent chromatin contact maps. Results: We developed the HiCNAtra framework, which extracts the read depth (RD) signal from Hi-C or 3C-seq reads to generate a high-resolution CNV profile and uses this information to correct the contact map. We proposed "entire restriction fragment" counting for better estimation of the RD signal and generation of CNV profiles. HiCNAtra integrates CNV information along with other systematic biases to explicitly correct the interaction matrix using a Poisson regression model. We demonstrated that the RD estimation of HiCNAtra recapitulates the whole-genome sequencing (WGS)-derived coverage signal of the same cell line. Benchmarking against the OneD method (the only explicit method to target CNV bias) showed that HiCNAtra fared better in eliminating the impact of CNV on the contact maps. Availability and implementation: HiCNAtra is open-source software implemented in MATLAB and is available at https://github.com/AISKhalil/HiCNAtra.

