The integration of genomics and proteomics data (proteogenomics) holds the promise of furthering the in-depth understanding of human disease. However, sample mix-up is a pervasive problem in proteogenomics because of the complexity of sample processing. Here, we present a pipeline for Sample Matching in Proteogenomics (SMAP) to verify sample identity and ensure data integrity. SMAP infers sample-dependent protein-coding variants from quantitative mass spectrometry (MS) and aligns the MS-based proteomic samples with genomic samples using two discriminant scores. Theoretical analysis with simulated data indicates that SMAP is capable of uniquely matching proteomic and genomic samples when ≥20% of the genotypes of individual samples are available. When SMAP was applied to a large-scale dataset generated by the PsychENCODE BrainGVEX project, 54 samples (19%) were corrected. The correction was further confirmed by ribosome profiling and chromatin sequencing (ATAC-seq) data from the same set of samples. Our results demonstrate that SMAP is an effective tool for sample verification in a large-scale MS-based proteogenomics study. SMAP is publicly available at https://github.com/UND-Wanglab/SMAP, and a web-based version can be accessed at https://smap.shinyapps.io/smap/.
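The matching step described above can be illustrated with a minimal sketch. It is not SMAP's implementation: the abstract does not define the two discriminant scores, so the concordance fraction and the best-versus-runner-up margin used here are stand-in assumptions, and the names `concordance` and `match_samples` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Variant ID -> alternate allele count (0, 1 or 2). This encoding is an assumption.
Genotype = Dict[str, int]


def concordance(inferred: Genotype, reference: Genotype) -> Optional[float]:
    """Fraction of shared variants whose genotype calls agree (None if no overlap)."""
    shared = [v for v in inferred if v in reference]
    if not shared:
        return None
    agree = sum(1 for v in shared if inferred[v] == reference[v])
    return agree / len(shared)


def match_samples(
    proteomic: Dict[str, Genotype],
    genomic: Dict[str, Genotype],
) -> Dict[str, Tuple[str, float, float]]:
    """For each proteomic sample, return (best genomic ID, best score, margin).

    The margin is the best score minus the runner-up score; a small margin
    suggests the match is ambiguous. Both scores are illustrative only.
    """
    results: Dict[str, Tuple[str, float, float]] = {}
    for pid, inferred in proteomic.items():
        scored = []
        for gid, reference in genomic.items():
            score = concordance(inferred, reference)
            if score is not None:
                scored.append((score, gid))
        if not scored:
            continue  # no genotyped variants in common with any genomic sample
        # Highest score first; ties broken by genomic sample ID for determinism.
        scored.sort(key=lambda item: (-item[0], item[1]))
        best_score, best_id = scored[0]
        runner_up = scored[1][0] if len(scored) > 1 else 0.0
        results[pid] = (best_id, best_score, best_score - runner_up)
    return results


if __name__ == "__main__":
    proteomic = {
        "P1": {"v1": 1, "v2": 0, "v3": 2},
        "P2": {"v1": 0, "v2": 2, "v3": 0},
    }
    genomic = {
        "G1": {"v1": 0, "v2": 2, "v3": 0},
        "G2": {"v1": 1, "v2": 0, "v3": 2},
    }
    for pid, (gid, score, margin) in match_samples(proteomic, genomic).items():
        print(pid, gid, score, margin)
```

In this toy input the labels are deliberately swapped, so a proteomic sample whose best match differs from its recorded genomic label would be flagged as a candidate mix-up.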
SP1 binding in SV40 chromatin in vitro and in vivo was characterized in order to better understand its role during the initiation of early transcription. We observed that chromatin from disrupted virions, but not minichromosomes, was efficiently bound by HIS-tagged SP1 in vitro, whereas endogenous SP1 bound in vivo was present in minichromosomes but not in chromatin from virions. Using ChIP-Seq to compare the location of SP1 to nucleosomes carrying modified histones, we found that SP1 could occupy its whole binding site in virion chromatin but, because of overlapping nucleosomes, only the early side of its binding site in most of the minichromosomes carrying modified histones. The results suggest that during the initiation of an SV40 infection, SP1 binds to an open region in SV40 virion chromatin but quickly triggers chromatin reorganization and its own removal by a hit-and-run mechanism.
We have recently shown that at late times in an SV40 infection a nucleosome containing modified histones is positioned over the enhancer region consistent with repression of early transcription and activation of late transcription. However, this nucleosome appears to slide toward the late region during virion formation consistent with a role in the activation of early transcription and repression of late transcription in a subsequent infection. Because the transcription factor SP1 plays a major role in regulating SV40 early and late transcription and its binding should be sensitive to nucleosome positioning, we measured the ability of HIS‐tagged SP1 to bind to various forms of SV40 chromatin in vitro and compared this to the presence of SP1 in SV40 chromatin derived in vivo. We found that HIS‐tagged SP1 bound well to chromatin from virions and very poorly to other forms of SV40 chromatin in vitro, while cellular SP1 was associated with minichromosomes isolated from infected cells and was not found in chromatin from virions. In order to determine whether the location of nucleosomes and associated histone modifications plays a role in the activation and continuation of early transcription during the initiation of an infection, we have mapped the location of RNAPII and nucleosomes containing a number of histone modifications in chromatin from virions and minichromosomes isolated 30 minutes and 2 hours post‐infection using ChIP‐Seq. During the first two hours of infection we found that RNAPII first associates with the miRNA site in the SV40 genome (at 30 minutes post‐infection) and then shifts to the early regulatory region around the enhancer (2 hours post‐infection). We next compared the location of RNAPII at the two time points to the location of nucleosomes carrying one of nine different specific histone modifications in the chromatin from virions, and minichromosomes isolated at these times. 
Chromatin from virions was used because it is the substrate for the activation of transcription, while chromatin from the minichromosomes would indicate any changes occurring as a consequence of transcriptional activation or extension. Based upon this rationale, we identified acetylated H3 as the most likely histone modification in SV40 predicting subsequent binding by RNAPII. We also identified acetylated H4 and the shifting of nucleosomes away from the RNAPII binding site as changes which result from activation of transcription. Together these results indicate that positioned nucleosomes carrying certain histone modifications may control accessibility of transcription factors and also serve as marks for either the binding of RNAPII during initiation of transcription or subsequent re‐initiation during active transcription.
Support or Funding Information: NIH 1R03AI127969‐01; NIH 1R03AI142011‐01
The location of nucleosomes in chromatin significantly impacts many biological processes including DNA replication, repair and gene expression. A number of techniques have been developed for mapping nucleosome locations in chromatin including MN-Seq (micrococcal nuclease digestion followed by next generation sequencing), ATAC-Seq (tagmentation-based chromatin fragmentation followed by next generation sequencing), and ChIP-Seq (chromatin immunoprecipitation and fragmentation followed by next generation sequencing). All of these techniques have been used successfully, but each has its own limitations. Recently, New England Biolabs has marketed a new kit, the NEBNext Ultra II FS Library Prep kit, for preparing libraries for next generation sequencing from purified genomic DNA. This kit is based on a novel proprietary DNA fragmentation procedure which appears to cleave DNA that is not bound by proteins. Because DNA is fragmented directly in the FS kit, we tested whether the kit might also be useful for mapping the location of nucleosomes in chromatin. Using Simian Virus 40 (SV40) chromatin isolated at different times in an infection, we have compared nucleosome mapping using the NEB FS kit (FS-Seq) to MN-Seq, ATAC-Seq, and ChIP-Seq. Mapping nucleosomes using FS-Seq generated nucleosome profiles similar to those generated by ATAC-Seq and ChIP-Seq in regulatory regions of the SV40 genome. We conclude that FS-Seq is a simple, robust, cost-effective procedure for mapping nucleosomes in SV40 chromatin that should be useful for other forms of chromatin as well. We also present evidence that the FS kit may be useful for mapping the location of transcription factors in chromatin when sequencing reads between 75 and 99 base pairs in size are analyzed.
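The size-selection idea in the last sentence can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis pipeline: fragments are taken to be already-aligned `(start, end)` intervals (0-based, end-exclusive) on a linear coordinate system, wrap-around on a circular genome is not handled, and the function names `select_by_length` and `coverage` are hypothetical.

```python
from typing import Iterable, List, Tuple

Fragment = Tuple[int, int]  # (start, end), 0-based, end-exclusive; assumed layout


def select_by_length(
    fragments: Iterable[Fragment], min_len: int = 75, max_len: int = 99
) -> List[Fragment]:
    """Keep fragments whose length falls within [min_len, max_len] base pairs."""
    return [(s, e) for s, e in fragments if min_len <= e - s <= max_len]


def coverage(fragments: Iterable[Fragment], genome_length: int) -> List[int]:
    """Per-base read depth from fragments, computed with a difference array.

    Assumes 0 <= start < end <= genome_length for every fragment.
    """
    diff = [0] * (genome_length + 1)
    for start, end in fragments:
        diff[start] += 1  # depth rises at the first covered base
        diff[end] -= 1    # and falls just past the last covered base
    depth: List[int] = []
    running = 0
    for i in range(genome_length):
        running += diff[i]
        depth.append(running)
    return depth


if __name__ == "__main__":
    fragments = [(0, 80), (10, 160), (50, 149), (100, 175)]
    short = select_by_length(fragments)  # drops the 150 bp fragment
    profile = coverage(short, genome_length=200)
    print(short)
    print(max(profile))
```

Peaks in the depth profile built from the 75 to 99 bp fragments would then mark candidate protected sites, while the longer fragments could be profiled separately for nucleosome-sized footprints.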