Germ cells are vital for transmitting genetic information from one generation to the next and for maintaining the continuation of species. Here, we analyze the transcriptome of human primordial germ cells (PGCs) from the migrating stage to the gonadal stage at single-cell and single-base resolutions. Human PGCs show unique transcription patterns involving the simultaneous expression of both pluripotency genes and germline-specific genes, with a subset of them displaying developmental-stage-specific features. Furthermore, we analyze the DNA methylome of human PGCs and find global demethylation of their genomes. Approximately 10 to 11 weeks after gestation, the PGCs are nearly devoid of any DNA methylation, with only 7.8% and 6.0% of the median methylation levels in male and female PGCs, respectively. Our work paves the way toward deciphering the complex epigenetic reprogramming of the germline with the aim of restoring totipotency in fertilized oocytes.
The protein design problem is to identify an amino acid sequence that folds to a desired structure. Given Anfinsen’s thermodynamic hypothesis of folding, this can be recast as finding an amino acid sequence for which the desired structure is the lowest energy state. As this calculation involves not only all possible amino acid sequences but also all possible structures, most current approaches focus instead on the more tractable problem of finding the lowest-energy amino acid sequence for the desired structure. They often check by protein structure prediction in a second step that the desired structure is indeed the lowest-energy conformation for the designed sequence, and typically discard a large fraction of designed sequences for which this is not the case. Here, we show that by backpropagating gradients through the transform-restrained Rosetta (trRosetta) structure prediction network from the desired structure to the input amino acid sequence, we can directly optimize over all possible amino acid sequences and all possible structures in a single calculation. We find that trRosetta calculations, which consider the full conformational landscape, can be more effective than Rosetta single-point energy estimations in predicting folding and stability of de novo designed proteins. We compare sequence design by conformational landscape optimization with the standard energy-based sequence design methodology in Rosetta and show that the former can result in energy landscapes with fewer alternative energy minima. We show further that more funneled energy landscapes can be designed by combining the strengths of the two approaches: the low-resolution trRosetta model serves to disfavor alternative states, and the high-resolution Rosetta model serves to create a deep energy minimum at the design target structure.
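The idea of backpropagating from a desired structure to the input sequence can be illustrated with a toy sketch. This is not trRosetta: the predictor below is a hypothetical stand-in (a fixed random residue-pair matrix `w` passed through a sigmoid), and the names `predict_contacts`, `loss_and_grad`, and `design_sequence` are assumptions made for illustration only. What it shows is the general pattern: relax the sequence to per-position logits, push it through a differentiable predictor, compare the prediction with the target, and follow the gradient back to the logits.

```python
import numpy as np

def softmax(z):
    # Row-wise softmax: one probability distribution over residue types per position.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def predict_contacts(x, w):
    # Toy differentiable "structure predictor" (NOT trRosetta):
    # contact probability between positions i and j is sigmoid(x_i^T w x_j).
    return 1.0 / (1.0 + np.exp(-(x @ w @ x.T)))

def loss_and_grad(z, w, target):
    # Mean binary cross-entropy between predicted and target contact maps,
    # plus its gradient with respect to the sequence logits z.
    x = softmax(z)
    p = predict_contacts(x, w)
    p_safe = np.clip(p, 1e-9, 1 - 1e-9)
    loss = -np.mean(target * np.log(p_safe) + (1 - target) * np.log(1 - p_safe))
    g = (p - target) / target.size              # dLoss / d(pre-sigmoid scores)
    dx = g @ x @ w.T + g.T @ x @ w              # back through x w x^T
    dz = x * (dx - np.sum(dx * x, axis=1, keepdims=True))  # back through softmax
    return loss, dz

def design_sequence(target, w, steps=1000, lr=2.0, seed=0):
    # Plain gradient descent on the logits; the argmax gives a discrete sequence.
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((target.shape[0], w.shape[0]))
    losses = []
    for _ in range(steps):
        loss, dz = loss_and_grad(z, w, target)
        losses.append(loss)
        z -= lr * dz
    return z.argmax(axis=1), losses

# Hypothetical setup: 20 residue types, a 12-residue target defined by a hidden sequence.
rng = np.random.default_rng(1)
n_aa = 20
a = rng.standard_normal((n_aa, n_aa))
w = a + a.T                                     # symmetric toy pair matrix
hidden = rng.integers(0, n_aa, size=12)
target = (w[np.ix_(hidden, hidden)] > 0).astype(float)

designed, losses = design_sequence(target, w)
print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
print("designed residue indices:", designed)
```

In a real setting the toy predictor would be replaced by a trained network and the gradient would come from automatic differentiation rather than the hand-derived expressions used here.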
The high-order chromatin structure plays a non-negligible role in gene regulation. However, the mechanism by which varied chromatin structures form in different cells, and especially its sequence dependence, remains to be elucidated. As the nucleotide distributions in human and mouse genomes are highly uneven, we identified CGI (CpG island) forest and prairie genomic domains based on CGI densities of a species, dividing the genome into two sequentially, epigenetically, and transcriptionally distinct regions. These two megabase-sized domains also spatially segregate to different extents in different cell types. Forests and prairies show enhanced segregation from each other in development, differentiation, and senescence; meanwhile, the multi-scale forest-prairie spatial intermingling is cell-type specific and increases in differentiation, helping to define cell identity. We propose that the phase separation of the 1D mosaic sequence in space serves as a potential driving force and, together with cell-type-specific epigenetic marks and transcription factors, shapes the chromatin structure in different cell types. The mosaicity in the genomes of different species in terms of forests and prairies could relate to observations in their biological processes, such as development and aging. In this way, we provide a bottom-up theory to explain the chromatin structural and epigenetic changes in different processes.
The high-order chromatin structure plays a non-negligible role in gene regulation. However, the mechanism for the formation of different chromatin structures in different cells and the sequence dependence of this process remain to be elucidated. As the nucleotide distributions in human and mouse genomes are highly uneven, we identified CGI forest and prairie genomic domains based on CGI density, which better segregates genomic elements along the genome than GC content. The genome is then divided into two sequentially, epigenetically, and transcriptionally distinct regions. These two types of megabase-sized domains spatially segregate, but to a different extent in different cell types. Overall, the forests and prairies gradually segregate from each other in development, differentiation, and senescence. The multi-scale forest-prairie spatial intermingling is cell-type specific and increases in differentiation, thus helping to define cell identity. We propose that the phase separation of the 1D mosaic sequence in space, serving as a potential driving force, together with cell-type-specific epigenetic marks and transcription factors, shapes the chromatin structure in different cell types and gives them distinct genomic properties. The mosaicity of the genome, manifested as alternating forests and prairies in a species, could be related to its biological processes such as differentiation, aging and body temperature control. (bioRxiv preprint, not peer-reviewed, first posted online Jan. 28, 2018; doi: http://dx.doi.org/10.1101/255174)