SUMMARY Soybean (Glycine max [L.] Merr.) is a major crop in animal feed and human nutrition, mainly for its rich protein and oil contents. The remarkable rise in soybean transcriptome studies over the past 5 years generated an enormous amount of RNA‐seq data, encompassing various tissues, developmental conditions and genotypes. In this study, we have collected data from 1298 publicly available soybean transcriptome samples, processed the raw sequencing reads and mapped them to the soybean reference genome in a systematic fashion. We found that 94% of the annotated genes (52 737/56 044) had detectable expression in at least one sample. Unsupervised clustering revealed three major groups, comprising samples from aerial, underground and seed/seed‐related parts. We found 452 genes with uniform and constant expression levels, supporting their roles as housekeeping genes. On the other hand, 1349 genes showed heavily biased expression patterns towards particular tissues. A transcript‐level analysis revealed that 95% (70 963 of 74 490) of the assembled transcripts have intron chains exactly matching those from known transcripts, whereas 3256 assembled transcripts represent potentially novel splicing isoforms. The dataset compiled here constitute a new resource for the community, which can be downloaded or accessed through a user‐friendly web interface at http://venanciogroup.uenf.br/resources/. This comprehensive transcriptome atlas will likely accelerate research on soybean genetics and genomics.
X-chromosome inactivation (XCI) is the epigenetic transcriptional silencing of an X-chromosome during the early stages of embryonic development in female eutherian mammals. XCI assures monoallelic expression in each cell and compensation for dosage-sensitive X-linked genes between females (XX) and males (XY). DNA methylation at the carbon-5 position of the cytosine pyrimidine ring in the context of a CpG dinucleotide sequence (5meCpG) in promoter regions is a key epigenetic marker for transcriptional gene silencing. Using computational analysis, we revealed an extragenic tandem GAAA repeat 230-bp from the landmark CpG island of the human X-linked retinitis pigmentosa 2 RP2 promoter whose 5meCpG status correlates with XCI. We used this RP2 onshore tandem GAAA repeat to develop an allele-specific 5meCpG-based PCR assay that is highly concordant with the human androgen receptor (AR) exonic tandem CAG repeat-based standard HUMARA assay in discriminating active (Xa) from inactive (Xi) X-chromosomes. The RP2 onshore tandem GAAA repeat contains neutral features that are lacking in the AR disease-linked tandem CAG repeat, is highly polymorphic (heterozygosity rates approximately 0.8) and shows minimal variation in the Xa/Xi ratio. The combined informativeness of RP2/AR is approximately 0.97, and this assay excels at determining the 5meCpG status of alleles at the Xp (RP2) and Xq (AR) chromosome arms in a single reaction. These findings are relevant and directly translatable to nonhuman primate models of XCI in which the AR CAG-repeat is monomorphic. We conducted the RP2 onshore tandem GAAA repeat assay in the naturally occurring chimeric New World monkey marmoset (Callitrichidae) and found it to be informative. The RP2 onshore tandem GAAA repeat will facilitate studies on the variable phenotypic expression of dominant and recessive X-linked diseases, epigenetic changes in twins, the physiology of aging hematopoiesis, the pathogenesis of age-related hematopoietic malignancies and the clonality of cancers in human and nonhuman primates.
Haemophilia A, the most common severe hereditary bleeding disorder in humans, is chiefly caused by mutations in the coagulation factor VIII F8 gene, which maps on chromosome band Xq28. Linkage analysis with F8 intragenic and/or closely located extragenic short tandem repeat (STR) elements is an effective method for indirect tracing of F8 pathogenic allelic variants in at-risk families. STR profiling is currently limited to 14 markers, 11 of which are dinucleotide elements. The aim of this study was to define novel polymorphic STR loci for haemophilia A carrier screening. The combined linkage physical map was restricted to a 2.4-Mb region on Xq28 that hosts 81 annotated genes, F8 inclusive. The inventory was in silico through comparative analyses with three X chromosome reference sequences, using microsatellite mining and validation computer software. Genetic distances for unmapped markers were interpolated on the Rutgers map of the human genome. The effort yielded 94 STR loci: 53 extragenic and 41 intragenic (14 STR elements map on the F8 gene; the other 27 on 19 further genes). The distribution per repeat period size was 61.7% di-, 5.3% tri-, 26.6% tetra- and 6.4% pentanucleotide loci. The success rate of validation of polymorphism for the new STR loci was 56.3%. For STR elements 0.78 Mb equidistant of the F8 gene, the interpolated downstream genetic length is 3.27 times the upstream genetic length. The inventory represents a 5.7-fold increase in polymorphic STR loci useful in carrier detection. Genotyping with the upstream extragenic tetra- and pentanucleotide markers is thus apprized.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.