Sorghum bicolor is a drought-tolerant C4 grass used for the production of grain, forage, sugar, and lignocellulosic biomass and a genetic model for C4 grasses due to its relatively small genome (approximately 800 Mbp), diploid genetics, diverse germplasm, and colinearity with other C4 grass genomes. In this study, deep sequencing, genetic linkage analysis, and transcriptome data were used to produce and annotate a high-quality reference genome sequence. Reference genome sequence order was improved, 29.6 Mbp of additional sequence was incorporated, the number of genes annotated increased 24% to 34,211, average gene length and N50 increased, and error frequency was reduced 10-fold to 1 per 100 kbp. Subtelomeric repeats with characteristics of Tandem Repeats in Miniature (TRIM) elements were identified at the termini of most chromosomes. Nucleosome occupancy predictions identified nucleosomes positioned immediately downstream of transcription start sites and at different densities across chromosomes. Alignment of more than 50 resequenced genomes from diverse sorghum genotypes to the reference genome identified approximately 7.4 M single nucleotide polymorphisms (SNPs) and 1.9 M indels. Large-scale variant features in euchromatin were identified with periodicities of approximately 25 kbp. A transcriptome atlas of gene expression was constructed from 47 RNA-seq profiles of growing and developed tissues of the major plant organs (roots, leaves, stems, panicles, and seed) collected during the juvenile, vegetative and reproductive phases. Analysis of the transcriptome data indicated that tissue type and protein kinase expression had large influences on transcriptional profile clustering. The updated assembly, annotation, and transcriptome data represent a resource for C4 grass research and crop improvement.
Sorghum is an important C4 grass crop grown for grain, forage, sugar, and bioenergy production. While tall, late-flowering landraces are commonly grown in Africa, short, early-flowering varieties were selected in US grain sorghum breeding programs to reduce lodging and to facilitate machine harvesting. Four loci that affect stem length have been identified (Dw1–Dw4). Subsequent research showed that Dw3 encodes an ABCB1 auxin transporter and Dw1 encodes a highly conserved protein involved in the regulation of cell proliferation. In this study, Dw2 was identified by fine-mapping and further confirmed by sequencing the Dw2 alleles in Dwarf Yellow Milo and Double Dwarf Yellow Milo, the progenitor genotypes in which the recessive dw2 allele originated. The Dw2 locus was determined to correspond to Sobic.006G067700, a gene that encodes a protein kinase that is homologous to KIPK, a member of the AGCVIII subgroup of the AGC protein kinase family in Arabidopsis.
Sorghum bicolor is a drought-resilient facultative short-day C4 grass that is grown for grain, forage, and biomass. Adaptation of sorghum for grain production in temperate regions resulted in the selection of mutations in Maturity loci (Ma1–Ma6) that reduced photoperiod sensitivity and resulted in earlier flowering in long days. Prior studies identified the genes associated with Ma1 (PRR37), Ma3 (PHYB), Ma5 (PHYC) and Ma6 (GHD7) and characterized their role in the flowering time regulatory pathway. The current study focused on understanding the function and identity of Ma2. Ma2 delayed flowering in long days by selectively enhancing the expression of SbPRR37 (Ma1) and SbCO, genes that co-repress the expression of SbCN12, a source of florigen. Genetic analysis identified epistatic interactions between Ma2 and Ma4 and located QTL corresponding to Ma2 on SBI02 and Ma4 on SBI10. Positional cloning and whole genome sequencing identified a candidate gene for Ma2, Sobic.002G302700, which encodes a SET and MYND (SYMD) domain lysine methyltransferase. Eight sorghum genotypes previously identified as recessive for Ma2 contained the mutated version of Sobic.002G302700 present in 80M (ma2) and one additional putative recessive ma2 allele was identified in diverse sorghum accessions.
Sorghum bicolor is a drought-resilient facultative short-day C4 grass that is grown for grain, forage, and biomass. Adaptation of sorghum for grain production in temperate regions resulted in the selection of mutations in Maturity loci (Ma1 – Ma6) that reduced photoperiod sensitivity and resulted in earlier flowering in long days. Prior studies identified the genes associated with Ma1 (PRR37), Ma3 (PHYB), Ma5 (PHYC) and Ma6 (GHD7) and characterized their role in the flowering time regulatory pathway. The current study focused on understanding the function and identity of Ma2. Ma2 delayed flowering in long days by selectively enhancing the expression of SbPRR37 (Ma1) and SbCO, genes that co-repress the expression of SbCN12, a source of florigen. Genetic analysis identified epistatic interactions between Ma2 and Ma4 and located QTL corresponding to Ma2 on SBI02 and Ma4 on SBI10. Positional cloning and whole genome sequencing identified a candidate gene for Ma2, Sobic.002G302700, which encodes a SET and MYND (SYMD) domain lysine methyltransferase. Nine sorghum genotypes previously identified as recessive for Ma2 contained the mutated version of Sobic.002G302700 present in 80M (ma2).
Sorghum bicolor is a drought tolerant C4 grass used for production of grain, forage, sugar, and lignocellulosic biomass and a genetic model for C4 grasses due to its relatively small genome (~800 Mbp), diploid genetics, diverse germplasm, and colinearity with other C4 grass genomes. In this study, deep sequencing, genetic linkage analysis, and transcriptome data were used to produce and annotate a high quality reference genome sequence. Reference genome sequence order was improved, 29.6 Mbp of additional sequence was incorporated, the number of genes annotated increased 24% to 34,211, average gene length and N50 increased, and error frequency was reduced 10-fold to 1 per 100 kbp. Sub-telomeric repeats with characteristics of Tandem Repeats In Miniature (TRIM) elements were identified at the termini of most chromosomes. Nucleosome occupancy predictions identified nucleosomes positioned immediately downstream of transcription start sites and at different densities across chromosomes. Alignment of the reference genome sequence to 56 resequenced genomes from diverse sorghum genotypes identified ~7.4M SNPs and 1.8M indels. Large scale variant features in euchromatin were identified with periodicities of ~25 kbp. An RNA transcriptome atlas of gene expression was constructed from 47 samples derived from growing and developed tissues of the major plant organs (roots, leaves, stems, panicles, seed) collected during the juvenile, vegetative and reproductive phases. Analysis of the transcriptome data indicated that tissue type and protein kinase expression had large influences on transcriptional profile clustering. The updated assembly, annotation, and transcriptome data represent a resource for C4 grass research and crop improvement.