Improving end-use quality and disease resistance are important goals in wheat breeding. The genetic loci controlling these traits are highly complex, consisting of large families of prolamin and resistance genes with members present in all three homeologous genomes (A, B, and D) of hexaploid bread wheat. Here, orthologous regions harboring both prolamin and resistance gene loci were reconstructed and compared to understand gene duplication and evolution in different wheat genomes. Comparison of the two orthologous D regions from the hexaploid wheat Chinese Spring and the diploid progenitor Aegilops tauschii revealed considerable differences caused by five large structural variations ranging in size from 100 kb to 2 Mb. As a result, 44% of the Ae. tauschii and 71% of the Chinese Spring sequences in the analyzed regions, including 79 genes, are not shared. Gene rearrangement events, including differential gene duplication and deletion in the A, B, and D regions, have considerably eroded gene collinearity in the analyzed regions, suggesting rapid evolution of the prolamin and resistance gene families after the separation of the three wheat genomes. We hypothesize that this fast evolution is attributable to the co-evolution of the two gene families dispersed within a high-recombination region. The identification of a full set of prolamin genes facilitated transcriptome profiling and revealed that the A genome contributes the least to prolamin expression because it has fewer expressed intact genes and these are expressed at low levels, while the B and D genomes contribute similarly.
Background
Schistosoma japonicum is a parasitic flatworm that causes human schistosomiasis, a significant cause of morbidity in China and the Philippines. A single draft genome was available for S. japonicum, yet this assembly is highly fragmented and covers only 90% of the genome, which makes it difficult to use as a reference for functional genomic analysis and gene discovery.
Findings
In this study, we present a high-quality assembly of the S. japonicum genome, produced by combining 20 Gb (~53X) of long single-molecule real-time sequencing reads with 80 Gb (~213X) of Illumina paired-end reads. The improved genome assembly is approximately 370.5 Mb, with contig and scaffold N50 lengths of 871.9 kb and 1.09 Mb, representing 142.4-fold and 6.2-fold improvements over the released WGS-based assembly, respectively. Additionally, our assembly captured 85.2% complete and 4.6% partial eukaryotic Benchmarking Universal Single-Copy Orthologs. Repetitive elements account for 46.80% of the genome, and 10,089 protein-coding genes were predicted from the improved genome, of which 96.5% have been functionally annotated. Lastly, using the improved assembly, we identified 20 significantly expanded gene families in S. japonicum; these genes were primarily enriched in proteolysis and protein glycosylation functions.
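For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy contig lengths are illustrative assumptions, not data or code from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (arbitrary units), not values from the paper.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)
print(toy_n50)
```

A larger N50 for the same total assembly size means that fewer, longer sequences cover half of the genome, which is why the statistic is used as a measure of assembly contiguity.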
Conclusions
Using a combination of PacBio and Illumina sequencing technologies, we provide an improved, high-quality genome of S. japonicum. This improved genome assembly, together with its annotation, will be useful for comparative genomics of flukes and, more importantly, will facilitate future molecular studies of this important parasite.