Background
Blackgram [Vigna mungo (L.) Hepper] is an important legume crop of Asia with limited genomic resources. We report a comprehensive set of genic simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers developed by Illumina MiSeq sequencing of the transcriptome, and their application in genetic variation analysis and mapping.
Results
Transcriptome sequencing of immature seeds of wild blackgram, V. mungo var. silvestris, by Illumina MiSeq technology generated 1.9 × 10^7 reads, which were assembled into 40,178 transcripts (TCS) with an average length of 446 bp covering 2.97 GB of the genome. A total of 38,753 coding sequences (CDS) were predicted from the 40,178 TCS; of these, 28,984 CDS were annotated through BLASTX and mapped to the GO and KEGG databases, resulting in 140 unique pathways. Among SSRs, tri-nucleotide repeats were the most abundant (39.9%), followed by di-nucleotide repeats (30.2%). About 60.3% and 37.6% of SSR motifs were present in the coding sequences (CDS) and untranslated regions (UTRs), respectively. Among SNPs, transitions (Ts) were the most abundant substitution type (61%), followed by transversions (Tv) (39%), giving a Ts/Tv ratio of 1.58. A total of 2306 differentially expressed genes (DEGs) were identified by RNA-Seq between wild and cultivated blackgram and validated by quantitative reverse transcription polymerase chain reaction. SNPs were genotyped by High Resolution Melting (HRM) assay with a validation rate of 78.87%.
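The transition/transversion split reported above can be illustrated with a minimal Python sketch. It assumes SNPs are available as (reference, alternate) base pairs; the function names `classify_substitution` and `ts_tv_ratio` are hypothetical and are not taken from the study's pipeline. A transition exchanges two purines (A, G) or two pyrimidines (C, T); any other substitution is a transversion.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref, alt):
    """Return "Ts" for a transition and "Tv" for a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    pair = {ref, alt}
    # Both bases in the same chemical class means a transition.
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "Ts" if same_class else "Tv"


def ts_tv_ratio(snps):
    """Compute the Ts/Tv ratio for an iterable of (ref, alt) pairs."""
    snps = list(snps)
    ts = sum(1 for ref, alt in snps if classify_substitution(ref, alt) == "Ts")
    tv = len(snps) - ts
    if tv == 0:
        raise ValueError("no transversions; ratio is undefined")
    return ts / tv
```

Note that the rounded percentages alone (61/39) give a ratio of about 1.56; the reported 1.58 presumably reflects the unrounded SNP counts.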
Conclusion
In the present study, 1621 genic-SSR and 1844 SNP markers were developed from the immature seed transcriptome sequence of blackgram, and 31 genic-SSR markers were used to study genetic variation among different blackgram accessions. The markers developed here enrich the available genomic resources for blackgram and will aid breeding programmes.
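As an illustration of how di- and tri-nucleotide SSR motifs can be located in transcript sequences, the following is a minimal regex-based sketch in Python. It is not the tool used in the study; the function name `find_ssrs` is hypothetical, and the minimum repeat counts (6 for di-nucleotides, 5 for tri-nucleotides) are assumed thresholds chosen only for the example.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem repeats.
MIN_REPEATS = {2: 6, 3: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # A motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # homopolymer run, not a true di/tri-nucleotide SSR
            repeats = len(match.group(0)) // unit_len
            hits.append((match.start(), motif, repeats))
    return sorted(hits)


example = "TTC" + "AG" * 7 + "CCT" + "GAA" * 5 + "C"
print(find_ssrs(example))  # [(3, 'AG', 7), (20, 'GAA', 5)]
```

Start positions are 0-based. Compound or imperfect SSRs are outside the scope of this sketch.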
Electronic supplementary material
The online version of this article (10.1186/s12870-019-1954-0) contains supplementary material, which is available to authorized users.
Blackgram [Vigna mungo (L.) Hepper] (2n = 2x = 22), an important Asiatic legume crop, is a major source of dietary protein for the predominantly vegetarian population. Here we construct the first draft genome sequence of blackgram, using a hybrid assembly of Illumina reads and third-generation Oxford Nanopore reads. The final de novo whole-genome assembly of blackgram is ~475 Mb (82% of the genome), with a maximum scaffold length of 6.3 Mb and a scaffold N50 of 1.42 Mb. Genome analysis identified 42,115 genes with a mean coding sequence length of 1131 bp. Around 80.6% of the predicted genes were annotated. Nearly half of the assembled sequence is composed of repetitive elements, with retrotransposons as the major transposable elements (47.3% of the genome), whereas DNA transposons made up only 2.29% of the genome. A total of 166,014 SSRs, including 65,180 compound SSRs, were identified, and primer pairs were designed for 34,816 SSRs. Out of the 33,959 proteins, 1659 showed the presence of R-gene related domains. The KIN class was found in the majority of these proteins (905), followed by RLK (239) and RLP (188). The genome sequence of blackgram will facilitate identification of agronomically important genes and accelerate the genetic improvement of blackgram.
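For readers unfamiliar with the scaffold N50 statistic quoted above, the following Python sketch shows the standard definition: the length of the scaffold at which the cumulative length, taken from longest to shortest, first reaches at least half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions, not data from the assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    lengths = sorted(lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Stop at the first scaffold that brings the running sum to >= 50%.
        if 2 * running >= total:
            return length


# Toy example: total = 54, half = 27, and 10 + 9 + 8 = 27 reaches it.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # 8
```

An N50 of 1.42 Mb therefore means that scaffolds of at least 1.42 Mb account for at least half of the assembled sequence.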