The genomes and transcriptomes of hundreds of insects have been sequenced. However, insect community lacks an integrated, up-to-date collection of insect gene data. Here, we introduce the first release of InsectBase, available online at http://www.insect-genome.com. The database encompasses 138 insect genomes, 116 insect transcriptomes, 61 insect gene sets, 36 gene families of 60 insects, 7544 miRNAs of 69 insects, 96 925 piRNAs of Drosophila melanogaster and Chilo suppressalis, 2439 lncRNA of Nilaparvata lugens, 22 536 pathways of 78 insects, 678 881 untranslated regions (UTR) of 84 insects and 160 905 coding sequences (CDS) of 70 insects. This release contains over 12 million sequences and provides search functionality, a BLAST server, GBrowse, insect pathway construction, a Facebook-like network for the insect community (iFacebook), and phylogenetic analysis of selected genes.
BackgroundPiwi-interacting RNAs (piRNAs) are a class of small non-coding RNA primarily expressed in germ cells that can silence transposons at the post-transcriptional level. Accurate prediction of piRNAs remains a significant challenge.ResultsWe developed a program for piRNA annotation (Piano) using piRNA-transposon interaction information. We downloaded 13,848 Drosophila piRNAs and 261,500 Drosophila transposons. The piRNAs were aligned to transposons with a maximum of three mismatches. Then, piRNA-transposon interactions were predicted by RNAplex. Triplet elements combining structure and sequence information were extracted from piRNA-transposon matching/pairing duplexes. A support vector machine (SVM) was used on these triplet elements to classify real and pseudo piRNAs, achieving 95.3 ± 0.33% accuracy and 96.0 ± 0.5% sensitivity. The SVM classifier can be used to correctly predict human, mouse and rat piRNAs, with overall accuracy of 90.6%. We used Piano to predict piRNAs for the rice stem borer, Chilo suppressalis, an important rice insect pest that causes huge yield loss. As a result, 82,639 piRNAs were predicted in C. suppressalis.ConclusionsPiano demonstrates excellent piRNA prediction performance by using both structure and sequence features of transposon-piRNAs interactions. Piano is freely available to the academic community at http://ento.njau.edu.cn/Piano.html.Electronic supplementary materialThe online version of this article (doi:10.1186/s12859-014-0419-6) contains supplementary material, which is available to authorized users.
ChiloDB is an integrated resource that will be of use to the rice stem borer research community. The rice striped stem borer (SSB), Chilo suppressalis Walker, is a major rice pest that causes severe yield losses in most rice-producing countries. A draft genome of this insect is available. The aims of ChiloDB are (i) to store recently acquired genomic sequence and transcriptome data and integrate them with protein-coding genes, microRNAs, piwi-interacting RNAs (piRNAs) and RNA sequencing (RNA-Seq) data and (ii) to provide comprehensive search tools and downloadable data sets for comparative genomics and gene annotation of this important rice pest. ChiloDB contains the first version of the official SSB gene set, comprising 80 479 scaffolds and 10 221 annotated protein-coding genes. Additionally, 262 SSB microRNA genes predicted from a small RNA library, 82 639 piRNAs identified using the piRNApredictor software, 37 040 transcripts from a midgut transcriptome and 69 977 transcripts from a mixed sample have all been integrated into ChiloDB. ChiloDB was constructed using a data structure that is compatible with data resources, which will be incorporated into the database in the future. This resource will serve as a long-term and open-access database for research on the biology, evolution and pest control of SSB. To the best of our knowledge, ChiloDB is one of the first genomic and transcriptome database for rice insect pests.Database URL: http://ento.njau.edu.cn/ChiloDB.
Insects are one of the largest classes of animals on Earth and constitute more than half of all living species. The i5k initiative has begun sequencing of more than 5,000 insect genomes, which should greatly help in exploring insect resource and pest control. Insect genome annotation remains challenging because many insects have high levels of heterozygosity. To improve the quality of insect genome annotation, we developed a pipeline, named Optimized Maker-Based Insect Genome Annotation (OMIGA), to predict protein-coding genes from insect genomes. We first mapped RNA-Seq reads to genomic scaffolds to determine transcribed regions using Bowtie, and the putative transcripts were assembled using Cufflink. We then selected highly reliable transcripts with intact coding sequences to train de novo gene prediction software, including Augustus. The re-trained software was used to predict genes from insect genomes. Exonerate was used to refine gene structure and to determine near exact exon/intron boundary in the genome. Finally, we used the software Maker to integrate data from RNA-Seq, de novo gene prediction, and protein alignment to produce an official gene set. The OMIGA pipeline was used to annotate the draft genome of an important insect pest, Chilo suppressalis, yielding 12,548 genes. Different strategies were compared, which demonstrated that OMIGA had the best performance. In summary, we present a comprehensive pipeline for identifying genes in insect genomes that can be widely used to improve the annotation quality in insects. OMIGA is provided at http://ento.njau.edu.cn/omiga.html .
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.