2018
DOI: 10.1007/978-1-4939-8775-7_6

Using BUSCO to Assess Insect Genomic Resources

Abstract: The increasing affordability of sequencing technologies offers many new and exciting opportunities to address a diverse array of biological questions. This is evidenced in entomological research by numerous genomics and transcriptomics studies that attempt to decipher the often complex relationships amongst different species or orders and to build 'omics' resources to drive advancement of the molecular understanding of insect biology. Being able to gauge the quality of the sequencing data is of critical import…

Cited by 31 publications (27 citation statements)
References 28 publications
“…A corresponding protein set was created using TransDecoder [42] with the default parameters. In order to obtain a unigene set (one amino acid sequence per gene), isoforms were grouped at the gene (g) level, because grouping at the contig (c) level reduced the number of conserved, single-copy genes measured by BUSCO ( [43] ; Figure S2; Table S10). With this protein set, CD-HIT [44] was used in order to remove peptides that are entirely contained within larger proteins with default word size and sequence identity threshold of 1.0.…”
Section: Transcriptome Processing and Annotation
Citation type: mentioning
confidence: 99%
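The redundancy-removal step described in the statement above (dropping peptides that are entirely contained within larger proteins at a sequence identity of 1.0) can be illustrated with a minimal sketch. This is not CD-HIT's algorithm and not the citing authors' code: it is a naive exact-substring filter, and the function name `remove_contained` and the toy sequences are hypothetical, chosen only for illustration.

```python
def remove_contained(peptides):
    """Drop peptides that occur verbatim inside a longer (or identical) kept peptide.

    `peptides` maps a sequence ID to an amino acid string. This is a naive
    exact-substring check, an illustrative stand-in for clustering at 100%
    identity, not a reimplementation of CD-HIT.
    """
    kept = {}
    # Visit the longest sequences first so a shorter peptide is always
    # compared against the longer proteins that could contain it.
    ordered = sorted(peptides.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for name, seq in ordered:
        if any(seq in longer for longer in kept.values()):
            continue  # fully contained in a sequence already kept
        kept[name] = seq
    return kept


# Hypothetical toy input: "b" is a fragment of "a", "c" duplicates "a".
toy = {"a": "MKTAYIAKQR", "b": "TAYIA", "c": "MKTAYIAKQR", "d": "GGGG"}
non_redundant = remove_contained(toy)
print(sorted(non_redundant))
```

The pairwise comparison is quadratic in the number of sequences, so a sketch like this only suits small inputs; dedicated clustering tools exist precisely because real protein sets are too large for it.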
“…Candidates identified in either the D. melanogaster or H. sapiens search were considered as candidate SLCs for that species. As the proteomes of some nonmodel arthropods may contain many fragmented genes, we estimated the completeness of each proteome using BUSCO ( Waterhouse et al. 2019 ).…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…The AalbS3 assembly more than doubled the content of transposable elements and other repetitive sequences (Table S2, File S2). The QV score (Berlin et al 2015, Solarese et al 2018 Zdobnov, 2019, Waterhouse et al, 2019) and scored a 99.0% with 98.2% of genes represented in this genome as complete and single copy and 0.8% as complete and duplicated (Table 1 and Table S3). By mapping Illumina reads from each of the parents to the AalbS3 assembly, the levels of heterozygosity were also calculated for each chromosome, which range from 0.0004 to 0.0011 (Table S4, Figure S1).…”
Section: Assembly of the Anopheles albimanus Genome
Citation type: mentioning
confidence: 99%
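The scores quoted in the statement above follow the usual BUSCO bookkeeping: the complete percentage is the sum of the single-copy and duplicated percentages (98.2% + 0.8% = 99.0%). The sketch below parses a one-line BUSCO-style summary and checks that arithmetic. The C, S and D values come from the quoted statement; the F, M and n values are invented placeholders, and the helper names `parse_busco_summary` and `is_consistent` are hypothetical.

```python
import re

# One-line BUSCO-style summary notation, for example:
# C:99.0%[S:98.2%,D:0.8%],F:0.4%,M:0.6%,n:1000
SUMMARY_RE = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco_summary(line):
    """Return the percentages (C, S, D, F, M) and the BUSCO count n as a dict."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not a BUSCO-style summary line: %r" % line)
    scores = {key: float(match.group(key)) for key in "CSDFM"}
    scores["n"] = int(match.group("n"))
    return scores


def is_consistent(scores, tol=0.11):
    """Check C = S + D and C + F + M = 100, allowing for rounding to one decimal."""
    complete_ok = abs(scores["S"] + scores["D"] - scores["C"]) <= tol
    total_ok = abs(scores["C"] + scores["F"] + scores["M"] - 100.0) <= tol
    return complete_ok and total_ok


# C, S and D are taken from the quoted statement; F, M and n are placeholders.
example = "C:99.0%[S:98.2%,D:0.8%],F:0.4%,M:0.6%,n:1000"
scores = parse_busco_summary(example)
print(scores, is_consistent(scores))
```

The tolerance is there because each reported percentage is rounded to one decimal place, so the sums can be off by about 0.1 without indicating an error.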