2021
DOI: 10.1186/s12859-021-04038-2

ContigExtender: a new approach to improving de novo sequence assembly for viral metagenomics data

Abstract: Background: Metagenomics is the study of microbial genomes for pathogen detection and discovery in human clinical, animal, and environmental samples via Next-Generation Sequencing (NGS). Metagenome de novo sequence assembly is a crucial analytical step in which longer contigs, ideally whole chromosomes/genomes, are formed from shorter NGS reads. However, the contigs generated from the de novo assembly are often very fragmented and rarely longer than a few kilo base pairs (kb). Therefore, a time-…

Cited by 13 publications (13 citation statements)
References 52 publications
“…Additional The terminal inverted repeats of the PgVV-14T were represented as separate fragments in the spades assembly and thus the following strategy was utilized. The PgVV scaffold was trimmed to include only the non-repeated region and extended with ContigExtender using the raw data 55 . The scaffold was trimmed to include a minimal region that would contain the fragments.…”
Section: Pgvv-type Plvs Core Genes To Create a Framework To Uniformly... (mentioning)
confidence: 99%
“…Long viral segments of at least 6000 bp were extracted by searching for MCP genes with the HMM profile for PLVs and virophages 9 and the NCLDV-specific profile VOG01840 from VOGDB (http://vogdb.org/). The extracted scaffolds were extended with Contig Extender v. 0.1 55 and then polished and gap-filled with pilon v. 1.24 74 . (46), Alveolata (11) and Haptophyta (10).…”
Section: Lc-ms (mentioning)
confidence: 99%
“…56 Regarding the second issue, it is often challenging to obtain full-length viral genomes due to the low abundance of viral sequence reads in public HTS data (Box 1). Several approaches would be useful to obtain longer viral sequences: (i) developing more efficient sequencing assembly methods, 48,57 (ii) performing coassembly analysis using combined HTS data from samples considered to be infected with the same viruses, 58,59 or (iii) using long-read HTS data. 60 Second, it is difficult to accurately link virus-host relationships, even if viral sequences are identified in HTS data.…”
Section: Challenges Of Virus Searches Using Public Hts-related Data (mentioning)
confidence: 99%
“…Metagenomic sequences can contain multiple species and sequence types. However, metagenomic assemblies from short reads (~100 to 300 bp) are usually fragmented due to presence of closely related strains, repeated sequences, and shared sequences between bacterial species (8–11). This fragmentation makes the identification of plasmid and bacteriophage DNA from assemblies challenging.…”
Section: Introduction (mentioning)
confidence: 99%