2018
DOI: 10.1099/mgen.0.000224

mlplasmids: a user-friendly tool to predict plasmid- and chromosome-derived sequences for single species

Abstract: Assembly of bacterial short-read whole-genome sequencing data frequently results in hundreds of contigs for which the origin, plasmid or chromosome, is unclear. Complete genomes resolved by long-read sequencing can be used to generate and label short-read contigs. These were used to train several popular machine learning methods to classify the origin of contigs from Enterococcus faecium, Klebsiella pneumoniae and Escherichia coli using pentamer frequencies. We selected support-vector machine (SVM) models as t…
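The abstract names pentamer frequencies as the features used to classify contigs. As a rough illustration of what such a feature vector could look like, here is a minimal Python sketch. The function name `pentamer_frequencies`, the normalisation to relative frequencies, and the choice to skip windows containing ambiguous bases are all assumptions made for this example; they are not taken from the mlplasmids implementation.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"
K = 5  # pentamers, the k-mer size named in the abstract


def pentamer_frequencies(sequence: str, k: int = K) -> dict[str, float]:
    """Return the relative frequency of every k-mer over ACGT in `sequence`.

    Assumptions for this sketch (not confirmed by the source): windows that
    contain any non-ACGT character are skipped, and counts are divided by
    the number of valid windows.
    """
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set(ALPHABET):
            counts[kmer] += 1

    total = sum(counts.values())
    # Emit all 4**k possible k-mers so every contig yields a vector of equal length.
    all_kmers = ("".join(p) for p in product(ALPHABET, repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


if __name__ == "__main__":
    features = pentamer_frequencies("ACGTACGTAC")
    print(len(features))      # number of features per contig
    print(features["ACGTA"])  # relative frequency of one pentamer
```

A fixed-length vector like this (1024 values for k = 5) is the kind of input a classifier such as an SVM expects, which is why every possible pentamer is listed even when its count is zero.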

Cited by 139 publications (135 citation statements)
References 31 publications
“…There was however a major discrepancy in one sample where a ~868 kb region was called as chromosomal in library 3 and a circularised super-plasmid-like component in library 4 (figure 4). As expected, ML plasmids [15] predicted with high confidence (97% probability) this contig was of chromosomal origin. Interestingly this error was fixed after filtering with Filtlong, suggesting it may have arisen from low quality reads.…”
Section: Reusing Flowcells For Similar Isolates (supporting)
confidence: 81%
“…Minimap2 was used to map contigs from long-read to hybrid assemblies. ML plasmids [15] was used as a further arbitrator of the chromosomal/plasmid origin of sequences. Human reads were detected using Centrifuge [16] as part of the Crumpit [17] pipeline.…”
Section: Assembly Comparison (mentioning)
confidence: 99%
“…Contigs were classified as chromosomal or plasmid-derived using mlplasmids given a probability threshold of 60% [22], with further screening for plasmid-related gene content using MARA, CARD and PlasmidFinder (Supplementary Table 2). The largest plasmid was a 156.3 Kb IncFIA one in VREC1073, its sole plasmid.…”
Section: Results (mentioning)
confidence: 99%
“…Assembly graphs were visualized with Bandage [33]. The resulting contigs in each assembly were classified as chromosomal or plasmid using machine learning algorithms implemented in mlplasmids [22].…”
Section: Methods (mentioning)
confidence: 99%