Ayse Ergun scite author profile

Identifying and quantifying the microbial composition of a complex biological or environmental sample is one of the primary challenges in microbiology. Many software tools have been developed to classify metagenomic sequencing reads originating from a mixture of bacterial or viral genomes, and to estimate the microbial abundance profile of the mixture. Unfortunately, the accuracy of these tools degrades significantly when the genomes in the mixture, or in the genomic database in use, share large portions of their content. Here we introduce CAMMiQ, a novel combinatorial solution to the microbial identification and abundance estimation problem, which improves on all available tools with respect to the number of correctly classified reads (i.e., specificity) by an order of magnitude and resolves mixtures of similar genomes, possibly at the strain level. The key contribution of CAMMiQ is its use of arbitrary-length, doubly-unique substrings, i.e., substrings that appear in exactly two genomes in the input database, instead of fixed-length, unique substrings. To resolve the ambiguity in the genomic origin of doubly-unique substrings, CAMMiQ employs a combinatorial optimization formulation, which can be solved surprisingly quickly. CAMMiQ's index consists of a sparsified subset of the shortest unique and doubly-unique substrings of each genome in the database, within a user-specified length range, and as such it is fairly compact. In short, CAMMiQ offers more accurate genomic identification and abundance estimation than the best known k-mer-based and marker-gene-based alternatives while using comparable computational resources. Availability: https://github.com/algo-cancer/CAMMiQ
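To make the distinction between unique and doubly-unique substrings concrete, here is a minimal brute-force sketch in Python. It is not CAMMiQ's implementation: the function names and toy genomes are hypothetical, it enumerates every substring in the length range rather than only the shortest ones, it does no sparsification, and it ignores reverse complements. It only illustrates the definition given in the abstract (a substring is unique if it occurs in exactly one genome of the database, and doubly-unique if it occurs in exactly two).

```python
from collections import defaultdict


def substring_genome_map(genomes, min_len, max_len):
    """Map every substring with length in [min_len, max_len]
    to the set of genome names that contain it (brute force)."""
    owners = defaultdict(set)
    for name, seq in genomes.items():
        for length in range(min_len, max_len + 1):
            for start in range(len(seq) - length + 1):
                owners[seq[start:start + length]].add(name)
    return owners


def unique_and_doubly_unique(genomes, min_len, max_len):
    """Split substrings into those found in exactly one genome
    and those found in exactly two genomes."""
    owners = substring_genome_map(genomes, min_len, max_len)
    unique = {s: next(iter(g)) for s, g in owners.items() if len(g) == 1}
    doubly = {s: tuple(sorted(g)) for s, g in owners.items() if len(g) == 2}
    return unique, doubly


# Toy database (hypothetical sequences, for illustration only).
toy_genomes = {"A": "ACGTAC", "B": "ACGTTG", "C": "TTGCAA"}
unique, doubly = unique_and_doubly_unique(toy_genomes, 3, 3)
print(sorted(unique.items()))
print(sorted(doubly.items()))
```

In this toy database, a read containing `GTA` can only come from genome A, whereas a read containing `TTG` is ambiguous between B and C; the abstract describes resolving that second kind of ambiguity with a combinatorial optimization step, which this sketch does not attempt.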


Strain level microbial detection and quantification with applications to single cell metagenomics

Zhu

Schäffer

Robinson

et al. 2022

Nat Commun


Computational identification and quantification of distinct microbes from high-throughput sequencing data is crucial for our understanding of human health. Existing methods use either accurate but computationally expensive alignment-based approaches or fast but less accurate alignment-free approaches, which often fail to correctly assign reads to genomes. Here we introduce CAMMiQ, a combinatorial optimization framework to identify and quantify distinct genomes (specified by a database) in a metagenomic dataset. As a key methodological innovation, CAMMiQ uses substrings of variable length, including those that appear in two genomes in the database, as opposed to the commonly used fixed-length, unique substrings. These substrings make it possible to accurately decouple mixtures of highly similar genomes, resulting in higher accuracy than the leading alternatives without requiring additional computational resources, as demonstrated on commonly used benchmarking datasets. Importantly, we show that CAMMiQ can distinguish closely related bacterial strains in simulated metagenomic and real single-cell metatranscriptomic data.


A pipeline to detect the relationship between transposable elements and adjacent genes in host genomes

Meguerditchian

Ergun

Decroocq

et al. 2021

Preprint


Understanding the relationship between transposable elements (TEs) and their associated genes in the host genome is key to exploring their potential role in genome evolution. Transposable elements can regulate and affect gene expression not only because of their mobility within the genome but also because of their transcriptional activity. Gene expression can be suppressed, decreased or increased, and cellular signalling pathways can be activated, through the expression of a nearby TE itself or through subsequent TE replication intermediates. We implemented a pipeline capable of revealing the relationship between TEs and adjacent gene distribution in the host genome. Our tool is freely available here: https://github.com/marieBvr/TEs_genes_relationship_pipeline


Pipeline to detect the positional and directional relationship between transposable elements and adjacent genes in host genome

Meguerditchian

Ergun

Decroocq

et al. 2021


Understanding the relationship between transposable elements (TEs) and their closest positional genes in the host genome is key to exploring their potential role in genome evolution. Transposable elements can regulate and affect gene expression not only because of their mobility within the genome but also because of their transcriptional activity. A comprehensive knowledge of the structural organization between transposable elements and neighboring genes is important for studying the functional role of TEs in gene regulation. We implemented a pipeline capable of revealing the positional and directional relationship between TEs and adjacent gene distribution in the host genome. Our tool is freely available here: https://github.com/marieBvr/TEs_genes_relationship_pipeline
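As a rough illustration of what a "positional and directional relationship" between a TE and its closest gene might look like in code, here is a minimal Python sketch. It is not taken from the pipeline linked above: the `Feature` class, the function names, the category labels and the coordinates are all hypothetical, and it assumes 1-based inclusive coordinates with every feature on the same chromosome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    start: int   # 1-based, inclusive (assumed convention)
    end: int
    strand: str  # "+" or "-"


def gap(te, gene):
    """Distance in bp between two intervals; 0 if they overlap."""
    if te.end < gene.start:
        return gene.start - te.end
    if gene.end < te.start:
        return te.start - gene.end
    return 0


def closest_gene(te, genes):
    """Gene with the smallest gap to the TE (genes must be non-empty)."""
    return min(genes, key=lambda g: gap(te, g))


def relate(te, gene):
    """Return (position, orientation, distance) of a TE relative to a gene.

    Position is expressed relative to the gene's direction of transcription.
    """
    distance = gap(te, gene)
    if distance == 0:
        position = "overlap"
    else:
        te_on_left = te.end < gene.start
        # For a "+" gene the 5' side is at lower coordinates; for "-" it is flipped.
        if te_on_left == (gene.strand == "+"):
            position = "upstream"
        else:
            position = "downstream"
    orientation = "same" if te.strand == gene.strand else "opposite"
    return position, orientation, distance


# Hypothetical coordinates, for illustration only.
genes = [Feature("geneA", 100, 300, "-"), Feature("geneB", 800, 1500, "+")]
te = Feature("TE1", 500, 700, "+")
nearest = closest_gene(te, genes)
print(nearest.name, relate(te, nearest))
```

Expressing position relative to the gene's strand, rather than to raw coordinates, keeps "upstream" meaning the 5' side of the gene regardless of which strand it lies on.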


Application of Next Generation Technologies

Rojas

Ergun

Accorsi

et al. 2021

Neuromuscular Disorders


