2015
DOI: 10.1093/bioinformatics/btv395

Inference of Markovian properties of molecular sequences from NGS data and applications to comparative genomics

Abstract: Supplementary data are available at Bioinformatics online.

Cited by 27 publications (47 citation statements)
References 40 publications
“…$d_2^*$ has shown excellent performance in other applications for assessing relatedness of whole genomes or metagenomic samples (22,2628), so perhaps it is not surprising that it was the top performing measure in virus-host prediction. $d_2^*$ along with other measures like $d_2^S$, Hao and Teeling, distinct from simpler measures like Eu and Ma, in that they take into consideration the background oligonucleotide patterns of the two sequences being compared.…”
Section: Discussion (mentioning)
confidence: 99%
“…These alignment-free dissimilarity measures compare two sequences based on the normalized ONFs where the expected ONFs based on a Markov model are removed from the observed ONF (see Materials and Methods section). These measures have shown excellent performance in related applications of sequence analysis—phylogenetic relatedness of genomes and congruence in sample clustering based on analysis of metagenomes and environmental conditions of those samples (22,2628). These sophisticated dissimilarity measures have a potential advantage over simpler ONF measures like Euclidean and Manhattan distances that only use observed ONF patterns.…”
Section: Introduction (mentioning)
confidence: 99%
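
The idea in the statement above (subtract the expected oligonucleotide counts from the observed ones, normalize, then compare) can be sketched as follows. This is a minimal illustration, not the cited authors' implementation. It assumes the standard form of the $d_2^*$ dissimilarity, sequences containing only A, C, G and T, and an order-0 (i.i.d.) background in place of the higher-order Markov models the citing papers discuss. The names `kmer_counts`, `base_frequencies`, `expected_count` and `d2_star` are hypothetical.

```python
from collections import Counter
from itertools import product
from math import sqrt

ALPHABET = "ACGT"  # assumption: sequences contain only these four bases


def kmer_counts(seq, k):
    """Count overlapping k-mers (the observed ONF) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def base_frequencies(seq):
    """Order-0 background: relative frequency of each base in seq."""
    counts = Counter(seq)
    return {b: counts[b] / len(seq) for b in ALPHABET}


def expected_count(word, freqs, n_words):
    """Expected count of `word` under the i.i.d. background model."""
    p = 1.0
    for b in word:
        p *= freqs[b]
    return n_words * p


def d2_star(x, y, k):
    """d2* dissimilarity between sequences x and y for word length k."""
    cx, cy = kmer_counts(x, k), kmer_counts(y, k)
    nx, ny = len(x) - k + 1, len(y) - k + 1
    fx, fy = base_frequencies(x), base_frequencies(y)

    num = sx = sy = 0.0
    for letters in product(ALPHABET, repeat=k):
        w = "".join(letters)
        ex, ey = expected_count(w, fx, nx), expected_count(w, fy, ny)
        if ex == 0.0 or ey == 0.0:
            # Word is impossible under one background model; skip it
            # in all three sums so the result stays within [0, 1].
            continue
        tx, ty = cx[w] - ex, cy[w] - ey  # observed minus expected
        num += tx * ty / sqrt(ex * ey)
        sx += tx * tx / ex
        sy += ty * ty / ey

    if sx == 0.0 or sy == 0.0:
        raise ValueError("observed counts equal expected counts; d2* undefined")
    return 0.5 * (1.0 - num / (sqrt(sx) * sqrt(sy)))
```

Calling `d2_star(seq_a, seq_b, k=2)` returns a value between 0 and 1, with 0 for identical sequences. Simpler measures such as Euclidean or Manhattan distance would compare `kmer_counts` directly and skip the `expected_count` step.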
“…These data have provided a new tool for the genotyping and assessment of genetic resources in non-model species (Davey et al, 2011; Ekblom and Galindo, 2011; Lin et al, 2011). Moreover, the process and analytic power required to handle the huge sequencing data have improved (Aflitos et al, 2015; Ren et al, 2015). Therefore, the development of a reliable and effective molecular marker system from sequencing data has become feasible in many non-model plants (Durand et al, 2010; Yadav et al, 2011).…”
Section: Introduction (mentioning)
confidence: 99%
“…To address this gap, Fixed Order Markov Chains (FOMC) were used to model the background genome sequences, as reported in previous studies (22,23). There are several limitations during the applications of FOMC: (1) The order of Markov Chain (MC) needs to be set manually.…”
mentioning
confidence: 99%
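
Limitation (1) in the statement above can be made concrete with a small sketch of a fixed-order Markov chain fitted by maximum likelihood. It is an illustration only, not the method of any cited study. The function name `fit_fixed_order_markov` is hypothetical, and the `order` argument is exactly the value the caller has to choose by hand.

```python
from collections import Counter, defaultdict


def fit_fixed_order_markov(seq, order):
    """Estimate transition probabilities P(next base | previous `order` bases).

    `order` must be supplied by the caller; nothing here selects it from
    the data, which is the manual step the statement describes.
    """
    if order < 1 or len(seq) <= order:
        raise ValueError("order must be >= 1 and shorter than the sequence")

    # counts[context][base] = times `base` follows `context` in seq
    counts = defaultdict(Counter)
    for i in range(len(seq) - order):
        counts[seq[i:i + order]][seq[i + order]] += 1

    # Normalize each context's counts into a probability distribution.
    return {
        context: {base: c / sum(nxt.values()) for base, c in nxt.items()}
        for context, nxt in counts.items()
    }
```

Fitting the same sequence with `order=1` and `order=2` gives two different background models, so any downstream expected word count depends on that manual choice.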