2019
DOI: 10.1101/838151
Preprint

Revisiting use of DNA characters in taxonomy with MolD - a tree independent algorithm to retrieve diagnostic nucleotide characters from monolocus datasets

Abstract: While DNA characters are increasingly used for phylogenetic inference, taxa delimitation and identification, their use for formal description of taxa (i.e. providing either a formal description or a diagnosis) remains scarce and inconsistent. The impediments are neither nomenclatural, nor conceptual, but rather methodological issues: lack of agreement of what DNA character should be provided, and lack of a suitable operational algorithm to identify such characters. Furthermore, the reluctance of using DNA data…

Citations: cited by 7 publications (7 citation statements)
References: 47 publications
“…Regarding the use of molecular data as diagnostic characters, we adopt the position of Nygren and Pleijel (2011). To this end, that is, to recover diagnostic combinations of nucleotides (DNCs) for pre‐defined groups of DNA sequences, we used the program MolD (Molecular Diagnoses) v. 1.3 (Fedosov et al, 2019). A pure composite nucleotide character, that is, a combination of nucleotides at specific positions in the DNA alignment that are shared by all members of a focus taxon, and by none of the non‐focus taxa members, is termed a primary diagnostic nucleotide combination (pDNC), whereas a diagnostic nucleotide combination, which combines several pDNCs (or characters from several pDNCs) aiming for an increased robustness of a diagnosis, is termed a secondary DNC (sDNC).…”
Section: Methods
Citation type: mentioning
confidence: 99%
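
The pDNC definition quoted above can be made concrete with a small sketch. This is not MolD's own search procedure, which the statements here do not describe; it is a brute-force illustration under stated assumptions: sequences are already aligned and of equal length, positions are 0-indexed, and the function names `is_diagnostic` and `shortest_diagnostic_combinations` are hypothetical.

```python
from itertools import combinations


def is_diagnostic(positions, focus, others):
    """Return True if all focus sequences share one nucleotide pattern at
    `positions` and no non-focus sequence shows that same pattern."""
    patterns = {tuple(seq[p] for p in positions) for seq in focus}
    if len(patterns) != 1:
        return False  # the focus taxon is not uniform at these sites
    pattern = next(iter(patterns))
    return all(tuple(seq[p] for p in positions) != pattern for seq in others)


def shortest_diagnostic_combinations(focus, others, max_size=3):
    """Exhaustively search for the smallest diagnostic site combinations
    (illustrative only; exponential in `max_size`)."""
    length = len(focus[0])
    # Only sites that are invariant within the focus taxon can contribute.
    candidates = [p for p in range(length) if len({s[p] for s in focus}) == 1]
    for size in range(1, max_size + 1):
        hits = [c for c in combinations(candidates, size)
                if is_diagnostic(c, focus, others)]
        if hits:
            return hits
    return []


# Toy alignment: no single site separates the focus taxon, but one pair does.
focus = ["ACGT", "ACGA"]
others = ["ACTT", "AGGT"]
print(shortest_diagnostic_combinations(focus, others))  # [(1, 2)]
```

In the toy alignment each invariant focus site (A, C, G at positions 0, 1, 2) also occurs in some non-focus sequence, so there is no single pure diagnostic site; only the combination of positions 1 and 2 (C and G) is shared by all focus members and by none of the others.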
“…The original Python code also allows the user to define their own set of reference sequences by simply adding to an existing list, and this option will also be included in future versions of the GUI-driven binaries. In addition, we have programmed a GUI for MolD (Fedosov et al 2019), a program that is tailored for recovering DNA-based diagnoses in large DNA dataset, and is capable of identifying diagnostic combinations of nucleotides (DNCs) in addition to single (pure) diagnostic sites. The crucial and unique functionality of MolD allows assembling DNA diagnoses that fulfil pre-defined criteria of reliability, which is achieved by repeatedly scoring diagnostic nucleotide combinations against datasets of in-silico mutated sequences.…”
Section: Diagnosis
Citation type: mentioning
confidence: 99%
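
The reliability check described in the statement above, scoring a nucleotide combination repeatedly against in-silico mutated sequences, might be sketched as follows. The mutation model (independent per-site substitutions at a fixed rate), the default parameter values, and the names `mutate` and `robustness` are assumptions made for illustration; the statement does not specify how MolD generates or scores its mutated datasets.

```python
import random

NUCLEOTIDES = "ACGT"


def is_diagnostic(positions, focus, others):
    """True if the focus sequences share one pattern at `positions`
    that no non-focus sequence shows."""
    patterns = {tuple(seq[p] for p in positions) for seq in focus}
    if len(patterns) != 1:
        return False
    pattern = next(iter(patterns))
    return all(tuple(seq[p] for p in positions) != pattern for seq in others)


def mutate(seq, rate, rng):
    """Substitute each site independently with probability `rate`
    (assumed mutation model, not taken from MolD)."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([n for n in NUCLEOTIDES if n != base]))
        else:
            out.append(base)
    return "".join(out)


def robustness(positions, focus, others, rate=0.01, replicates=100, seed=0):
    """Fraction of mutated replicate datasets in which the combination
    at `positions` is still diagnostic for the focus taxon."""
    rng = random.Random(seed)
    passed = 0
    for _ in range(replicates):
        mutated_focus = [mutate(s, rate, rng) for s in focus]
        mutated_others = [mutate(s, rate, rng) for s in others]
        if is_diagnostic(positions, mutated_focus, mutated_others):
            passed += 1
    return passed / replicates


focus = ["ACGT", "ACGA"]
others = ["ACTT", "AGGT"]
score = robustness((1, 2), focus, others, rate=0.05, replicates=200)
print(f"combination (1, 2) stays diagnostic in {score:.0%} of replicates")
```

A combination whose score falls below a chosen threshold could then be extended with further sites, which is the motivation the first statement gives for secondary DNCs; the threshold itself would be a user choice and is not fixed here.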
“…The diagnosis of new species – rather than its lengthy description – represents the most important part of the alpha-taxonomic process, and in all Nomenclatural Codes, diagnosis can be based on molecular, as well as morphological characters (Renner, 2016). Several software tools have been proposed to extract diagnostic nucleotide positions of clades and species, either phylogeny-based (caos; Sarkar et al 2008) or primarily alignment-based (MolD, Fastachar, DeSignate: Fedosov et al 2019; Merckelbach & Borges 2020; Hütter et al 2020). In order to facilitate the use of such DNA characters in differential diagnoses of new species, we implemented a crucial new tool for DNA taxonomy named dnadiagnoser.…”
Section: Functionalities Implemented In Itaxotools 01
Citation type: mentioning
confidence: 99%
“…To facilitate such comparisons, the program also includes a series of standard reference sequences (such as the full Homo sapiens COI or cox1 gene) and allows as input unaligned sequences, which are then pairwise aligned against the reference sequence to identify diagnostic positions and label them according to their position in the reference sequence, a procedure that works reliably in sets of sequences with no or only few insertions or deletions such as COI. In addition, we have also programmed a GUI for MolD (Fedosov et al 2019), a program that is tailored for recovering DNA-based diagnoses in large DNA dataset, and is capable of identifying diagnostic combinations of nucleotides (DNCs) in addition to single (pure) diagnostic sites. The crucial and unique functionality of MolD allows assembling DNA diagnoses that fulfil pre-defined criteria of reliability, which is achieved by repeatedly scoring diagnostic nucleotide combinations against datasets of in-silico mutated sequences.…”
Section: Functionalities Implemented In Itaxotools 01
Citation type: mentioning
confidence: 99%