2019
DOI: 10.1101/744789
Preprint

Machine learning based imputation techniques for estimating phylogenetic trees from incomplete distance matrices

Abstract: Background: Due to the recent advances in sequencing technologies and species tree estimation methods capable of taking gene tree discordance into account, notable progress has been achieved in constructing large scale phylogenetic trees from genome wide data. However, substantial challenges remain in leveraging this huge amount of molecular data. One of the foremost among these challenges is the need for efficient tools that can handle missing data. Popular distance-based methods such as neighbor joining and …

Citations: cited by 9 publications (13 citation statements)
References: 72 publications (109 reference statements)
“…In such case, the distance d_ij for and is missing. Imputing such missing distances is important to estimate the comprehensive tree [35]. In this research, we proposed an alternative approach for imputing distances based on the eigenvalue of the indefinite inner product matrix.…”
Section: Methods (mentioning)
confidence: 99%
“…Given the backbone tree with its associated and query sequences, the model outputs an embedding of the query and reference species which can be used as input to some distance-based phylogenetic placement tools, which then places the query sequences onto the reference tree. Bhattacharjee et al. [115] addressed the data imputation problem in the incomplete distance matrix using autoencoders. However, the key limitation of these methods is that trees cannot be reliably embedded into a Euclidean space of low dimensions [116].…”
Section: Minor Successes Of DL (mentioning)
confidence: 99%
“…Works using machine learning for phylogenetic tree construction already exists. In [1], they introduced an approach to the case where the distance matrix is incomplete. By using deep architectures, they could eliminate the need for a molecular clock assumption, representing a real-world occurrence of the problem.…”
Section: Related Work (mentioning)
confidence: 99%