2023
DOI: 10.7717/peerj.14779

Complet+: a computationally scalable method to improve completeness of large-scale protein sequence clustering

Abstract: A major challenge for clustering algorithms is to balance the trade-off between homogeneity, i.e., the degree to which an individual cluster includes only related sequences, and completeness, i.e., the degree to which related sequences are kept together rather than broken up into multiple clusters. Most algorithms are conservative in grouping sequences with other sequences. As a result, remote homologs may fail to be clustered together and instead form unnecessarily distinct clusters. The resulting clusters have high homogeneity but completeness that is …
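To make the trade-off concrete, here is a minimal sketch of how the two quantities can be scored. It assumes the common entropy-based definitions of homogeneity and completeness (the V-measure components); the abstract does not state which formulas the authors use, so this is an illustration rather than their method. The family labels and cluster assignments are made-up toy data.

```python
from collections import Counter
from math import log


def _entropy(labels):
    """Shannon entropy H(labels), in nats."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def _conditional_entropy(target, given):
    """Conditional entropy H(target | given), in nats."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum((c / n) * log(c / given_counts[g]) for (g, _), c in joint.items())


def homogeneity(families, clusters):
    """1.0 when every cluster contains members of a single family."""
    h = _entropy(families)
    return 1.0 if h == 0 else 1.0 - _conditional_entropy(families, clusters) / h


def completeness(families, clusters):
    """1.0 when every family is contained in a single cluster."""
    h = _entropy(clusters)
    return 1.0 if h == 0 else 1.0 - _conditional_entropy(clusters, families) / h


# Toy data (hypothetical): four sequences from family A, two from family B.
families = ["A", "A", "A", "A", "B", "B"]

# Conservative clustering: family A is split across two clusters.
fragmented = [0, 0, 1, 1, 2, 2]

# After merging the two family-A clusters.
merged = [0, 0, 0, 0, 2, 2]

print(homogeneity(families, fragmented), completeness(families, fragmented))
print(homogeneity(families, merged), completeness(families, merged))
```

In this toy case the fragmented clustering is perfectly homogeneous but scores well below 1 on completeness, because family A is split in two; merging the two family-A clusters raises completeness to 1 without lowering homogeneity.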

Cited by 0 publications
References 27 publications (44 reference statements)