2022
DOI: 10.1101/2022.03.22.485366
Preprint

Seqrutinator: Non-Functional Homologue Sequence Scrutiny for the Generation of Large Datasets for Protein Superfamily Analysis

Abstract: Background: In recent years, protein bioinformatics has produced much-improved algorithms for multiple sequence alignment (MSA) and phylogeny. Little attention has been paid to sequence selection, even though recently published complete proteomes in particular often contain many sequences that are partial or derive from pseudogenes. Not only do these sequences add noise to the MSA, phylogeny and other downstream computational analyses, they also instigate many errors in the processing of the MSAs and ther…

Cited by 1 publication (1 citation statement)
References 60 publications
“…To avoid the high variability in the N- and C-end regions, the multiple sequence alignment was trimmed in these regions according to the first and last residues corresponding to secondary structure in the reference sequence. Subsequently, the processed datasets were subjected to sequence scrutiny using Seqrutinator [53]. Next, in order to avoid overzealous scrutiny, we applied a recovery strategy.…”
Section: Methods (mentioning)
confidence: 99%
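
The trimming step described in the citation statement above can be illustrated with a minimal Python sketch. It is not the citing authors' code and does not use Seqrutinator. It assumes the MSA is held as a plain dictionary of equal-length aligned strings, and that the first and last secondary-structure residues of the reference are already known as 1-based residue numbers. The function names `residue_to_column` and `trim_alignment` and the toy sequences are hypothetical.

```python
def residue_to_column(aligned_ref: str, residue_index: int, gap_chars: str = "-.") -> int:
    """Return the 0-based alignment column of the residue_index-th (1-based)
    non-gap residue of the aligned reference sequence."""
    seen = 0
    for col, ch in enumerate(aligned_ref):
        if ch not in gap_chars:
            seen += 1
            if seen == residue_index:
                return col
    raise ValueError("reference has fewer residues than residue_index")


def trim_alignment(msa: dict[str, str], ref_id: str,
                   first_res: int, last_res: int) -> dict[str, str]:
    """Trim every aligned sequence to the columns spanned by residues
    first_res..last_res (1-based, inclusive) of the reference sequence."""
    ref = msa[ref_id]
    if any(len(seq) != len(ref) for seq in msa.values()):
        raise ValueError("all aligned sequences must have the same length")
    if first_res > last_res:
        raise ValueError("first_res must not exceed last_res")
    start = residue_to_column(ref, first_res)
    end = residue_to_column(ref, last_res)
    # Slice the same column range out of every sequence, gaps included.
    return {name: seq[start:end + 1] for name, seq in msa.items()}


# Toy alignment (hypothetical data): the reference residues are M K T A Y L.
toy_msa = {
    "ref": "--MKT-AYL--",
    "a":   "GSMKTQAYLPP",
    "b":   "---KT-AY---",
}
# Assume secondary structure spans reference residues 2 (K) to 5 (Y).
trimmed = trim_alignment(toy_msa, "ref", first_res=2, last_res=5)
```

Mapping residue numbers to columns through the aligned reference keeps the trim boundaries correct even when the reference itself contains gaps inside the retained region.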