2022
DOI: 10.1016/j.csbj.2022.07.001

Research progress of reduced amino acid alphabets in protein analysis and prediction

Cited by 13 publications (4 citation statements)
References 96 publications
“…This work demonstrated that the kmer AAR approach can be used as the basis for machine learning models capable of classifying sets of functionally related proteins with little sequence similarity. We concluded that the AAR allows greater flexibility in sequence representation and captures similarity between sequences that would not be detected via conventional methods, and these observations have been borne out in other studies as well (Hauswedell et al, 2014; Liang et al, 2022). Thus, AAR creates simplified representations of proteins in the form of flexible feature vectors, which can then be used to construct machine learning models for functional prediction, explore sequence similarities in newly sequenced datasets (e.g.…”
Section: Introduction | Citation type: supporting
confidence: 65%
“…Additional to sequence- and structure-derived physicochemical properties, the presence/absence in protein sequences of certain k-mers was implemented as variable. To reduce sequence complexity, a reduced 7-letter aminoacid alphabet was used, as previously implemented in various machine learning methods applied to protein sequences 36,78 . A novel alphabet based on amino acid properties was designed, to define k-mers that may represent key motifs in protein physicochemistry (Supplementary Table 4b).…”
Section: Methods | Citation type: mentioning
confidence: 99%
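
The k-mer presence/absence variables described in the statement above can be sketched as follows. This is a minimal illustration, not the method of the quoted study: the seven-group partition in `GROUPS`, the group letters, and the function names `reduce_sequence` and `kmer_presence` are all assumptions, since the study's own alphabet is given only in its supplementary material.

```python
from itertools import product

# Illustrative 7-group partition of the 20 standard amino acids.
# ASSUMPTION: this is not the alphabet designed in the quoted study.
GROUPS = {
    "a": "AGV",
    "b": "ILFP",
    "c": "YMTS",
    "d": "HNQW",
    "e": "RK",
    "f": "DE",
    "g": "C",
}
AA_TO_LETTER = {aa: letter for letter, aas in GROUPS.items() for aa in aas}


def reduce_sequence(seq):
    """Map each residue to its group letter; unknown residues become 'x'."""
    return "".join(AA_TO_LETTER.get(aa, "x") for aa in seq.upper())


def kmer_presence(seq, k=3):
    """Return a 0/1 vector over all 7**k reduced k-mers, in sorted order."""
    reduced = reduce_sequence(seq)
    seen = {reduced[i:i + k] for i in range(len(reduced) - k + 1)}
    vocabulary = ["".join(p) for p in product(sorted(GROUPS), repeat=k)]
    return [1 if kmer in seen else 0 for kmer in vocabulary]
```

With seven letters the feature vector has 7**k entries (343 for k = 3) instead of 20**k (8000 for k = 3), which is the reduction in sequence complexity the statement refers to.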
“…Reduced amino acid alphabets are a common feature of protein aligners (Liang et al 2022). A reduced amino acid alphabet is a small(er) alphabet, where groups of amino acids are each represented by a single letter.…”
Section: Methods | Citation type: mentioning
confidence: 99%
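
The definition in the statement above (groups of amino acids each represented by a single letter) can be illustrated with a short sketch. The grouping in `GROUPS` is hypothetical and chosen only for illustration; real aligners define their own reduced alphabets, and the helper names `reduce_alphabet` and `identity` are likewise assumptions.

```python
# Hypothetical grouping of the 20 standard amino acids into 8 classes.
# ASSUMPTION: for illustration only; not taken from any specific aligner.
GROUPS = ["AVLIMC", "FWY", "ST", "NQ", "DE", "KRH", "G", "P"]

# Each amino acid is represented by the first letter of its group.
REPRESENTATIVE = {aa: group[0] for group in GROUPS for aa in group}


def reduce_alphabet(seq):
    """Rewrite a protein sequence in the reduced alphabet ('X' if unknown)."""
    return "".join(REPRESENTATIVE.get(aa, "X") for aa in seq.upper())


def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


s1, s2 = "MKVLDF", "LRIAEY"
full_identity = identity(s1, s2)
reduced_identity = identity(reduce_alphabet(s1), reduce_alphabet(s2))
```

In this example the two sequences share no identical residue, yet every substitution stays within a group, so they become identical after reduction. This is the kind of similarity a reduced alphabet lets an aligner detect when seeding matches.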