2020
DOI: 10.3389/fbioe.2020.00635

Prediction of G Protein-Coupled Receptors With CTDC Extraction and MRMD2.0 Dimension-Reduction Methods

Abstract: The G Protein-Coupled Receptor (GPCR) family consists of more than 800 different members. In this article, we attempt to use the physicochemical properties of Composition, Transition, Distribution (CTD) to represent GPCRs. The dimensionality reduction method of MRMD2.0 filters the physicochemical properties of GPCR redundancy. Matplotlib plots the coordinates to distinguish GPCRs from other protein sequences. The chart data show a clear distinction effect, and there is a well-defined boundary between the two. …

Citations: cited by 13 publications (8 citation statements)
References: 88 publications
“…The latter two groups represent the significance of global content of different amino acid groups and the order of amino acid residues within Cas12 sequences respectively. The C/T/D descriptors represent the Composition, Transition, and Distribution patterns of a specific physicochemical property in protein sequences 44–46. Accordingly, the 20 amino acids are categorized into three main groups based on seven physicochemical attributes namely hydrophobicity, Van der Waals Volume, polarity, polarizability, charge, secondary structures, and solvent accessibility together accounting for total C/T/D descriptors 47,48. Composition descriptors (CTDC), represent the global composition of the three encoded amino acid groups of a particular property within the complete protein sequence. The top contributing descriptor “normwaalsvolume.G3” in Figure 2b is a CTDC descriptor describing the importance of group-3 normalized Van der Waals Volume residues (M,H,K,F,R,Y,W) having large residual volume in the Cas12 protein sequences. These bulky residues in protein sequences are usually associated with bulkiness of the protein structure which may have implications on the three-dimensional structure and functional sites of the protein.…”
Section: Results (mentioning)
confidence: 99%
“…The C/T/D descriptors represent the Composition, Transition, and Distribution patterns of a specific physicochemical property in protein sequences 44–46. Accordingly, the 20 amino acids are categorized into three main groups based on seven physicochemical attributes namely hydrophobicity, Van der Waals Volume, polarity, polarizability, charge, secondary structures, and solvent accessibility together accounting for total C/T/D descriptors 47,48. Composition descriptors (CTDC), represent the global composition of the three encoded amino acid groups of a particular property within the complete protein sequence.…”
Section: Results (mentioning)
confidence: 99%
“…Considering that there are still more features in 11,045 dimensions, we tried to use MRMD2.0 [36–38] for feature dimension reduction. MRMD2.0 integrates rich feature selection algorithms and feature ranking algorithms and is superior to the single feature selection algorithm.…”
Section: Discussion (mentioning)
confidence: 99%
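
The statement above describes MRMD2.0 as integrating several feature ranking algorithms. The general idea of combining rankings can be sketched as follows. This is only an illustration and not MRMD2.0's actual procedure: the two scoring functions (`pearson`, `mean_gap`), the helper names, and the simple rank-sum aggregation are all assumptions chosen for brevity.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between a feature column and the labels."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no signal
    return cov / sqrt(vx * vy)


def mean_gap(xs, ys):
    """Absolute difference between class means (labels must be 0 or 1)."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg))


def rank_features(columns, labels, scorer):
    """Return feature indices ordered from best to worst for one scorer."""
    scores = [abs(scorer(col, labels)) for col in columns]
    return sorted(range(len(columns)), key=lambda j: -scores[j])


def aggregate_ranks(rankings):
    """Combine several rankings by summing each feature's positions."""
    n = len(rankings[0])
    total = [0] * n
    for ranking in rankings:
        for position, feature in enumerate(ranking):
            total[feature] += position
    return sorted(range(n), key=lambda j: total[j])


# Hypothetical toy data: three feature columns, four samples, binary labels.
columns = [[0, 0, 1, 1], [0, 1, 0, 1], [0.1, 0.2, 0.9, 0.8]]
labels = [0, 0, 1, 1]
rankings = [rank_features(columns, labels, s) for s in (pearson, mean_gap)]
combined = aggregate_ranks(rankings)
print(combined)
```

Keeping only the first few indices of `combined` would then give a reduced feature set; how many to keep is a separate choice that this sketch does not address.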
“…CTDC features represent the distribution patterns of amino acids for specific structural or physicochemical properties in a protein or peptide sequence. CTDC refers to the composition of CTD descriptors that are computed by the following procedures: 1) transforming amino acid sequences into sequences for structural or physicochemical properties; 2) according to Tomii and Kanehisa’s major amino acid index clustering, 20 amino acids were divided into three groups for each of the seven different physicochemical properties, detailed calculation of which could be seen in previous studies (Chen et al., 2020; Gu et al., 2020). In fact, CTDC has been successfully applied to the prediction of G protein-coupled receptors (Gu et al., 2020).…”
Section: Methods (mentioning)
confidence: 99%
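
The composition step described in the statements above can be sketched for a single property. This is a minimal illustration, not the cited authors' code. Group 3 of normalized van der Waals volume (M, H, K, F, R, Y, W) is taken from the first citation statement; groups 1 and 2 are assumed from the grouping commonly used in CTD implementations. The function name `ctdc` and the choice to divide by the full sequence length are also assumptions.

```python
# Three-group encoding for normalized van der Waals volume.
# G3 is quoted in the citation statement; G1 and G2 are assumed.
NORM_VDW_GROUPS = {
    "G1": "GASTPDC",
    "G2": "NVEQIL",
    "G3": "MHKFRYW",
}


def ctdc(sequence, groups=NORM_VDW_GROUPS, prefix="normwaalsvolume"):
    """Return the fraction of residues falling in each group.

    Residues outside the 20 standard amino acids count toward the
    sequence length but toward no group (an assumption of this sketch).
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return {
        f"{prefix}.{name}": sum(seq.count(aa) for aa in members) / len(seq)
        for name, members in groups.items()
    }


# Hypothetical six-residue peptide: G and A fall in G1; M, H, K, W in G3.
features = ctdc("GAMHKW")
for name, value in features.items():
    print(name, round(value, 3))
```

Repeating the same calculation for each of the seven properties would give 7 × 3 = 21 composition values per sequence.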