A set of amino acid descriptors covering hydrophobic, steric and electronic properties was applied to construct quantitative structure-activity relationship (QSAR) models for three peptide datasets (angiotensin-converting enzyme inhibitor dipeptides, bactericidal peptides and oxytocin peptides) using stepwise multiple regression combined with partial least squares regression (SMR-PLS). The QSAR models are robust, with squared multiple correlation coefficients (R²) and cross-validated coefficients (Q²) of 0.687 and 0.671; 0.977 and 0.890; and 0.950 and 0.802, respectively. These results suggest that the descriptors can be extended to polypeptides and can serve as a useful quantitative tool for rational drug design and discovery.

Keywords: Quantitative structure-activity relationship; Peptides; Stepwise multiple regression; Partial least squares regression.
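The two statistics quoted in the abstract measure different things: R² describes how well a model fits the data it was trained on, whereas Q² describes how well it predicts samples that were held out. The following is a minimal sketch of that distinction. It assumes leave-one-out cross-validation (the validation scheme is not stated in this section) and uses ordinary least squares on a single descriptor in place of SMR-PLS. The function names `fit_line`, `r_squared` and `q_squared_loo` are hypothetical and chosen only for illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x for one descriptor."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def r_squared(x, y):
    """Fitted R^2: 1 - (residual sum of squares) / (total sum of squares)."""
    a, b = fit_line(x, y)
    my = mean(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def q_squared_loo(x, y):
    """Leave-one-out Q^2: 1 - PRESS / (total sum of squares).

    Each sample is predicted by a model fitted without that sample
    (assumed scheme; the paper may use a different cross-validation).
    """
    my = mean(y)
    press = 0.0
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        a, b = fit_line(x_train, y_train)
        press += (y[i] - (a + b * x[i])) ** 2
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - press / ss_tot


# Made-up illustrative numbers, not data from the paper.
descriptor = [0.0, 1.0, 2.0, 3.0, 4.0]
activity = [0.1, 0.9, 2.2, 2.8, 4.1]
r2 = r_squared(descriptor, activity)
q2 = q_squared_loo(descriptor, activity)
```

For a least-squares model, Q² computed this way cannot exceed R², so a small gap between the two (as in the first dataset above) indicates that the fit is not inflated by overfitting.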
INTRODUCTION

Peptides play an important role in all living systems. They act as hormones, enzyme substrates, inhibitors, neurotransmitters and immunomodulating agents, and this has driven considerable pharmacological interest in the design and application of new drugs. [1] Quantitative structure-activity relationship (QSAR) analysis has attracted wide attention; it is used in pharmaceutical chemistry and pharmacology and forms a foundation of drug design. The essence of QSAR is to express the relationship between structural features and biological activity. If the relationship between peptide structure and biological activity can be established, new peptide drugs can be designed and synthesized on that basis. [2,3]

In the 1960s, Sneath et al. first represented peptide sequences using semi-quantitative experimental parameters of the 20 coded amino acids and successfully predicted the activities of hypophamine. [4] Kidera et al. collected 188 properties of the 20 natural amino acids and employed factor analysis to obtain 10 orthogonal factors related to the three-dimensional structures of proteins. [5] Hellberg et al. applied principal component analysis (PCA) to 29 physicochemical properties of each of the 20 natural amino acids, covering hydrophobic, steric and electronic properties, which were encoded as Z1, Z2 and Z3, respectively. [6-8] The resulting Z-scales have proven to be useful descriptors for modeling short peptides. Using two three-dimensional descriptors, isotropic surface area (ISA) and electronic charge index (ECI), Collantes et al. established good 3D-QSAR models. [9] As QSAR studies have advanced, a series of descriptors that represent the structural characteristics of amino acids well has been proposed for QSAR modeling, such as MS-WHIM, MARCH-INSIDE, VHSE, T-scales, VSW, V, HESH and ST-scale. [10-18]

Almost all of the descriptors mentioned above were derived by principal component analysis (PCA) of data matrices and can only roughly account for physicochemical properties of amino acids such as hydrophobicity, molecular volume and net charge. Each principal component is a linear combination to the...