Predicting the bioactivity of peptides is an important challenge in drug development and peptide research. In this study, numerical descriptive vectors (NDVs) for peptide sequences were calculated from the physicochemical properties of amino acids (AAs) using principal component analysis (PCA). Each resulting NDV had the same length as the peptide sequence, so that each entry of the NDV corresponded to one AA in the sequence. The NDVs were then applied to quantitative structure-activity relationship (QSAR) analysis of angiotensin-converting enzyme (ACE) inhibitor dipeptides, bitter-tasting dipeptides, and nonameric peptides that bind the human leukocyte antigen HLA-A*0201. Multiple linear regression was used to construct the QSAR models. For each peptide set, a suitable subset of physicochemical properties was selected by the ant colony optimization algorithm. For the three peptide sets, the leave-one-out cross-validation ($q^2_{\mathrm{loo}}$) values were 0.855, 0.936, and 0.642, and the root-mean-square errors (RMSEs) were 0.450, 0.149, and 0.461. Our results indicate that the new numerical descriptive vector characterizes peptide sequences extensively enough to be readily employed in peptide QSAR studies. Moreover, the proposed NDVs were able to identify hot spot residues in the peptides under study.
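The pipeline described above (per-residue descriptors from PCA, multiple linear regression, leave-one-out validation) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the study: each residue is scored by its first principal component only, the property table is a random placeholder rather than real physicochemical data, the activities are synthetic, and the ant colony optimization step is omitted. The names `first_pc_scores`, `ndv`, and `q2_loo` are hypothetical.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def first_pc_scores(props):
    """Project standardized property rows onto their first principal axis."""
    z = (props - props.mean(axis=0)) / props.std(axis=0)
    # Rows of vt are the principal axes of the standardized matrix.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    return z @ vt[0]

def ndv(sequence, scores):
    """One number per residue, so the vector length equals the peptide length."""
    return np.array([scores[aa] for aa in sequence])

def q2_loo(X, y):
    """Leave-one-out q^2 = 1 - PRESS / SS_total for least squares with intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        coef, *_ = np.linalg.lstsq(X1[keep], y[keep], rcond=None)
        press += (y[i] - X1[i] @ coef) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(0)
# Placeholder table: 20 amino acids x 5 made-up properties (NOT real data).
props = rng.normal(size=(20, 5))
scores = dict(zip(AMINO_ACIDS, first_pc_scores(props)))

# Synthetic dipeptides and synthetic activities, for illustration only.
peptides = ["".join(rng.choice(list(AMINO_ACIDS), size=2)) for _ in range(30)]
X = np.array([ndv(p, scores) for p in peptides])
y = X @ np.array([1.0, -0.5]) + rng.normal(scale=0.1, size=30)

print("q2_loo on synthetic data:", q2_loo(X, y))
```

Because every column of `X` maps to one sequence position, the fitted regression coefficients can be read position by position, which is one plausible way such descriptors could point to influential residues.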