We describe a new clustering program, SONHICA (Simple Optimized
Non‐HIerarchical Cluster Analysis), developed to analyze large data sets of
molecular conformations. Unlike traditional clustering methods, SONHICA
does not use an overall index, such as a distance, to evaluate
similarity between objects. Instead, each descriptor variable is compared
individually against a preset threshold value. This gives fine control
over, and high sensitivity to, each input variable. In addition, periodic and
nonperiodic descriptors, such as dihedral angles and interatomic distances,
can easily be used together. SONHICA generates clusters with the highest
possible density, and all pairs of objects within a cluster are similar.
These features make SONHICA particularly suitable for the analysis of data
sets that tend to form globular clusters. This method was applied to the
analysis of a modified linear tetrapeptide, ITF1697, under investigation
for its anti‐ischemic properties, and a cyclic pentapeptide, BQ123, a
potent antagonist of the endothelin A receptor. On the basis of the results presented
here, SONHICA appears to be an interesting new tool among the
clustering methods applied to the analysis of molecular conformations.
© 1997 John Wiley & Sons, Inc. J Comput Chem 18: 1295–1311, 1997
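
The per-descriptor comparison described above can be illustrated with a minimal sketch. This is not the SONHICA implementation: the function names, the use of degrees for periodic descriptors, and the example threshold values are all assumptions made for illustration. It shows only how two conformations might be judged similar when every descriptor, periodic or not, differs by no more than its own threshold, and how the "all pairs similar" property of a cluster could be checked.

```python
from itertools import combinations

def descriptor_diff(a, b, periodic):
    """Absolute difference between two descriptor values.

    Periodic descriptors (assumed here to be angles in degrees) are
    compared on the circle, so 179 and -179 differ by 2, not 358.
    """
    d = abs(a - b)
    if periodic:
        d %= 360.0
        d = min(d, 360.0 - d)
    return d

def similar(x, y, thresholds, periodic):
    """True if every descriptor differs by at most its own threshold.

    No overall index (such as a Euclidean distance) is computed:
    one descriptor out of tolerance is enough to reject the pair.
    """
    return all(
        descriptor_diff(a, b, p) <= t
        for a, b, t, p in zip(x, y, thresholds, periodic)
    )

def all_pairs_similar(cluster, thresholds, periodic):
    """Check that every pair of objects in a candidate cluster is similar."""
    return all(
        similar(x, y, thresholds, periodic)
        for x, y in combinations(cluster, 2)
    )

# Hypothetical conformations: (dihedral angle in degrees, distance in angstroms)
periodic = (True, False)
thresholds = (30.0, 0.5)   # illustrative values, not taken from the paper
conf_a = (179.0, 4.0)
conf_b = (-179.0, 4.3)
conf_c = (60.0, 4.1)

print(similar(conf_a, conf_b, thresholds, periodic))
print(all_pairs_similar([conf_a, conf_b, conf_c], thresholds, periodic))
```

Because each descriptor has its own threshold and its own periodicity flag, dihedral angles and interatomic distances can be mixed without rescaling them into a common metric.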