The K-means algorithm, routinely used in many scientific fields, generates clustering solutions that depend on the initial cluster coordinates. The number of solutions may be large, which can make locating the global minimum challenging. Hence, the topography of the cost function surface is crucial to understanding the performance of the algorithm. Here, we employ the energy landscape approach to elucidate the topography of the K-means cost function surface for Fisher’s Iris dataset. For any number of clusters, we find that the solution landscapes have a funneled structure that is usually associated with efficient global optimization. An analysis of the barriers between clustering solutions shows that the funneled structures result from remarkably small barriers between almost all clustering solutions. The funneled structure becomes less well-defined as the number of clusters increases, and we analyze kinetic analogs to quantify the increased difficulty in locating the global minimum for these different landscapes.
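The dependence of K-means solutions on the initial cluster coordinates can be illustrated with a minimal multi-start sketch. It is not the energy landscape method used in the work above: it simply runs Lloyd's algorithm from many random initialisations and collects the distinct converged cost values. The synthetic two-dimensional dataset, the helper names (`make_blobs`, `kmeans`, `multistart`), and all parameter values are assumptions chosen for illustration; Fisher's Iris data is not used here.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(cluster):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def make_blobs(rng, centres=((0, 0), (3, 0), (0, 3), (3, 3)), n=30, sigma=0.6):
    """Hypothetical toy data: Gaussian blobs around the given centres."""
    return [
        (rng.gauss(cx, sigma), rng.gauss(cy, sigma))
        for cx, cy in centres
        for _ in range(n)
    ]


def kmeans(points, k, rng, max_iter=100):
    """Lloyd's algorithm from one random initialisation; returns the final cost."""
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [
            centroid(cl) if cl else centroids[i] for i, cl in enumerate(clusters)
        ]
        if updated == centroids:  # converged: a local minimum of the cost
            break
        centroids = updated
    # Cost function: sum of squared distances to the nearest centroid.
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def multistart(points, k, n_starts, seed):
    """Run K-means from n_starts random initialisations; return all final costs."""
    rng = random.Random(seed)
    return [kmeans(points, k, rng) for _ in range(n_starts)]


points = make_blobs(random.Random(0))
costs = multistart(points, k=4, n_starts=20, seed=1)
distinct = sorted(set(round(c, 6) for c in costs))
print(f"{len(distinct)} distinct converged cost value(s); lowest = {distinct[0]:.4f}")
```

Each distinct cost value corresponds to a different local minimum reached from some initialisation. Counting them in this way gives only a rough picture of the solution space; it says nothing about the barriers between solutions, which is what the landscape analysis addresses.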
Rotamers, namely amino acid side chain conformations common to
many different peptides, can be compiled into libraries. These rotamer
libraries are used in protein modeling, where the limited conformational
space occupied by amino acid side chains is exploited. Here, we construct
a sequence-dependent rotamer library from simulations of all possible
tripeptides, which provides rotameric states dependent on adjacent
amino acids. We observe significant sensitivity of rotamer populations
to sequence and find that the library is successful in locating side
chain conformations present in crystal structures. The library is
designed for applications with basin-hopping global optimization,
where we use it to propose moves in conformational space. The addition
of rotamer moves significantly increases the efficiency of protein
structure prediction within this framework, and we determine parameters
to optimize efficiency.
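The idea of using a library to propose moves within basin-hopping can be sketched on a toy problem. The example below is a hedged illustration only, not the authors' implementation: the two-angle energy function, the three-value `LIBRARY`, the pattern-search minimiser (swapped in for a proper gradient-based minimiser), and the parameters `p_library` and `temperature` are all hypothetical. Each step either perturbs one angle at random or replaces it with a library value, then minimises locally and applies a Metropolis acceptance test.

```python
import math
import random

# Hypothetical "rotamer library": staggered dihedral values in degrees.
LIBRARY = (-60.0, 60.0, 180.0)


def energy(chi):
    """Toy energy for two dihedral angles (degrees); zero at (180, -60)."""
    c1, c2 = (math.radians(c) for c in chi)
    torsion = (1 + math.cos(3 * c1)) + (1 + math.cos(3 * c2))
    bias = 0.3 * (1 - math.cos(c1 - math.pi)) + 0.3 * (1 - math.cos(c2 + math.pi / 3))
    return torsion + bias


def local_minimise(x, step=10.0, tol=1e-4):
    """Simple pattern search: try +/- step on each angle, halve step when stuck."""
    x = list(x)
    e = energy(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for d in (step, -step):
                trial = x[:]
                trial[i] += d
                e_trial = energy(trial)
                if e_trial < e:
                    x, e, improved = trial, e_trial, True
        if not improved:
            step /= 2
    return x, e


def basin_hopping(n_steps, rng, p_library=0.5, temperature=0.5):
    """Basin-hopping with a mix of random perturbations and library moves."""
    x, e = local_minimise([rng.uniform(-180, 180) for _ in range(2)])
    best_x, best_e = x, e
    for _ in range(n_steps):
        trial = x[:]
        i = rng.randrange(len(trial))
        if rng.random() < p_library:
            trial[i] = rng.choice(LIBRARY)    # library move: jump to a stored state
        else:
            trial[i] += rng.uniform(-90, 90)  # generic random perturbation
        trial, e_trial = local_minimise(trial)
        # Metropolis criterion applied to the minimised energies.
        if e_trial <= e or rng.random() < math.exp(-(e_trial - e) / temperature):
            x, e = trial, e_trial
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


best_x, best_e = basin_hopping(n_steps=50, rng=random.Random(0))
print("lowest energy found:", round(best_e, 6))
```

Setting `p_library` to 0 recovers plain basin-hopping with random perturbations only, so the same function could in principle be used to compare the two move sets on a given problem.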