<p>The evolution of metabolism is a longstanding yet unresolved question, and
several hypotheses have been proposed to address this complex process from a
Darwinian point of view. Modern statistical bioinformatic approaches aimed at the comparative
analysis of genomes are being used to detect signatures of natural selection at
the gene and population levels, in an attempt to understand the origin of
primordial metabolism and its expansion. These studies, however, are still mainly
centered on genes and the proteins they encode, somewhat neglecting the small
organic chemicals that support life processes. In this work, we selected
steroids, an ancient family of metabolites
widely distributed in all eukaryotes, and applied unsupervised machine learning techniques to reveal the traits that natural selection has
imprinted on molecular properties throughout the evolutionary process. Our
results show that sterols, the first steroids to appear,
have more conserved properties and that more complex compounds
with increasingly diverse properties emerged thereafter, suggesting that chemical diversification
parallels the expansion of biological complexity. In a wider context, these
findings highlight the value of chemoinformatic approaches for better understanding the evolution of
metabolism.</p>