<div>
<div>
<div>
<p>Finding new medicines is one of the most important tasks of pharmaceutical companies. One of the best approaches to finding a new drug
starts with answering this simple question: Given a known effective drug
X, what are the top 100 molecules in our database most similar to X?
Thus the essence of the problem is a nearest-neighbors search, and the
key question is how to define the distance between two molecules in the
database. In this paper, we investigate the use of topological, rather than
geometric or chemical, signatures for molecules, and two notions of distance that come from comparing these topological signatures. We introduce PH_VS (Persistent Homology for Virtual Screening), a new
system for ligand-based screening using a topological technique known
as multi-parameter persistent homology. We show that our approach can
match or exceed a reasonable estimate of the current state of the art (including
well-funded commercial tools), even with relatively little domain-specific
tuning. Indeed, most of the components we have built for this system are
general-purpose tools for data science and will be released soon as
open-source software.
</p>
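<p>The nearest-neighbors formulation above can be illustrated with a minimal sketch. It is not the PH_VS implementation: the signature type (a plain numeric vector), the Euclidean distance, and the names <code>top_k_similar</code>, <code>euclidean</code>, and <code>toy_db</code> are all assumptions made for illustration. Any distance between topological signatures could be passed in place of the placeholder.</p>

```python
import heapq
import math
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical stand-in for a molecular signature: a fixed-length numeric vector.
Signature = Sequence[float]


def euclidean(a: Signature, b: Signature) -> float:
    """Placeholder distance; NOT the topological distance used by PH_VS."""
    return math.dist(a, b)


def top_k_similar(
    query: Signature,
    database: Dict[str, Signature],
    distance: Callable[[Signature, Signature], float] = euclidean,
    k: int = 100,
) -> List[Tuple[str, float]]:
    """Return up to k (name, distance) pairs closest to the query, nearest first."""
    scored = ((name, distance(query, sig)) for name, sig in database.items())
    return heapq.nsmallest(k, scored, key=lambda pair: pair[1])


# Toy data: made-up molecule names and made-up 2-D signatures.
toy_db = {
    "mol_a": (0.0, 1.0),
    "mol_b": (3.0, 4.0),
    "mol_c": (0.1, 0.9),
}
hits = top_k_similar((0.0, 1.0), toy_db, k=2)
```

<p>In this sketch the search is a brute-force scan over the database, so the choice of distance function is the only part that encodes any notion of molecular similarity.</p>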
</div>
</div>
</div>