Rare diseases impact
hundreds of millions of individuals worldwide.
However, few therapies exist to treat the rare disease population
because financial resources are limited, the number of patients affected
is low, bioactivity data is often nonexistent, and very few animal
models exist to support preclinical development efforts. Sialidosis
is an ultrarare lysosomal storage disorder in which mutations in the
NEU1 gene result in the deficiency of the lysosomal enzyme sialidase-1.
This enzyme catalyzes the removal of sialic acid moieties from glycoproteins
and glycolipids. The defective or deficient protein therefore leads
to the buildup of sialylated glycoproteins and to several characteristic
symptoms of sialidosis, including visual impairment, ataxia, hepatomegaly,
dysostosis multiplex, and developmental delay. In this study, we used
a bibliometric tool to generate links between lysosomal storage disease
(LSD) targets and existing bioactivity data that could be curated
to build machine learning models and screen compounds in silico. We focused
on sialidase as an example and used the data curated from the literature
to build a Bayesian model that was then used to score compound libraries
and rank these molecules
for in vitro testing. In vitro testing using microscale thermophoresis
identified two compounds, namely sulfameter (Kd 2.15 ± 1.02 μM) and
mexenone (Kd 8.88 ± 4.02 μM), which validated our approach to identifying
new molecules that bind this protein. These compounds could represent
possible drug candidates that can be evaluated further as potential
chaperones for this ultrarare lysosomal disease, for which there is
currently no treatment. Combining
bibliometric and machine learning approaches can assist in curating
small molecule data and building models, respectively, for rare disease
drug discovery. This approach can also identify new compounds that are
potential drug candidates.
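
To illustrate the scoring and ranking step described above, the following is a minimal sketch of a naive Bayes-style scorer over binary structural features. It is not the model used in this study: the feature names, compound identifiers, training sets, and the functions `train`, `score`, and `rank` are all hypothetical, and the smoothing scheme is an assumption. A real workflow would typically derive features from molecular fingerprints rather than hand-written labels.

```python
import math
from collections import Counter


def train(actives, inactives, alpha=1.0):
    """Estimate a log-likelihood ratio per feature with add-alpha smoothing.

    `actives` and `inactives` are lists of feature sets (toy stand-ins
    for fingerprint bits). This is an illustrative simplification.
    """
    vocab = set().union(*actives, *inactives)
    act_counts = Counter(f for fp in actives for f in fp)
    ina_counts = Counter(f for fp in inactives for f in fp)
    weights = {}
    for f in vocab:
        # Smoothed probability that the feature is present in each class.
        p_act = (act_counts[f] + alpha) / (len(actives) + 2 * alpha)
        p_ina = (ina_counts[f] + alpha) / (len(inactives) + 2 * alpha)
        weights[f] = math.log(p_act / p_ina)
    return weights


def score(weights, fp):
    """Sum the weights of features present; unseen features contribute 0."""
    return sum(weights.get(f, 0.0) for f in fp)


def rank(weights, library):
    """Return compound names ordered from highest to lowest score."""
    return sorted(library, key=lambda name: score(weights, library[name]),
                  reverse=True)


# Hypothetical toy data, not taken from the study.
actives = [{"sulfonamide", "pyrimidine"}, {"sulfonamide", "aniline"}]
inactives = [{"alkyl_chain", "ester"}, {"ester", "aniline"}]
library = {
    "cmpd_A": {"sulfonamide", "pyrimidine"},
    "cmpd_B": {"ester", "alkyl_chain"},
    "cmpd_C": {"aniline"},
}

weights = train(actives, inactives)
ranking = rank(weights, library)
print(ranking)
```

In this sketch only the features present in a compound contribute to its score, so compounds sharing features with the curated actives rise to the top of the list and would be prioritized for in vitro testing.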