Designing molecules
for drugs has been a hot topic for many decades.
However, finding a new molecule is difficult and expensive, which in
turn raises the cost of the final drug. Machine learning can provide
a fast way to predict the biological activity of druglike molecules.
In the present work, machine learning models are trained to predict
the biological activity of aromatase inhibitors. Data were collected
from the literature, and molecular descriptors were calculated for use
as independent features for model training. The results show that
the R² values for linear regression, random
forest regression, gradient boosting regression, and bagging regression
are 0.58, 0.84, 0.77, and 0.80, respectively. Using these models,
it is possible to predict the activity of new molecules in a short
period of time and at a reasonable cost. Furthermore, Tanimoto similarity
is used for similarity analysis, and a chemical database is
mined to search for similar molecules. Overall, this study provides
a framework for repurposing other effective drug molecules to prevent
cancer.
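
The similarity search mentioned above can be illustrated with a minimal sketch. It assumes each molecule is represented by the set of "on" bit positions of a binary fingerprint, and it uses the standard Tanimoto definition: shared bits divided by the bits present in either molecule. The fingerprints, the molecule names (mol_A, mol_B, mol_C), and the helper names tanimoto and rank_by_similarity are hypothetical and chosen only for illustration. They are not taken from this study, which does not specify its fingerprint type or database here.

```python
from typing import Dict, Iterable, List, Tuple


def tanimoto(fp_a: Iterable[int], fp_b: Iterable[int]) -> float:
    """Tanimoto coefficient between two fingerprints given as on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        # Assumption: two empty fingerprints are treated as dissimilar (0.0).
        return 0.0
    return len(a & b) / union


def rank_by_similarity(
    query_fp: Iterable[int],
    database: Dict[str, Iterable[int]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k database entries most similar to the query."""
    query = set(query_fp)
    scored = [(name, tanimoto(query, fp)) for name, fp in database.items()]
    # Sort by similarity, highest first.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy fingerprints (on-bit positions), not real molecules.
QUERY_FP = {1, 4, 7, 9}
TOY_DATABASE = {
    "mol_A": {1, 4, 7, 9, 12},
    "mol_B": {2, 3, 5},
    "mol_C": {1, 4, 8},
}

if __name__ == "__main__":
    for name, score in rank_by_similarity(QUERY_FP, TOY_DATABASE, top_k=2):
        print(f"{name}: {score:.2f}")
```

In practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets. The ranking step stays the same: score every database entry against the query and keep the highest-scoring candidates.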