2019
DOI: 10.1021/acs.jcim.9b00164

Data Mining Approach for Extraction of Useful Information About Biologically Active Compounds from Publications

Abstract: A lot of high quality data on the biological activity of chemical compounds are required throughout the whole drug discovery process: from development of computational models of the structure–activity relationship to experimental testing of lead compounds and their validation in clinics. Currently, a large amount of such data is available from databases, scientific publications, and patents. Biological data are characterized by incompleteness, uncertainty, and low reproducibility. Despite the existence of free…

Citations: cited by 18 publications (14 citation statements)
References: 39 publications
“…However, quantitative predictions obtained using our web-service for large and diverse databases may be incorrectly biased towards high activity because the training sets with quantitative data are biased towards the high active substances. This problem is solved by classification models that could be recommended as the first choice for prediction [28].…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…We used the set of 148 publications abstracts collected from NCBI PubMed. We used the workflow developed earlier (Tarasova et al, 2019). In this workflow we were focused on the publications that included the description of HIV inhibitors and included the details of biological experiments used for their testing.…”
Section: Algorithm Realization
Citation type: mentioning
confidence: 99%
“…Besides, the more pressing the problem for humanity is, the more articles devoted to this problem can be found in the repositories of scientific publications. The extraction of records from scientific publications provides the opportunity to analyze the information derived from primary sources; therefore, such an approach helps to obtain the most contemporary information (Cash, 2004; Tarasova et al., 2015, 2019; Saik et al., 2016). Currently, text-mining technologies aimed at rapid automated extraction of specific information are under rigorous development.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…To identify signaling pathways, first we manually mapped the initial entities, which were extracted by text-mining, to UniProt Accession numbers and obtained a list of 46 human proteins. Pathways enriched with 46 genes, were identified from the KEGG database [18] using the “Enrichr” R package. We selected pathways which included at least 3 genes from the 46 ones and adjusted the p-value to less than 0.05.…”
Section: Methods
Citation type: mentioning
confidence: 99%
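
The selection rule quoted in this statement (at least 3 of the input genes in a pathway, adjusted p-value below 0.05) can be illustrated with a minimal Python sketch. This is not the citing authors' code, which used the Enrichr R package. The `PathwayResult` record, the `select_pathways` helper, and the toy pathway and gene names are all hypothetical, and the adjusted p-values are assumed to have been computed already by the enrichment tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathwayResult:
    """One row of an enrichment result (hypothetical record layout)."""
    name: str             # pathway label (placeholder, not a real KEGG entry)
    overlap_genes: tuple  # input genes that fall into this pathway
    adjusted_p: float     # adjusted p-value, assumed precomputed upstream


def select_pathways(results, min_overlap=3, max_adjusted_p=0.05):
    """Keep pathways with enough overlapping genes and a small adjusted p-value."""
    return [
        r for r in results
        if len(r.overlap_genes) >= min_overlap and r.adjusted_p < max_adjusted_p
    ]


# Toy data invented for illustration only; no values come from the cited study.
toy_results = [
    PathwayResult("pathway_A", ("G1", "G2", "G3", "G4"), 0.001),
    PathwayResult("pathway_B", ("G1", "G2"), 0.0005),      # too few genes
    PathwayResult("pathway_C", ("G5", "G6", "G7"), 0.20),  # not significant
    PathwayResult("pathway_D", ("G2", "G5", "G8"), 0.049),
]

selected = select_pathways(toy_results)
print([r.name for r in selected])
```

Both thresholds are exposed as parameters so the two criteria can be varied independently. The quoted statement does not say which p-value adjustment was applied, so the sketch leaves that step to the upstream tool.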
“…In addition, there are a lot of data on the molecular mechanisms of HIV infection, regarding multiple pathways of virus–host interactions and the development of novel therapeutic approaches. Text and data mining approaches can be helpful for fast and accurately extracting information about chemical compounds and their biological activities, as well as proteins associated with molecular mechanisms of disease development [17, 18]. In this study, we applied text and data mining approaches to identify possible molecular pathways shared by HIV-1 and SARS-CoV-2.…”
Section: Introduction
Citation type: mentioning
confidence: 99%