2018
DOI: 10.3389/fphar.2018.00613

Extending in Silico Protein Target Prediction Models to Include Functional Effects

Abstract: In silico protein target deconvolution is frequently used for mechanism-of-action investigations; however, existing protocols usually do not predict compound functional effects, such as activation or inhibition, upon binding to their protein counterparts. This study is hence concerned with including functional effects in target prediction. To this end, we assimilated a bioactivity training set for 332 targets, comprising 817,239 active data points with unknown functional effect (binding data) and 20,761,260 ina…

Citations: cited by 6 publications (3 citation statements)
References: 35 publications
“…[21] More specifically, published in-house models were trained using a very different number of "binding" or "functionally active" compounds (pXC50 better than 5.0 (10 μM) at a target) across the models, with a median of 752 training instances and a (∼6.5 times larger) standard deviation of 4954 training points per protein target model.[22] Benchmark results highlighted models trained with the fewest data points assigned significantly lower probability estimates (frequently below 0.5) because of the narrow coverage of chemical space. Training set size implicitly affects probability calibration performance between protein families to a disproportionate extent for this reason because enzymes and kinases dominate protein family distributions (∼30% and ∼34% of the training data set size[23]).…”
Section: Introduction (mentioning)
confidence: 99%
“…For example, bioactivity data (compound–protein target associations) in chemogenomic repositories comprise very different data distributions across the experimental data available (i.e., highly different numbers of compound annotations, target family classes, and the degree of diversity in the chemical matter tested). More specifically, published in-house models were trained using a very different number of “binding” or “functionally active” compounds (pXC50 better than 5.0 (10 μM) at a target) across the models, with a median of 752 training instances and a (∼6.5 times larger) standard deviation of 4954 training points per protein target model. Benchmark results highlighted models trained with the fewest data points assigned significantly lower probability estimates (frequently below 0.5) because of the narrow coverage of chemical space.…”
Section: Introduction (mentioning)
confidence: 99%
“…different target classes, number of compounds, with higher or lower diversity) [18]. For example, published models using in-house data comprised a very different number of active compounds across the proteins modelled, with a median of 752 and (~6.5 times larger) standard deviation of 4,954 compounds per protein target model [19]. Another study modelled 15 different protein families, with a range of 17 to 615 targets per family, and (comparatively large) standard deviation of 174 targets across families [20].…”
Section: Introduction (mentioning)
confidence: 99%