2018
DOI: 10.1109/tnnls.2017.2727545

Fast Kronecker Product Kernel Methods via Generalized Vec Trick

Abstract: Kronecker product kernel provides the standard approach in the kernel methods literature for learning from graph data, where edges are labeled and both start and end vertices have their own feature representations. The methods allow generalization to such new edges, whose start and end vertices do not appear in the training data, a setting known as zero-shot or zero-data learning. Such a setting occurs in numerous applications, including drug-target interaction prediction, collaborative filtering, and information retrieval. […]
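To make the setting concrete: the Kronecker product kernel scores a pair of edges by multiplying a start-vertex kernel with an end-vertex kernel, so the full edge-kernel matrix over all start/end combinations is the Kronecker product of the two vertex kernel matrices. A minimal numpy sketch (illustrative only, not the paper's code; the matrices below are toy examples):

```python
import numpy as np

# Vertex kernels for 3 start vertices (e.g., drugs) and 2 end vertices
# (e.g., targets). In practice these come from domain-specific kernels.
K_start = np.array([[1.0, 0.5, 0.2],
                    [0.5, 1.0, 0.3],
                    [0.2, 0.3, 1.0]])
K_end = np.array([[1.0, 0.4],
                  [0.4, 1.0]])

# Kronecker product kernel: the kernel between edges (i, j) and (k, l)
# is K_start[i, k] * K_end[j, l].
K_edges = np.kron(K_start, K_end)   # shape (6, 6), one row/column per edge
```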

Cited by 24 publications (9 citation statements)
References: 56 publications
“…We constructed the protein kinase kernel using normalized Smith–Waterman alignment scores between full amino acid sequences, and four Tanimoto compound kernels based on the following fingerprints implemented in the rcdk R package [37]: (i) the 881-bit fingerprint defined by PubChem (pubchem), (ii) a path-based 1024-bit fingerprint (standard), (iii) a 1024-bit fingerprint based on the shortest paths between atoms, taking into account ring systems and charges (shortestpath), and (iv) an extended-connectivity 1024-bit fingerprint with a maximum diameter set to 6 (ECFP6; circular). We used CGKronRLS as the learning algorithm (implementation available at https://github.com/aatapa/RLScore) [38]. We conducted a nested cross-validation in order to evaluate the generalization performance of CGKronRLS with each pair of kinase and compound kernels, as well as to tune the regularization hyperparameter of the model.…”
Section: Methods (confidence: 99%)
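The quoted pipeline builds its compound kernels with the rcdk R package; as a language-neutral illustration, the Tanimoto kernel itself reduces to a few matrix operations on binary fingerprints. A minimal numpy sketch (the function name and data are illustrative, not from the cited work):

```python
import numpy as np

def tanimoto_kernel(F):
    """Tanimoto kernel for binary fingerprints (one row per compound).

    For binary vectors x, y: T(x, y) = <x, y> / (<x, x> + <y, y> - <x, y>).
    """
    inner = F @ F.T                   # shared on-bits for every pair
    bits = np.diag(inner)             # on-bit counts per compound
    return inner / (bits[:, None] + bits[None, :] - inner)

# Hypothetical 1024-bit fingerprints for five compounds
F = (np.random.rand(5, 1024) > 0.9).astype(float)
K_compound = tanimoto_kernel(F)
```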
“…To enable a fine-grained discrimination of binding affinities between similar targets (e.g., kinase family members), the team Q.E.D explicitly introduced similarity matrices of compounds and targets as input features into their regression model. The regression model was implemented as an ensemble version (uniformly averaged predictor) of 440 CGKronRLS regressors (CGKronRLS v0.81) [38, 40], but with different choices of regularization strengths [0.1, 0.5, 1.0, 1.5, 2.0], training epochs [400, 410, …, 500], and similarity matrices: the protein similarity matrix was derived based on the normalized striped Smith–Waterman alignment scores [41] between full protein sequences (https://github.com/mengyao/Complete-Striped-Smith-Waterman-Library). Eight different alternatives of compound similarity matrices were computed using both Tanimoto and Dice similarity metrics for different variants of 1024-bit Morgan fingerprints [42] (‘radius’ [2, 3] and ‘useChirality’ [True, False]; implementation available at https://github.com/rdkit/rdkit).…”
Section: Methods (confidence: 99%)
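For reference, computing one of the Morgan-fingerprint similarity variants described above with RDKit might look like the following sketch (the SMILES strings are placeholders; the quoted work sweeps the radius, chirality, and Tanimoto/Dice choices to obtain its eight variants):

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

smiles = ["CCO", "CC(=O)O", "c1ccccc1O"]   # placeholder compounds
mols = [Chem.MolFromSmiles(s) for s in smiles]

# One variant: 1024-bit Morgan fingerprints, radius 2, no chirality
fps = [AllChem.GetMorganFingerprintAsBitVect(m, 2, nBits=1024,
                                             useChirality=False)
       for m in mols]

tanimoto = DataStructs.TanimotoSimilarity(fps[0], fps[1])
dice = DataStructs.DiceSimilarity(fps[0], fps[1])
```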
“…A conventional Stochastic Gradient Descent (SGD) [58] can result in slow convergence. Therefore, we use an alternative approach that leverages the specific structure of our embedding φ, as was previously done in Airola and Pahikkala [59]. Specifically, we exploit: (1) the tensor product nature of φ and (2) the fact that the sizes n_M and n_P of the input databases are much smaller than the number n_Z of interactions.…”
Section: Methods (confidence: 99%)
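The structure being exploited is the classical vec-trick identity behind the paper's title: (A ⊗ B) vec(X) = vec(A X B^T) (with row-major vec), which avoids ever materializing the Kronecker product; the paper generalizes this to the case where only a subset of the edges is labeled. A minimal numpy sketch of the basic identity:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))    # e.g., molecule-side factor
B = rng.standard_normal((5, 6))    # e.g., protein-side factor
X = rng.standard_normal((4, 6))

naive = np.kron(A, B) @ X.ravel()  # materializes a (15 x 24) matrix
fast = (A @ X @ B.T).ravel()       # two small matrix products instead

assert np.allclose(naive, fast)
```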
“…Gaussian processes have been used in many applications for temporal and spatial prediction, such as environmental surveillance [19], reconstruction of sea surface temperatures [20], drug–target interaction prediction [21], global land-surface precipitation prediction [22], and wind power forecasting [23], as well as spatiotemporal modeling [24, 25]. There is also a significant number of studies on Gaussian processes with application to epidemiology [26–29].…”
Section: Methods (confidence: 99%)
“…For large data sets, Gaussian processes might become computationally intensive. Several decomposition algorithms have been previously proposed to make the inference faster, such as Nyström approximation [11], approximation using Hadamard and diagonal matrices [30], or Kronecker methods [21, 31–36].…”
Section: Methods (confidence: 99%)
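As a sketch of why Kronecker structure helps (a generic illustration, not any one cited method): when the covariance over a complete grid factorizes as K = K1 ⊗ K2, the GP linear solve (K + σ²I)⁻¹y can be done with per-factor eigendecompositions, costing O(n1³ + n2³) instead of O((n1·n2)³) for the dense solve:

```python
import numpy as np

def kron_gp_solve(K1, K2, y, noise):
    """Solve (K1 kron K2 + noise * I) alpha = y without forming the Kronecker product."""
    w1, Q1 = np.linalg.eigh(K1)
    w2, Q2 = np.linalg.eigh(K2)
    Y = y.reshape(K1.shape[0], K2.shape[0])
    # Rotate into the joint eigenbasis: (Q1 kron Q2)^T y = vec(Q1^T Y Q2)
    Yt = Q1.T @ Y @ Q2
    # Eigenvalues of K1 kron K2 are all pairwise products w1[i] * w2[j]
    At = Yt / (np.outer(w1, w2) + noise)
    return (Q1 @ At @ Q2.T).ravel()

# Check against the dense solve on a tiny hypothetical grid
rng = np.random.default_rng(1)
A1 = rng.standard_normal((4, 4)); K1 = A1 @ A1.T
A2 = rng.standard_normal((3, 3)); K2 = A2 @ A2.T
y = rng.standard_normal(12)
dense = np.linalg.solve(np.kron(K1, K2) + 0.1 * np.eye(12), y)
assert np.allclose(kron_gp_solve(K1, K2, y, 0.1), dense)
```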