In modern machine learning, attention computation is a fundamental task for training Transformer-based large language models such as GPT-4 and ChatGPT. In this work, we study an exponential regression problem inspired by the softmax/exp unit in the attention mechanism of large language models. The standard exponential regression is non-convex. We study a regularized version of the exponential regression problem, which is convex, and use an approximate Newton method to solve it in input-sparsity time. Formally, in this problem, one is given a matrix $A \in \mathbb{R}^{n \times d}$, vectors $b \in \mathbb{R}^n$ and $w \in \mathbb{R}^n$, and any of the functions $\exp$, $\cosh$, and $\sinh$, denoted $f$. The goal is to find the optimal $x$ that minimizes $0.5 \| f(Ax) - b \|_2^2 + 0.5 \| \operatorname{diag}(w) A x \|_2^2$. A straightforward approach is to use the naive Newton's method. Let $\mathrm{nnz}(A)$ denote the number of nonzero entries in matrix $A$. Let $\omega$ denote the exponent of matrix multiplication; currently, $\omega \approx 2.373$. Let $\epsilon$ denote the accuracy parameter. In this paper, we exploit the input sparsity and propose an algorithm that uses $\log(\| x_0 - x^* \|_2 / \epsilon)$ iterations and $O(\mathrm{nnz}(A) + d^{\omega})$ time per iteration to solve the problem.
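As a concrete illustration of the objective and the Newton iteration described above, here is a minimal sketch for the $f = \exp$ case. It is not the input-sparsity algorithm of this paper: it forms the exact $d \times d$ Hessian and solves the Newton system densely, so each iteration costs roughly $O(n d^2 + d^{\omega})$ rather than $O(\mathrm{nnz}(A) + d^{\omega})$. The function name, tolerance, iteration cap, and demo data are illustrative choices, not from the source.

```python
import numpy as np

def newton_exp_regression(A, b, w, x0, eps=1e-8, max_iter=50):
    """Exact-Newton sketch (illustrative, not the paper's algorithm) for
    min_x 0.5*||exp(Ax) - b||_2^2 + 0.5*||diag(w) A x||_2^2."""
    x = x0.copy()
    for _ in range(max_iter):
        u = A @ x
        fu = np.exp(u)
        # Gradient: A^T [exp(u) * (exp(u) - b)] + A^T diag(w)^2 (A x)
        grad = A.T @ (fu * (fu - b)) + A.T @ (w**2 * u)
        if np.linalg.norm(grad) < eps:
            break
        # Hessian: A^T D A with diagonal
        # D = exp(u)*(exp(u) - b) + exp(u)^2 + w^2
        # (f'' * (f - b) + (f')^2 for f = exp, plus the regularizer term)
        D = fu * (fu - b) + fu**2 + w**2
        H = A.T @ (D[:, None] * A)
        x -= np.linalg.solve(H, grad)
    return x

if __name__ == "__main__":
    # Toy demo with hypothetical data; small entries keep exp(Ax) well-behaved.
    rng = np.random.default_rng(0)
    n, d = 100, 5
    A = rng.standard_normal((n, d)) * 0.1
    b = np.exp(A @ rng.standard_normal(d))
    w = np.ones(n)  # regularization weights
    x_hat = newton_exp_regression(A, b, w, np.zeros(d))
```

The gap between this sketch and the stated guarantee is exactly the Hessian step: the paper's per-iteration bound comes from replacing the exact $A^\top D A$ computation with an approximate Hessian that can be formed in input-sparsity time, while the approximate Newton analysis preserves the $\log(\|x_0 - x^*\|_2/\epsilon)$ iteration count.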