2022
DOI: 10.1101/2022.09.21.508821
Preprint

Predicting gene and protein expression levels from DNA and protein sequences with Perceiver

Abstract: The functions of an organism and its biological processes derive from the expression and activity of genes and proteins. Therefore, quantifying and predicting gene and protein expression values is a crucial aspect of scientific research. Concerning the prediction of gene expression values, the available machine learning-based approaches use the gene sequence, with the succession of nitrogenous bases, as input to the neural network models. Some techniques, including Xpresso and Basenji, have been proposed to pr…

Cited by 2 publications (12 citation statements)
References 28 publications
“…Taking all lineages together, CNN’s OPP is improved by 14.3% by E2VD. The situation is similar in expression prediction task (Fig.5c), where E2VD outperforms the other two competitors on most lineages, with the OPP achieving an 10.2% improvement over DNAPerceiver [25] for the mixture of all lineages. Overall, E2VD surpasses state-of-the-art methods across the board on out-of-distribution lineages, demonstrating enhanced generalization performance.…”
Section: Results (mentioning)
confidence: 75%
“…The accuracy of the state-of-the-art method NN MM-GBSA [20] in the classification task is improved by 10.7% by E2VD, along with a 8% improvement of the Pearson’s Correlation Coefficient (PCC) in the regression task. Second, for expression prediction, three models from two previous studies [25, 26] are used for comparison. The accuracy of the state-of-the-art method DNAPerceiver [25] in the classification task is improved by E2VD by 20.8%, and its PCC in the regression task is improved by 13.5%.…”
Section: Results (mentioning)
confidence: 99%
“…Recent studies have used transformer‐based models to analyze protein expression data, particularly in drug discovery and development. For instance, Stefatini utilized transformers architecture to predict gene and protein expression levels from DNA and protein sequences [56]. At the same time, DeepMind proposed a model called Enformer to study how non‐coding DNA influences gene expression in different cell types [84].…”
Section: Applications In Proteome Bioinformatics (mentioning)
confidence: 99%