2022
DOI: 10.1038/s41597-022-01779-4

A dataset comprised of binding interactions for 104,972 antibodies against a SARS-CoV-2 peptide

Abstract: The dataset presented here contains quantitative binding scores of scFv-format antibodies against a SARS-CoV-2 target peptide, collected via an AlphaSeq assay, that can be used in the development and benchmarking of machine learning models. Starting from three seed sequences identified from a phage display campaign using a human naïve library, four sets of 29,900 antibodies were designed in silico by creating all k = 1 mutations and random k = 2 and k = 3 mutations throughout the complementarity-determining region…
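
The in silico design step described in the abstract (all k = 1 mutations plus randomly sampled k = 2 and k = 3 mutations) can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy seed sequence, the mutable positions, the function names, and the sample sizes below are all hypothetical, and the sketch assumes "mutation" means an amino-acid substitution at positions inside a designated region.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def single_mutants(seq, positions):
    """Enumerate every k = 1 substitution at the given positions."""
    mutants = []
    for pos in positions:
        for aa in AMINO_ACIDS:
            if aa != seq[pos]:
                mutants.append(seq[:pos] + aa + seq[pos + 1:])
    return mutants


def random_mutants(seq, positions, k, n, rng):
    """Sample up to n distinct mutants carrying exactly k substitutions."""
    # Never ask for more variants than can exist.
    max_possible = math.comb(len(positions), k) * 19 ** k
    target = min(n, max_possible)
    found = set()
    while len(found) < target:
        chars = list(seq)
        for pos in rng.sample(positions, k):
            # Pick any residue other than the one already at this position.
            chars[pos] = rng.choice([aa for aa in AMINO_ACIDS if aa != seq[pos]])
        found.add("".join(chars))
    return sorted(found)


if __name__ == "__main__":
    seed = "ACDEFGHIKL"        # hypothetical toy sequence, not a real scFv
    cdr = list(range(2, 6))    # hypothetical mutable positions
    rng = random.Random(0)     # fixed seed for reproducibility
    print(len(single_mutants(seed, cdr)))  # 4 positions x 19 residues = 76
    print(random_mutants(seed, cdr, k=2, n=5, rng=rng))
```

Exhaustive enumeration is feasible for k = 1 because the count grows linearly with the number of positions, while the k = 2 and k = 3 spaces grow combinatorially, which is presumably why those were sampled at random.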

Cited by 20 publications (25 citation statements)
References 26 publications
“…The antibody covid specificity Dataset (AlphaSeq) [40] was downloaded from https://zenodo.org/record/5095284. AlphaSeq consists of 71,415 paired heavy and light chains, with continuous measurements of their affinity to an epitope of the SARS-CoV-2 spike protein.…”
Section: Methods (citation type: mentioning)
confidence: 99%
“…We evaluate PLM-based antibody affinity predictions through the lens of the AlphaSeq dataset [40], which provides a continuous label for the estimated binding affinity of antibodies to a specific SARS-CoV-2 epitope, including several replicate measurements for many sequences. We use PLMs to compute fixed-size embeddings of the antibody protein sequences, which reduces the task of predicting the affinity to a supervised regression problem, mapping embedding vectors to a continuous number.…”
Section: Model Architecture For Antibody Specificity Prediction (citation type: mentioning)
confidence: 99%
“…We generated our supervised training data using an engineered yeast mating assay and have published it separately to support its reuse [23]. This data includes 17,118 heavy chain scFv binding measurements and 26,223 light chain scFv binding measurements against a single target peptide and were collected in pooled, yeast-based mating assays.…”
Section: Results (citation type: mentioning)
confidence: 99%
“…Experimental Binding Measurements for Sequence-to-Affinity Model Training. We separately published the training data to support ease-of-reuse [23]. Briefly, experimental measurements were made, in technical triplicate, by A-Alpha Bio, LLC and are a "predicted affinity" value [27].…”
Section: Methods (citation type: mentioning)
confidence: 99%
“…14H and 14L: The 14H and 14L datasets are sourced from the LL-SARS-CoV-2 database [36]. This database was designed to choose two heavy chains and two light chains from three main chains of antibodies as the framework for constructing an antibody library.…”
Section: Methods (citation type: mentioning)
confidence: 99%