Taylor Sorenson scite author profile

Taylor Sorenson

4 Publications

57 Citation Statements Received

215 Citation Statements Given

Affiliations

Massachusetts General Hospital, Harvard University

Publications

Simulation-assisted machine learning

Deist, Patti, Wang, et al. 2019

Motivation: In a predictive modeling setting, if sufficient details of the system behavior are known, one can build and use a simulation for making predictions. When sufficient system details are not known, one typically turns to machine learning, which builds a black-box model of the system using a large dataset of input sample features and outputs. We consider a setting which is between these two extremes: some details of the system mechanics are known but not enough for creating simulations that can be used to make high quality predictions. In this context we propose using approximate simulations to build a kernel for use in kernelized machine learning methods, such as support vector machines. The results of multiple simulations (under various uncertainty scenarios) are used to compute similarity measures between every pair of samples: sample pairs are given a high similarity score if they behave similarly under a wide range of simulation parameters. These similarity values, rather than the original high dimensional feature data, are used to build the kernel.

Results: We demonstrate and explore the simulation-based kernel (SimKern) concept using four synthetic complex systems: three biologically inspired models and one network flow optimization model. We show that, when the number of training samples is small compared to the number of features, the SimKern approach dominates over no-prior-knowledge methods. This approach should be applicable in all disciplines where predictive models are sought and informative yet approximate simulations are available.

Availability and implementation: The Python SimKern software, the demonstration models (in MATLAB, R), and the datasets are available at https://github.com/davidcraft/SimKern.

Supplementary information: Supplementary data are available at Bioinformatics online.
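The kernel construction described in the Motivation of this abstract can be illustrated with a minimal sketch. This is not the SimKern software linked above. The simulator `toy_simulation`, the Gaussian parameter sampling, and the "fraction of simulations with matching outcomes" similarity are all assumptions chosen for illustration; the abstract only states that pairs behaving similarly across many simulation parameters receive a high similarity score.

```python
import random


def toy_simulation(features, theta):
    """Hypothetical approximate simulator (an assumption, not from SimKern).

    Returns a binary outcome for one sample under one parameter draw.
    """
    score = sum(w * x for w, x in zip(theta, features))
    return 1 if score > 0 else 0


def simulation_similarity(samples, n_sims=200, seed=0):
    """Build an n x n similarity matrix from repeated uncertain simulations.

    Entry [i][j] is the fraction of parameter draws under which samples i
    and j produce the same simulated outcome.
    """
    rng = random.Random(seed)
    n = len(samples)
    n_features = len(samples[0])
    agree = [[0] * n for _ in range(n)]
    for _ in range(n_sims):
        # One "uncertainty scenario": draw the unknown simulation parameters.
        theta = [rng.gauss(1.0, 0.5) for _ in range(n_features)]
        outcomes = [toy_simulation(s, theta) for s in samples]
        for i in range(n):
            for j in range(n):
                if outcomes[i] == outcomes[j]:
                    agree[i][j] += 1
    return [[count / n_sims for count in row] for row in agree]


if __name__ == "__main__":
    # Two similar samples and one sample with the opposite sign pattern.
    demo_samples = [[1.0, 2.0], [1.1, 2.1], [-1.0, -2.0]]
    for row in simulation_similarity(demo_samples, n_sims=100, seed=1):
        print(row)
```

Because each simulation contributes an indicator matrix of matching outcomes, the averaged matrix is symmetric with ones on the diagonal and is positive semidefinite, so it could in principle be passed to a kernel method that accepts a precomputed Gram matrix, for example scikit-learn's `SVC(kernel="precomputed")`.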

Learning the Language of Antibody Hypervariability

Singh, Sorenson³, et al. 2023

Preprint

Protein language models (PLMs) based on machine learning have demonstrated impressive success in predicting protein structure and function. However, general-purpose ("foundational") PLMs have limited performance in predicting antibodies due to the latter's hypervariable regions, which do not conform to the evolutionary conservation principles that the models rely on. In this study, we propose a new transfer learning framework called AbMAP, which fine-tunes foundational models for antibody-sequence inputs by supervising on antibody structure and binding specificity examples. Our feature representations accurately predict an antibody's 3D structure, mutational effects on antigen binding, and paratope identification. AbMAP's scalability paves the way for large-scale analyses of human antibody repertoires. AbMAP representations of repertoires reveal a remarkable overlap across individuals, transcending the limits of sequence analyses. Our findings provide compelling evidence for the hypothesis that antibody repertoires of individuals tend to converge towards comparable structural and functional coverage. We anticipate AbMAP will accelerate the efficient design and modeling of antibodies and expedite the discovery of antibody-based therapeutics.

Surface ID: a geometry-aware system for protein molecular surface comparison

Riahi, Lee², Sorenson³, et al. 2023

Motivation: A protein can be represented in several forms, including its 1D sequence, 3D atom coordinates, and molecular surface. A protein surface contains rich structural and chemical features directly related to the protein’s function such as its ability to interact with other molecules. While many methods have been developed for comparing similarity of proteins using the sequence and structural representations, computational methods based on molecular surface representation are limited.

Results: Here, we describe “Surface ID”, a geometric deep learning system for high-throughput surface comparison based on geometric and chemical features. Surface ID offers a novel grouping and alignment algorithm useful for clustering proteins by function, visualization, and in-silico screening of potential binding partners to a target molecule. Our method demonstrates top performance in surface similarity assessment, indicating great potential for protein functional annotation, a major need in protein engineering and therapeutic design.

Availability: Source code for the Surface ID model, trained weights and inference script are available under an open-source (Apache Version 2.0) license at https://github.com/Sanofi-Public/LMR-SurfaceID.

Supplementary information: Supplementary data are available at Bioinformatics online.

Weird Emotions? Causal Inferences and Emotional Language

Shears¹, Sorenson², Ung³, et al. 2011
