Model-informed drug development has advanced rapidly in recent years and has the potential to reduce animal experiments and accelerate drug discovery. Physiologically based pharmacokinetic (PBPK) and machine learning (ML) models are commonly used in early drug discovery to predict drug properties. However, basic PBPK models require a large number of molecule-specific inputs from in vitro experiments, which limits their efficiency and accuracy. To address this issue, this paper introduces a new computational platform that combines ML and PBPK models. The platform predicts the PK profiles of molecules with high accuracy and without the need for experimental data. This study developed a whole-body PBPK model and ML models of the fraction unbound in plasma (fup), Caco-2 cell permeability, and total plasma clearance to predict the PK of small molecules. Pharmacokinetic profiles were simulated using a "bottom-up" PBPK modeling approach with ML-predicted inputs. The platform's accuracy was evaluated on 40 compounds. The ML-PBPK model predicted the area under the concentration-time curve (AUC) within a 2-fold range for 62.5% of the compounds, compared with 47.5% when in vitro inputs were used. The ML-PBPK platform provides high prediction accuracy and reduces the number of experiments and the time required compared with traditional PBPK approaches. The platform predicts human PK parameters without in vitro or in vivo experiments and can potentially guide early drug discovery and development.
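The "within a 2-fold range" criterion used above can be illustrated with a short calculation. The following Python sketch is not taken from the paper: the function name `fraction_within_fold` and the AUC values are hypothetical, and it assumes the common convention that a prediction counts as accurate when the predicted/observed ratio lies between 1/2 and 2, inclusive.

```python
def fraction_within_fold(predicted, observed, fold=2.0):
    """Return the fraction of predictions whose predicted/observed ratio
    lies within [1/fold, fold]. Assumes all values are positive."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = 0
    for p, o in zip(predicted, observed):
        ratio = p / o
        if 1.0 / fold <= ratio <= fold:
            hits += 1
    return hits / len(predicted)


# Hypothetical AUC values (arbitrary units), for illustration only.
observed = [10.0, 20.0, 5.0, 8.0]
predicted = [12.0, 45.0, 2.6, 3.0]

# Ratios are 1.2, 2.25, 0.52 and 0.375, so two of four fall within 2-fold.
result = fraction_within_fold(predicted, observed)
print(result)  # 0.5
```

Under this convention, the reported 62.5% and 47.5% over 40 compounds correspond to 25 and 19 compounds within 2-fold, respectively.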
Antibodies are a specific class of proteins produced by the adaptive immune system in response to invading pathogens, and mining the information contained in antibody amino acid sequences can benefit both antibody property prediction and novel therapeutic development. Protein-specific pre-training models have been used to extract latent representations from protein sequences containing structural, functional, and homologous information. However, there is still room for improvement in pre-training models for antibody sequences. On the one hand, existing protein pre-training models mainly reuse pre-trained language models without fully considering the differences between protein sequences and language sequences; on the other hand, antibodies have unique characteristics compared with other proteins, which should be incorporated using specifically designed training methods. Here, we present a pre-trained model of antibody sequences, Pre-training with A Rational Approach for antibodies (PARA), that employs a training strategy conforming to antibody sequence patterns and an advanced NLP self-encoding model structure. We evaluate PARA on several tasks by comparing it with several published pre-trained models of antibodies. The results show that PARA significantly outperforms the selected antibody pre-training models on these tasks, suggesting that PARA has an advantage in capturing antibody sequence information. To the best of our knowledge, PARA is the first antibody language model that takes into account the features of antibody sequences. We believe that the antibody latent representation provided by PARA can substantially facilitate studies in relevant areas, such as antibody structure prediction, affinity prediction, and antibody de novo design.
Background and significance: The global antibody drug market was worth over $200 billion in 2021 and is expected to reach $380 billion by 2030. Antibody discovery is one of the most critical steps and determines crucial properties of antibody drugs, such as efficacy, safety, and developability. Traditional methods based on mouse immunization have many drawbacks that limit drug discovery, including long timelines, high costs, the inability to target function-specific epitopes, unsuitability for low-immunogenicity and difficult-to-prepare antigens, the need to sacrifice mice, and the need for further humanization to reduce immunogenicity. Here we report an antibody de novo design computational workflow that utilizes high-quality internally produced antibody data and advanced AI models. Using this workflow, we can de novo design antibodies that bind to user-specified functional epitopes with high affinity and specificity. Compared with classical wet-lab methods, the entire process is shortened from several months to several days and is suitable for low-immunogenicity and difficult-to-prepare antigens. Notably, because humanized mouse-generated antibodies (Renlite, bearing a common light chain, from Biocytogen) are used as training data for the AI models, the designed antibodies have a high degree of humanization and good developability, effectively avoiding issues such as ADA and aggregation in subsequent processes. Methods: First, with the help of Renlite, we combined mouse immunization, B cell sorting with FACS, NGS single-cell sequencing, and bioinformatics analysis to internally generate a large amount of high-quality antibody sequence data. Second, we developed AI models for antigen-specific antibody selection and epitope prediction (bioRxiv, 2022: 2022.12.22.521634) to mine antigen-specific antibodies and the corresponding antigen epitopes in the data.
Based on the processed high-quality data, we trained an affinity prediction model that can accurately predict whether an antigen epitope and an antibody sequence can bind to each other. In addition, using the sequence data, we trained an antibody sequence pre-training language model (bioRxiv, 2023: 2023.01.19.524683), which can generate high-quality antibody sequences that simulate the antibodies produced by mouse immunization. Finally, integrating the above AI models, we established an antibody de novo design computational workflow that simulates the biological process of antibody generation and affinity maturation in the mouse immune system and can be seen as a “DigitalMouse”. Results: In a test case, 1 million antibodies were designed to bind a specific epitope of an antigen. Ten antibodies were selected and expressed, and binding affinity was determined using BLI. Two of the 10 antibodies had KD values of 194 nM and 336 nM, respectively, with a concentration-dependent signal increase on BLI. These antibodies have great potential as starting points for candidate molecules for further in vitro and in vivo experimental validation and clinical trials. Conclusions: The AI-based antibody de novo design workflow will revolutionize the antibody discovery paradigm, greatly shorten the antibody discovery phase, reduce R&D costs, and expand antibody discovery to more antigen targets that are difficult to address with animal immunization. The computational workflow will have a profound impact on the entire biopharmaceutical industry.
High-content analysis (HCA) holds enormous potential for drug discovery and research, but widely used methods can be cumbersome and yield inaccurate results. Noise and high similarity in cell images impede the accuracy of deep learning-based image analysis. To address these issues, we introduce More Is Different (MID), a novel HCA method that combines cellular experiments, image processing, and deep learning modeling. MID combines a convolutional neural network and a Transformer to encode high-content images, filtering out noisy signals and characterizing cell phenotypes with high precision. In comparative tests on drug-induced cardiotoxicity and mitochondrial toxicity classification, as well as compound classification, MID outperformed both DeepProfiler and CellProfiler, two highly recognized methods in HCA. We believe that our results demonstrate the utility and versatility of MID and anticipate its widespread adoption in HCA for advancing drug development and disease research.