2023
DOI: 10.1128/spectrum.04085-22

Predicting Antigenic Distance from Genetic Data for PRRSV-Type 1: Applications of Machine Learning

Abstract: Understanding cross-protection between cocirculating PRRSV1 strains is crucial to reducing losses associated with PRRS outbreaks on farms. While experimental studies to determine cross-protection are instrumental, these in vivo studies are not always practical or timely for the many cocirculating and emerging PRRSV strains. In this study, we demonstrate the ability to rapidly estimate potential immunologic cross-reaction between different PRRSV1 strains in silico …

Cited by 4 publications
(3 citation statements)
References 89 publications
“…While the ORF5 gene is immunologically important, other parts of the genome contribute to the antigenicity and virulence (20, 44, 45). Whole genome data would be needed to understand the interplay between genotype and phenotype, and observed clinical manifestations are also influenced by external factors (e.g., co-infections); science has not yet progressed to the point that we can predict phenotype from whole genomes for PRRSV (22). That being said, sequencing conducted by animal health professionals is often conducted for epidemiological monitoring purposes, and this informed the level of granularity that we tried to achieve in this analysis.…”
Section: Discussion | Type: mentioning
confidence: 99%
“…Within the ∼15 kb PRRSV-2 genome, open reading frame 5 (ORF5) encodes for a major envelope protein (glycoprotein 5 - GP5), which is involved in inducing virus neutralizing antibodies and cross-protection among PRRSV variants (18-20). Even though ORF5 accounts for only 4% of the genome, its genetic variability and apparent immunologic importance (3, 20-22) has made this gene the target of nearly all genetic sequencing conducted by the swine industry, with thousands of sequences generated per year in the U.S. alone (4). Stakeholder preference for ORF5 rather than whole genome sequencing also relates to lower cost, rapid turnaround time, and the higher probability of successfully obtaining a sequence from samples of various types and quality.…”
Section: Introduction | Type: mentioning
confidence: 99%
“…These features were VP1 amino acid distance and site-wise differences in amino acid positions (4, 13, 24, 32, 33, 43, 48, 49, 57, 69, 97, 100, 124, 135, 139-141, 143, 145, 149-151, 154, 156, 159, 166, 173, 175, 195, 198, 199, 210, 213, 214). Additionally, from the random forest model, model features were ranked in their importance to the performance of our model using mean decrease in model accuracy when a feature's data were randomized relative to the outcome (the relative prediction strength of a variable) and improvement in Gini index (measure of node impurity associated with a variable) when data were split on a variable (37, 38). This allowed us to highlight highly ranked amino acids in our VP1 data, and considered the relative importance of different amino acid sites based on their role in improving model accuracy and node purity in outcome classification (15, 39).…”
Section: Training and Testing Machine Learning Models | Type: mentioning
confidence: 99%
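
The feature construction and the two importance measures named in the statement above can be illustrated with a small sketch. This is not the citing authors' code: the function names, the toy sequences, and the toy data are all hypothetical, and a simple `predict` callable stands in for a fitted random forest. A real random forest typically computes mean decrease in accuracy on out-of-bag samples and sums Gini improvements over every split in every tree; the sketch only shows the underlying quantities on a single binary split.

```python
import random
from collections import Counter


def p_distance(seq_a, seq_b):
    """Proportion of aligned amino acid positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def sitewise_differences(seq_a, seq_b, sites):
    """For each 1-based site, 1 if the two sequences differ there, else 0."""
    return [int(seq_a[s - 1] != seq_b[s - 1]) for s in sites]


def gini(labels):
    """Gini impurity of a collection of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_improvement(rows, labels, feature):
    """Decrease in Gini impurity when the data are split on one binary feature."""
    left = [y for x, y in zip(rows, labels) if x[feature] == 0]
    right = [y for x, y in zip(rows, labels) if x[feature] == 1]
    weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
    return gini(labels) - weighted


def accuracy(predict, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(predict(x) == y for x, y in zip(rows, labels)) / len(labels)


def mean_decrease_accuracy(predict, rows, labels, feature, n_repeats=20, seed=0):
    """Average drop in accuracy after shuffling one feature's column."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    drops = []
    for _ in range(n_repeats):
        column = [x[feature] for x in rows]
        rng.shuffle(column)
        shuffled = [x[:feature] + [v] + x[feature + 1:] for x, v in zip(rows, column)]
        drops.append(baseline - accuracy(predict, shuffled, labels))
    return sum(drops) / n_repeats


# Toy data (hypothetical): feature 0 determines the label, feature 1 is noise.
rows = [[0, 0], [0, 1], [0, 0], [0, 1], [1, 0], [1, 1], [1, 0], [1, 1]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]


def predict(x):
    """Stand-in for a fitted classifier; it reads only feature 0."""
    return x[0]


for f in (0, 1):
    print(f, gini_improvement(rows, labels, f),
          mean_decrease_accuracy(predict, rows, labels, f))
```

In this toy setup the informative feature scores higher on both measures, while the noise feature scores zero on both, which is the ranking behaviour the statement relies on when it highlights highly ranked amino acid sites.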