Summary
Mistakes in linking a patient’s biological samples with their phenotype data can confound RNA-Seq studies. The current method for avoiding such sample mix-ups is to test for inconsistencies between biological data and known phenotype data such as sex. In DNA studies, however, a common QC step is to check for unexpected relatedness between samples. Here, we extend this method to RNA-Seq, which allows duplicated samples to be detected without relying on inconsistencies with phenotype data.
Results
We present RNASeq_similarity_matrix: an automated tool to generate a sequence similarity matrix from RNA-Seq data, which can be used to visually identify sample mix-ups. This is particularly useful when a study contains multiple samples from the same individual, but can also detect contamination in studies with only one sample per individual.
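As an illustration of the idea, the sketch below builds a pairwise similarity matrix from per-sample genotype calls and flags pairs that look like the same individual. It is a minimal sketch under stated assumptions, not the tool's actual implementation: the genotype encoding (alternate-allele counts, `None` where a site has no coverage), the function names, the sample names, and the 0.9 threshold are all hypothetical.

```python
from itertools import combinations

def genotype_similarity(a, b):
    """Fraction of sites called in both samples where the genotypes agree."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("nan")  # no overlapping coverage, similarity undefined
    return sum(x == y for x, y in shared) / len(shared)

def similarity_matrix(calls):
    """Return sorted sample names and a symmetric nested dict of similarities."""
    names = sorted(calls)
    matrix = {n: {m: (1.0 if n == m else None) for m in names} for n in names}
    for n, m in combinations(names, 2):
        s = genotype_similarity(calls[n], calls[m])
        matrix[n][m] = matrix[m][n] = s
    return names, matrix

def flag_pairs(names, matrix, threshold=0.9):
    """List sample pairs whose similarity meets the (hypothetical) threshold."""
    return [(n, m) for n, m in combinations(names, 2) if matrix[n][m] >= threshold]

# Toy data (hypothetical): alternate-allele counts at 8 variant sites.
calls = {
    "patientA_t1": [0, 1, 2, 1, 0, None, 2, 1],
    "patientA_t2": [0, 1, 2, 1, 0, 2, 2, 1],
    "patientB_t1": [1, 0, 2, 0, 1, 2, 0, 1],
}

names, matrix = similarity_matrix(calls)
flagged = flag_pairs(names, matrix)

for n in names:
    print(n, " ".join(f"{matrix[n][m]:.2f}" for m in names))
print("pairs above threshold:", flagged)
```

In this toy data the two samples labelled as patient A are flagged as a pair, which is expected. A flagged pair with different patient labels, or an unflagged pair with the same label, would point to a possible mix-up.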
Availability and implementation
RNASeq_similarity_matrix is available as a documented, GPL-licensed Docker image at www.github.com/nicokist/RNASeq_similarity_matrix.
Pathogen-driven selection and past interbreeding with archaic human lineages have resulted in differences in HLA-allele frequencies between modern human populations. Whether or not this variation affects pathogen subtype diversification is unknown. Here we show a strong positive correlation between ethnic diversity in African countries and both HIV-1 p24gag and subtype diversity. We demonstrate that ethnic HLA-allele differences between populations have influenced HIV-1 subtype diversification as the virus adapted to escape common antiviral immune responses. The evolution of HIV subtype B (HIV-B), which does not appear to be indigenous to Africa, is strongly affected by immune responses associated with Eurasian HLA variants acquired through adaptive introgression from Neanderthals and Denisovans. Furthermore, we show that the increasing and disproportionate number of HIV infections among African Americans in the United States drives HIV-B evolution towards an Africa-centric HIV-1 state. Similar adaptation of other pathogens to HLA variants common in affected populations is likely.
Cytotoxic T lymphocyte (CTL) responses against the HIV Gag protein are associated with lowering viremia; however, immune control is undermined by viral escape mutations. The rapid viral mutation rate is a key factor, but recombination may also contribute. We hypothesized that CTL responses drive the outgrowth of unique intra-patient HIV recombinants (URFs) and examined gag sequences from a Kenyan sex worker cohort. We determined whether patients with HLA variants associated with effective CTL responses (beneficial HLA variants) were more likely to carry URFs and, if so, examined whether they progressed more rapidly than patients with beneficial HLA variants who did not carry URFs. Women with beneficial HLA variants (12/52) were more likely to carry URFs than those without beneficial HLA variants (3/61) (p < 0.0055; odds ratio = 5.7). Beneficial HLA variants were primarily found in slow/standard progressors in the URF group, whereas they predominated in long-term non-progressors/survivors in the remaining cohort (p = 0.0377). The URFs may sometimes spread and become circulating recombinant forms (CRFs) of HIV, and local CRF fragments were over-represented in the URF sequences (p < 0.0001). Collectively, our results suggest that CTL responses associated with beneficial HLA variants likely drive the outgrowth of URFs that might reduce the positive effect of these CTL responses on disease progression.
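The odds ratio reported above can be approximated from the counts given. The sketch below assumes that "12/52" and "3/61" mean 12 URF carriers among 52 women and 3 among 61, which the abstract does not state explicitly; the function and parameter names are hypothetical. Under that reading the simple sample odds ratio comes out near 5.8, close to the reported 5.7; the abstract does not say which estimator or test was used, so a small difference is not surprising.

```python
def odds_ratio(cases_exposed, total_exposed, cases_unexposed, total_unexposed):
    """Sample odds ratio from a 2x2 table given as cases and group totals."""
    a = cases_exposed                      # URF carriers, beneficial HLA
    b = total_exposed - cases_exposed      # non-carriers, beneficial HLA
    c = cases_unexposed                    # URF carriers, no beneficial HLA
    d = total_unexposed - cases_unexposed  # non-carriers, no beneficial HLA
    return (a * d) / (b * c)

# Assumed reading of the abstract: 12 of 52 versus 3 of 61.
sample_or = odds_ratio(12, 52, 3, 61)
print(f"sample odds ratio: {sample_or:.1f}")
```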