Human endogenous retrovirus (HERV-K (HML-2)) proviruses are among the few endogenous retroviral elements in the human genome that retain coding sequence. HML-2 expression has been widely associated with human disease states, including different types of cancer and HIV-1 infection. Understanding the potential impact of this expression requires that it be annotated at the proviral level. Here, we utilized the high-throughput capabilities of next-generation sequencing to profile HML-2 expression at the level of individual proviruses and secreted virions in the teratocarcinoma cell line Tera-1. We identified well-defined expression patterns, with transcripts emanating primarily from two proviruses located on chromosome 22, only one of which was efficiently packaged. Interestingly, there was a preference for transcripts of recently integrated proviruses, over those from other highly expressed but older elements, to be packaged into virions. We also assessed the promoter competence of the 5’ long terminal repeats (LTRs) of expressed proviruses via a luciferase assay following transfection of Tera-1 cells. Consistent with the RNA-seq results, we found that the activity of most LTRs corresponded to their transcript levels.
Postinfectious hydrocephalus (PIH), which often follows neonatal sepsis, is the most common cause of pediatric hydrocephalus worldwide, yet the microbial pathogens underlying this disease remain to be elucidated. Characterization of the microbial agents causing PIH would enable a shift from surgical palliation of cerebrospinal fluid (CSF) accumulation to prevention of the disease. Here, we examined blood and CSF samples collected from 100 consecutive infant cases of PIH and control cases comprising infants with non-postinfectious hydrocephalus in Uganda. Genomic sequencing of samples was undertaken to test for bacterial, fungal, and parasitic DNA; DNA and RNA sequencing was used to identify viruses; and bacterial culture recovery was used to identify potential causative organisms. We found that infection with the bacterium Paenibacillus, together with frequent cytomegalovirus (CMV) coinfection, was associated with PIH in our infant cohort. Assembly of the genome of a facultative anaerobic bacterial isolate recovered from cultures of CSF samples from PIH cases identified a strain of Paenibacillus thiaminolyticus. This strain, designated Mbale, was lethal when injected into mice in contrast to the benign reference Paenibacillus strain. These findings show that an unbiased pan-microbial approach enabled characterization of Paenibacillus in CSF samples from PIH cases, and point toward a pathway of more optimal treatment and prevention for PIH and other proximate neonatal infections.
Human endogenous retrovirus (HERV) transcripts are known to be highly expressed in cancers, yet their activity in nondiseased tissue is largely unknown. Using the GTEx RNA-seq dataset from normal tissue sampled at autopsy, we characterized individual expression of the recent HERV-K (HML-2) provirus group across 13,000 different samples of 54 different tissues from 948 individuals. HML-2 transcripts could be identified in every tissue sampled and were elevated in the cerebellum, pituitary, testis, and thyroid. A total of 37 different individual proviruses were expressed in 1 or more tissues, representing all 3 LTR5 subgroups. Nine proviruses were identified as having long terminal repeat (LTR)-driven transcription, 7 of which belonged to the most recent LTR5HS subgroup. Proviruses of different subgroups displayed a bias in tissue expression, which may be associated with differences in transcription factor binding sites in their LTRs. Provirus expression was greater in evolutionarily older proviruses with an earliest shared ancestor of gorilla or older. HML-2 expression was significantly affected by biological sex in 1 tissue, while age and timing of death (Hardy score) had little effect. Proviruses containing intact gag, pro, and env open reading frames (ORFs) were expressed in the dataset, with almost every tissue measured potentially expressing at least 1 intact ORF (gag).
A wealth of viral data sits untapped in publicly available metagenomic data sets when it might be extracted to create a usable index for the virological research community. We hypothesized that work of this complexity and scale could be done in a hackathon setting. Ten teams comprising over 40 participants from six countries assembled to create a crowd-sourced set of analysis and processing pipelines for a complex biological data set in a three-day event on the San Diego State University campus starting 9 January 2019. Prior to the hackathon, 141,676 metagenomic data sets from the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) were pre-assembled into contiguous assemblies (contigs) by NCBI staff. During the hackathon, a subset consisting of 2953 SRA data sets (approximately 55 million contigs) was selected and further filtered for a minimal contig length of 1 kb. This resulted in 4.2 million (Mio) contigs, which were aligned using BLAST against all known virus genomes, phylogenetically clustered, and assigned metadata. Out of the 4.2 Mio contigs, 360,000 contigs were labeled with domains and an additional subset containing 4400 contigs was screened for virus or virus-like genes. The work yielded valuable insights into both SRA data and the cloud infrastructure required to support such efforts, revealing analysis bottlenecks and possible workarounds. Mainly: (i) conservative assemblies of SRA data improve initial analysis steps; (ii) existing bioinformatic software with weak multithreading/multicore support can be elevated by wrapper scripts to use all cores within a computing node; (iii) redesigning existing bioinformatic algorithms for a cloud infrastructure facilitates their use by a wider audience; and (iv) a cloud infrastructure allows a diverse group of researchers to collaborate effectively. The scientific findings will be extended during a follow-up event.
Here, we present the applied workflows, initial results, and lessons learned from the hackathon.
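The length-filtering step mentioned in the abstract above (keeping only contigs of at least 1 kb) can be illustrated with a minimal sketch. It assumes the contigs are available as FASTA text; the function names `read_fasta` and `filter_by_length` are hypothetical and are not taken from the hackathon pipelines, whose actual tooling the abstract does not describe.

```python
from typing import Iterable, Iterator, Tuple

# 1 kb threshold as stated in the abstract; everything else here is an assumption.
MIN_CONTIG_LENGTH = 1000


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; collect them per record.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(
    records: Iterable[Tuple[str, str]],
    min_length: int = MIN_CONTIG_LENGTH,
) -> Iterator[Tuple[str, str]]:
    """Keep only records whose sequence has at least min_length bases."""
    for header, seq in records:
        if len(seq) >= min_length:
            yield header, seq
```

Because both functions are generators, a file handle can be passed straight to `read_fasta` and records are processed one at a time, which matters when the input holds millions of contigs.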
Postinfectious hydrocephalus (PIH), which often follows neonatal sepsis, is the most common cause of pediatric hydrocephalus worldwide, yet the microbial pathogens remain uncharacterized. Characterization of the microbial agents causing PIH would shift the emphasis from surgical palliation of cerebrospinal fluid (CSF) accumulation to prevention. We examined blood and CSF from 100 consecutive cases of PIH and control cases of non-postinfectious hydrocephalus (NPIH) in infants in Uganda. Genomic testing was undertaken for bacterial, fungal, and parasitic DNA; DNA and RNA sequencing was used for viral identification; and extensive bacterial culture recovery was performed. We uncovered a major contribution to PIH from Paenibacillus, upon a background of frequent cytomegalovirus (CMV) infection. CMV was found in CSF only in PIH cases. A facultatively anaerobic isolate was recovered, and assembly of its genome revealed a strain of P. thiaminolyticus. In mice, this isolate, designated strain Mbale, was lethal, in contrast with the benign reference strain. These findings point to the value of an unbiased pan-microbial approach to characterizing PIH in settings where the organisms remain unknown, and enable a pathway towards more optimal treatment and prevention of the proximate neonatal infections.