baerhunter: An R package for the discovery and analysis of expressed non-coding regions in bacterial RNA-seq data

Ozuna, Alina; Liberto, Doriano; Joyce, Rosanna; Arnvig, Kristine B.; Nobeli, Irene

doi:10.1101/612937

Cited by 4 publications

(8 citation statements)

References 17 publications

(7 reference statements)


“…Additionally, fine‐tuning cut‐off parameters to distinguish signal from noise is ultimately still up to the user. Somewhat surprisingly, the added complexity of such methods does not always translate into more accurate results: in limited comparisons between methods that use additional information and our own simpler, signal‐only based method, we found that our naïve approach performs comparatively well, most likely because more sophisticated methods often require more tuning of their parameters to take advantage of their added complexity (Ozuna et al., 2019). As the responsibility of parameter tuning is left up to the user, it is obvious that methods with fewer parameters, such as Rockhopper, baerhunter or APERO, may be less error‐prone and, ultimately, more appealing, especially to non‐computational users looking for quick and easy to implement solutions.…”

Section: Completing the Non‐coding Transcript Atlas: Computational Predictions From Genomic And Transcriptomic Data (mentioning)

confidence: 92%

“…Progress in the field, and an easy comparison between approaches, has been hindered by the fact that few of the labs publishing computational predictions have made their code readily available. In response to this challenge, several groups have created publicly available prediction programs or workflows such as Rockhopper (McClure et al., 2013), DETR’PROK (Toffano‐Nioche et al., 2013), ANNOgesic (Yu et al., 2018), APERO (Leonard et al., 2019), and baerhunter (Ozuna et al., 2019). Users of all of these transcriptomics‐based methods are required to set thresholds for separating background noise (whatever its origin) from signal in the data.…”

Section: Completing the Non‐coding Transcript Atlas: Computational Predictions From Genomic And Transcriptomic Data (mentioning)

confidence: 99%

“…To eliminate guesswork by the user to adjust for noise versus signal, the program normalizes for read counts using the upper quartile of non‐zero gene expression values and generates a transcriptional map of the predicted non‐coding elements. Baerhunter (Ozuna et al., 2019) and APERO (Leonard et al., 2019) are lighter tools to install, both written in R and requiring only the most commonly used BAM format alignment files and relevant reference annotations. Like Rockhopper, the output of baerhunter is a transcriptional map (in .gff format) and can consolidate annotations from multiple samples.…”

Section: Completing the Non‐coding Transcript Atlas: Computational Predictions From Genomic And Transcriptomic Data (mentioning)

confidence: 99%
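The upper-quartile normalisation mentioned in the statement above can be illustrated with a minimal Python sketch. It is not taken from any of the tools named here; the function name `upper_quartile_normalise` and the choice of inclusive quantile interpolation are assumptions made purely for illustration.

```python
from statistics import quantiles


def upper_quartile_normalise(counts):
    """Divide each count by the upper quartile (Q3) of the non-zero counts.

    Hypothetical helper: the interpolation method is an assumption,
    not the behaviour of any specific tool.
    """
    nonzero = [c for c in counts if c > 0]
    if len(nonzero) < 2:
        raise ValueError("need at least two non-zero counts")
    # quantiles(n=4) returns the three quartile cut points; index 2 is Q3.
    q3 = quantiles(nonzero, n=4, method="inclusive")[2]
    return [c / q3 for c in counts]


# Zero counts are ignored when computing Q3 but are still rescaled.
normalised = upper_quartile_normalise([0, 2, 4, 6, 8, 10])
```

Excluding zeros before taking the quartile keeps unexpressed features from dragging the scaling factor down.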


Challenges in defining the functional, non‐coding, expressed genome of members of the Mycobacterium tuberculosis complex

Stiens, Arnvig, Kendall, et al. 2021

Molecular Microbiology

Self Cite


A definitive transcriptome atlas for the non‐coding expressed elements of the members of the Mycobacterium tuberculosis complex (MTBC) does not exist. Incomplete lists of non‐coding transcripts can be obtained for some of the reference genomes (e.g., M. tuberculosis H37Rv) but to what extent these transcripts have homologues in closely related species or even strains is not clear. This has implications for the analysis of transcriptomic data; non‐coding parts of the transcriptome are often ignored in the absence of formal, reliable annotation. Here, we review the state of our knowledge of non‐coding RNAs in pathogenic mycobacteria, emphasizing the disparities in the information included in commonly used databases. We then proceed to review ways of combining computational solutions for predicting the non‐coding transcriptome with experiments that can help refine and confirm these predictions.



“…Each dataset was run through the R-package, baerhunter (Ozuna et al, 2019), using the ‘feature_file_editor’ function optimised to the most appropriate parameters for the sequencing depth (https://github.com/jenjane118/mtb_modules). ‘Count_features’ and ‘tpm_norm_flagging’ functions were used for transcript quantification and to identify low expression hits (less than or equal to 10 transcripts per million) in each dataset, which were subsequently eliminated.…”

Section: Methods (mentioning)

confidence: 99%
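The low-expression filter described in the statement above (features at or below 10 transcripts per million are eliminated) can be sketched in Python as follows. This is an illustrative sketch only, not baerhunter's R implementation: the function names `tpm` and `flag_low_expression` and the example counts and lengths are hypothetical; only the threshold of 10 TPM comes from the quoted text.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to transcripts per million (TPM)."""
    # Reads per base for each feature, which corrects for feature length.
    rates = [c / l for c, l in zip(counts, lengths_bp)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)
    # Rescale so that the values sum to one million.
    return [r / total * 1e6 for r in rates]


def flag_low_expression(tpm_values, threshold=10.0):
    """Return True for features at or below the TPM threshold."""
    return [v <= threshold for v in tpm_values]


# Hypothetical counts and feature lengths (in base pairs).
tpm_values = tpm([100, 200, 0], [1000, 1000, 500])
low = flag_low_expression(tpm_values)
kept = [v for v, is_low in zip(tpm_values, low) if not is_low]
```

Flagging and filtering are kept as separate steps so that the flagged features can be inspected before they are dropped.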

Using a Whole Genome Co-expression Network to Inform the Functional Characterisation of Predicted Genomic Elements from Mycobacterium tuberculosis Transcriptomic Data

Stiens, Tan, Joyce, et al. 2022

Preprint

Self Cite


A whole genome co-expression network was created using Mycobacterium tuberculosis transcriptomic data from publicly available RNA-sequencing experiments covering a wide variety of experimental conditions. The network includes expressed regions with no formal annotation, including putative short RNAs and untranslated regions of expressed transcripts, along with the protein-coding genes. These unannotated expressed transcripts were among the best-connected members of the module sub-networks, making up more than half of the "hub" elements in modules that include protein-coding genes known to be part of regulatory systems involved in stress response and host adaptation. This dataset provides a valuable resource for investigating the role of non-coding RNA, and conserved hypothetical proteins, in transcriptomic remodelling. Based on their connections to genes with known functional groupings and correlations with replicated host conditions, predicted expressed transcripts can be screened as suitable candidates for further experimental validation.


“…We used the Spearman method to determine the correlation between GPX1 and immune gene mark. Statistical analysis was performed with SPSS software 26.0 (SPSS Inc.), R software v3.6.3 ( http://www.r-project.org/ ) [ 27 ], and Prism 8 (GraphPad Software, Inc). Data were considered significant at P <0.05.…”

Section: Methods (mentioning)

confidence: 99%

The Prognostic Role of Glutathione Peroxidase 1 and Immune Infiltrates in Glioma Investigated Using Public Datasets

Luo, Huang, et al. 2020

Med Sci Monit

