Developing new software tools for analysis of large-scale biological data is a key component of advancing modern biomedical research. Scientific reproduction of published findings requires running computational tools on data generated by such studies, yet little attention is presently allocated to the installability and archival stability of computational software tools. Scientific journals require data and code sharing, but none currently require authors to guarantee the continuing functionality of newly published tools. We have estimated the archival stability of computational biology software tools by performing an empirical analysis of the internet presence for 36,702 omics software resources published from 2005 to 2017. We found that almost 28% of all resources are currently not accessible through uniform resource locators (URLs) published in the paper they first appeared in. Among the 98 software tools selected for our installability test, 51% were deemed “easy to install,” and 28% of the tools failed to be installed at all because of problems in the implementation. Moreover, for papers introducing new software, we found that the number of citations significantly increased when authors provided an easy installation process. We propose for incorporation into journal policy several practical solutions for increasing the widespread installability and archival stability of published bioinformatics software.
RNA hybridization‐based spatial transcriptomics provides unparalleled detection sensitivity. However, inaccuracies in segmentation of image volumes into cells cause misassignment of mRNAs which is a major source of errors. Here, we develop JSTA, a computational framework for joint cell segmentation and cell type annotation that utilizes prior knowledge of cell type‐specific gene expression. Simulation results show that leveraging existing cell type taxonomy increases RNA assignment accuracy by more than 45%. Using JSTA, we were able to classify cells in the mouse hippocampus into 133 (sub)types revealing the spatial organization of CA1, CA3, and Sst neuron subtypes. Analysis of within cell subtype spatial differential gene expression of 80 candidate genes identified 63 with statistically significant spatial differential gene expression across 61 (sub)types. Overall, our work demonstrates that known cell type expression patterns can be leveraged to improve the accuracy of RNA hybridization‐based spatial transcriptomics while providing highly granular cell (sub)type information. The large number of newly discovered spatial gene expression patterns substantiates the need for accurate spatial transcriptomic measurements that can provide information beyond cell (sub)type labels.
Background: Recent advancements in next-generation sequencing have rapidly improved our ability to study genomic material at an unprecedented scale. Despite substantial improvements in sequencing technologies, errors present in the data still risk confounding downstream analysis and limiting the applicability of sequencing technologies in clinical tools. Computational error correction promises to eliminate sequencing errors, but the relative accuracy of error correction algorithms remains unknown. Results: In this paper, we evaluate the ability of error correction algorithms to fix errors across different types of datasets that contain various levels of heterogeneity. We highlight the advantages and limitations of computational error correction techniques across different domains of biology, including immunogenomics and virology. To demonstrate the efficacy of our technique, we apply the UMI-based high-fidelity sequencing protocol to eliminate sequencing errors from both simulated data and the raw reads. We then perform a realistic evaluation of error-correction methods. Conclusions: In terms of accuracy, we find that method performance varies substantially across different types of datasets with no single method performing best on all types of examined data. Finally, we also identify the techniques that offer a good balance between precision and sensitivity.
Stellate ganglion neurons, important mediators of cardiopulmonary neurotransmission, are surrounded by satellite glial cells (SGCs), which are essential for the function, maintenance, and development of neurons. However, it remains unknown whether SGCs in adult sympathetic ganglia exhibit any functional diversity, and what role this plays in modulating neurotransmission. We performed single‐cell RNA sequencing of mouse stellate ganglia (n = 8 animals), focusing on SGCs (n = 11,595 cells). SGCs were identified by high expression of glial‐specific transcripts, S100b and Fabp7. Microglia and Schwann cells were identified by expression of C1qa/C1qb/C1qc and Ncmap/Drp2, respectively, and excluded from further analysis. Dimensionality reduction and clustering of SGCs revealed six distinct transcriptomic subtypes, one of which was characterized the expression of pro‐inflammatory markers and excluded from further analyses. The transcriptomic profiles and corresponding biochemical pathways of the remaining subtypes were analyzed and compared with published astrocytic transcriptomes. This revealed gradual shifts of developmental and functional pathways across the subtypes, originating from an immature and pluripotent subpopulation into two mature populations of SGCs, characterized by upregulated functional pathways such as cholesterol metabolism. As SGCs aged, these functional pathways were downregulated while genes and pathways associated with cellular stress responses were upregulated. These findings were confirmed and furthered by an unbiased pseudo‐time analysis, which revealed two distinct trajectories involving the five subtypes that were studied. These findings demonstrate that SGCs in mouse stellate ganglia exhibit transcriptomic heterogeneity along maturation or differentiation axes. These subpopulations and their unique biochemical properties suggest dynamic physiological adaptations that modulate neuronal function.
Developing new software tools for analysis of large-scale biological data is a key component of advancing modern biomedical research. Scientific reproduction of published findings requires running computational tools on data generated by such studies, yet little attention is presently allocated to the installability and archival stability of computational software tools. Scientific journals require data and code sharing, but none currently require authors to guarantee the continuing functionality of newly published tools. We have estimated the archival stability of computational biology software tools by performing an empirical analysis of the internet presence for 36,702 omics software resources published from 2005 to 2017. We found that almost 28% of all resources are currently not accessible through URLs published in the paper they first appeared in. Among the 98 software tools selected for our installability test, 51%were deemed "easy to install ," and 28% of the tools failed to be installed at all due to problems in the implementation. Moreover, for papers introducing new software, we found that the number of citations significantly increased when authors provided an easy installation process.We propose for incorporation into journal policy several practical solutions for increasing the widespread installability and archival stability of published bioinformatics software.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.