The advent of RNA-seq technologies has shifted the paradigm of genetic analysis from a genome-based to a transcriptome-based perspective. Alternative splicing generates functional diversity in genes, but the precise functions of many individual isoforms are yet to be elucidated. Gene Ontology was developed to annotate gene products according to their biological processes, molecular functions and cellular components. Although a single gene may have several gene products, most annotations are not isoform-specific and do not distinguish the functions of the different proteins originating from a single gene. Several approaches have tried to automatically annotate ontologies at the isoform level, but this has proven to be a daunting task. We have developed ISOGO (ISOform + GO function imputation), a novel algorithm to predict the function of coding isoforms based on their protein domains and the correlation of their expression across 11,373 cancer patients. Combining these two sources of information outperforms previous approaches: it provides an area under the precision-recall curve (AUPRC) five times larger than previous attempts, and the median AUROC of the functions assigned to genes is 0.82. We tested ISOGO predictions on some genes with isoform-specific functions (BRCA1, MADD, VAMP7 and ITSN1) and they were coherent with the literature. In addition, we examined whether the main isoform of each gene, as predicted by APPRIS, was the most likely to have the annotated gene functions, and this was the case for 99.4% of the genes. We also evaluated the predictions for isoform-specific functions provided by the CAFA3 challenge, and the results were also convincing. To make these results available to the scientific community, we have deployed a web application to consult ISOGO predictions (https://biotecnun.unav.es/app/isogo). Initial data, website link, isoform-specific GO function predictions and R code are available at https://gitlab.com/icassol/isogo.
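The abstract reports performance as AUPRC and AUROC. As a minimal sketch of what these two summaries measure, the following pure-Python functions compute average precision (one common estimator of AUPRC) and a rank-based AUROC for a single GO term. The function names, the toy labels and the toy scores are illustrative assumptions; this is not the ISOGO evaluation code, which the authors distribute in R.

```python
def average_precision(labels, scores):
    """Average precision: mean of precision@k over the ranks k of the positives.

    Assumption: used here as an estimator of AUPRC. Tied scores keep their
    input order, so results with ties depend on that order.
    """
    n_pos = sum(1 for label in labels if label)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos


def auroc(labels, scores):
    """AUROC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, label in zip(scores, labels) if label]
    neg = [s for s, label in zip(scores, labels) if not label]
    if not pos or not neg:
        raise ValueError("both positive and negative labels are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = isoform annotated with the GO term, 0 = not annotated.
example_labels = [1, 0, 1, 0, 0]
example_scores = [0.9, 0.8, 0.7, 0.3, 0.1]

print(f"average precision = {average_precision(example_labels, example_scores):.3f}")
print(f"AUROC             = {auroc(example_labels, example_scores):.3f}")
```

Because AUPRC depends on the fraction of positives while AUROC does not, the two numbers are usually read together when annotated terms are rare.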
Alternative splicing (AS) is a genetic process by which a single pre-mRNA can give rise to different mature mRNAs (called isoforms or transcripts) by including or excluding exons and introns 1-4. It is estimated that genes have on average 7 transcripts, that across the whole transcriptome there are more than 100,000 AS events 5,6 and that over 90% of human genes contain one or more isoforms 7-10. From a functional point of view, AS is an intriguing process. Some studies show that a large number of sporadic splicing events produce alternative isoforms that are lowly expressed and thus may be non-functional noise in the transcription process 11-13. On the other hand, other studies show and experimentally validate that different isoforms originating from alternative splicing may have distinct or even opposite functions 14,15. It is known that AS can cause cellular abnormalities that lead to diverse genetic diseases. All the hallmarks of cancer have their counterpart in AS 16-18. For example, BRCA1 is a tumor suppressor gene related to breast cancer susceptibility. Its isoform originated from skipping exon 11 (that includes a RAD51 inte...
Background: Several studies have documented the significant impact of methodological choices in microbiome analyses. The myriad of methodological options available complicates the replication of results and generally limits the comparability of findings between independent studies that use differing techniques and measurement pipelines. Here we describe the Mosaic Standards Challenge (MSC), an international interlaboratory study designed to assess the impact of methodological variables on the results. The MSC did not prescribe methods but rather asked participating labs to analyze 7 shared reference samples (5 human stool samples and 2 mock communities) using their standard laboratory methods. To capture the array of methodological variables, each participating lab completed a metadata reporting sheet that included 100 different questions regarding the details of their protocol. The goal of this study was to survey the methodological landscape for microbiome metagenomic sequencing (MGS) analyses and the impact of methodological decisions on metagenomic sequencing results. Results: A total of 44 labs participated in the MSC by submitting results (16S or WGS) along with accompanying metadata; thirty 16S rRNA gene amplicon datasets and 14 WGS datasets were collected. The inclusion of two types of reference materials (human stool and mock communities) enabled analysis of both MGS measurement variability between different protocols, using the biologically relevant stool samples, and MGS bias with respect to ground truth values, using the DNA mixtures. Owing to the compositional nature of MGS measurements, analyses were conducted on the Firmicutes:Bacteroidetes ratio, allowing us to directly apply common statistical methods.
The resulting analysis demonstrated that protocol choices have significant effects, including both bias of the MGS measurement associated with particular methodological choices and effects on measurement robustness, as observed through the spread of results between labs making similar methodological choices. In the analysis of the DNA mock communities, MGS measurement bias was observed even when there was general consensus among the participating laboratories. Conclusion: This study was the result of a collaborative effort that included academic, commercial, and government labs. In addition to highlighting the impact of different methodological decisions on MGS result comparability, this work also provides insights for consideration in future microbiome measurement study design.
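The choice to analyze a ratio rather than raw abundances can be illustrated with a short sketch. Because sequencing counts are compositional, only relative information is meaningful, and the log of a ratio between two taxa is unaffected by sequencing depth. The helper name `log_fb_ratio`, the lab names and all counts below are hypothetical; this is not the MSC analysis pipeline.

```python
import math
from statistics import mean, stdev


def log_fb_ratio(counts):
    """Return log10(Firmicutes / Bacteroidetes) for one sample's phylum counts.

    Assumption: both phyla were observed; zero counts are rejected rather than
    patched with a pseudocount, which would break depth invariance.
    """
    firmicutes = counts.get("Firmicutes", 0)
    bacteroidetes = counts.get("Bacteroidetes", 0)
    if firmicutes <= 0 or bacteroidetes <= 0:
        raise ValueError("both phyla need non-zero counts")
    return math.log10(firmicutes / bacteroidetes)


# Hypothetical phylum-level counts reported by three labs for the same sample.
lab_results = {
    "lab_A": {"Firmicutes": 6000, "Bacteroidetes": 3000, "Proteobacteria": 1000},
    "lab_B": {"Firmicutes": 600, "Bacteroidetes": 300, "Proteobacteria": 50},
    "lab_C": {"Firmicutes": 2000, "Bacteroidetes": 8000, "Proteobacteria": 500},
}

log_ratios = {lab: log_fb_ratio(counts) for lab, counts in lab_results.items()}

# lab_A and lab_B differ tenfold in depth but agree on the ratio;
# lab_C disagrees on the ratio itself.
for lab, value in log_ratios.items():
    print(f"{lab}: log10(F:B) = {value:+.3f}")

# Between-lab centre and spread on the log scale.
print(f"mean = {mean(log_ratios.values()):+.3f}, sd = {stdev(log_ratios.values()):.3f}")
```

Working on the log scale makes the ratio symmetric (F:B of 2 and 1:2 are equally far from zero), which is one reason standard statistics such as means and standard deviations become applicable.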
Studies of microbial communities vary widely in terms of analysis methods. In this exponentially growing field, the wide variety of diversity measures and lack of consistency make it harder to compare different studies. Most existing alpha-diversity metrics are inherited from other disciplines and their assumptions are not always directly meaningful or true for microbiome data. Many existing microbiome studies apply one or a few alpha-diversity metrics without a stated rationale and with an unclear interpretation of the results. This work focuses on a theoretical, empirical, and comparative analysis of 19 frequently and less-frequently used microbial alpha-diversity metrics grouped into 4 proposed categories, including key features of every analyzed metric with their mathematical assumptions, in order to provide a deeper understanding of the existing metrics and a practical implementation guide for future studies.
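To make concrete why different alpha-diversity metrics can lead to different conclusions, the sketch below implements four textbook metrics in pure Python: observed richness, Shannon entropy, the Gini-Simpson index and Pielou evenness. These are chosen as familiar examples; the abstract does not list the 19 metrics or the 4 categories, so no correspondence with the paper's selection is implied, and the two communities are hypothetical.

```python
import math


def _proportions(counts):
    """Convert raw taxon counts to relative abundances, dropping zeros."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one observation")
    return [c / total for c in counts if c > 0]


def richness(counts):
    """Observed richness: number of taxa with a non-zero count."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon entropy H' = -sum(p_i * ln p_i), using the natural logarithm."""
    return -sum(p * math.log(p) for p in _proportions(counts))


def gini_simpson(counts):
    """Gini-Simpson index 1 - sum(p_i^2): chance two random reads differ in taxon."""
    return 1.0 - sum(p * p for p in _proportions(counts))


def pielou_evenness(counts):
    """Pielou evenness J = H' / ln(S); defined as 0 when only one taxon is present."""
    s = richness(counts)
    return shannon(counts) / math.log(s) if s > 1 else 0.0


# Two hypothetical communities with the same richness but different structure.
even_community = [25, 25, 25, 25]
skewed_community = [97, 1, 1, 1]

for name, community in [("even", even_community), ("skewed", skewed_community)]:
    print(
        f"{name:>6}: richness={richness(community)} "
        f"shannon={shannon(community):.3f} "
        f"gini_simpson={gini_simpson(community):.3f} "
        f"evenness={pielou_evenness(community):.3f}"
    )
```

Both communities have a richness of 4, yet the abundance-weighted metrics separate them clearly, which shows how a study reporting only one metric can miss part of the picture.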
design of the application. The most relevant design improvements were obtained with algorithm abstraction by applying the strategy pattern, relocation of attributes/methods, generalization of variable types, and removal/renaming of methods/attributes. Besides a methodology and the supporting tool, we provide 14 case studies based on real projects implemented in C, and we show how the results validate our proposal.
KEYWORDS: design recovery, legacy software, object-oriented paradigm, procedural language, reengineering, refactoring, reverse engineering
In the 1970s and 1980s, companies invested in building software systems that improved and satisfied their commercial processes (shopping, sales, accounting, etc.) and their market needs. Most systems were written in procedural languages, such as Cobol, C, Fortran, Pascal, and Clipper. Even though these languages were useful in those years for the implemented domains, the procedural paradigm used in legacy systems has been replaced mainly by the object-oriented (OO) paradigm because the latter offers useful mechanisms to obtain better designs, such as abstraction, polymorphism, extensibility, and reuse.
The applications implemented using procedural languages are still working, but they are already legacy code. Their main problems are that they were mostly implemented in an ad hoc way, without following an analysis- or design-based methodology, or with the functional decomposition principle, which leads to a more unstable design than the OO one. Usually, the provided documentation is not enough, and stakeholders and original developers have left the software development group, taking away implicit design knowledge that is not coded in the application itself. In addition, hardware and operating systems evolve, and some versions of compilers and executable code cannot continue working in the new environments.
Due to all these problems, the uncontrolled evolution of the existing software and the new requirements (that are eventually implemented in an ad hoc way) have generated legacy code whose structure and functionality are difficult to deal with 1. Thus, these software systems end up as black boxes, and they can represent a technical debt for the owner and the developers themselves. Although the solution for all these problems could be the replacement or the migration of the complete systems, this action is not possible because they are currently in use in the companies, and they represent their actual economic capital.
Current developers use the OO paradigm to implement business applications and semiautomatic methodologies/tools to analyze and refactor them. Most of these methodologies mainly apply reverse engineering techniques to perform any maintenance task. However, these solutions are not available for procedural applications, and thus, we need to provide reconstruction methodologies that help current developers to understand how the systems are designed. These reconstruction methodologies should generate to the developers' different ab...