Summary We developed BIODICA, an integrated computational environment for applying Independent Component Analysis (ICA) to bulk and single-cell molecular profiles, interpreting the results in terms of biological functions, and correlating them with metadata. The computational core is the novel Python package stabilized-ica, which provides an interface to several ICA algorithms, a stabilization procedure, and meta-analysis and component interpretation tools. BIODICA is equipped with a user-friendly graphical user interface, allowing inexperienced users to perform ICA-based omics data analysis. The results are presented interactively, facilitating communication with biology experts. Availability and Implementation BIODICA is implemented in Java, Python and JavaScript. The source code is freely available on GitHub under the MIT and the GNU LGPL licenses. BIODICA is supported on all major operating systems. URL https://sysbio-curie.github.io/biodica-environment/
Background Ventricular tachycardia (VT) is a major cause of sudden cardiac death (SCD). Clinical investigations sometimes fail to identify the underlying cause of VT, in which case the event is classified as idiopathic (iVT). VT contributes significantly to morbidity and mortality in patients with coronary artery disease (CAD) and dilated cardiomyopathy (DCM). Since mutations in arrhythmia-associated genes frequently determine arrhythmia susceptibility, screening for disease-predisposing variants could improve VT diagnostics and prevent SCD in patients. Methods Ninety-two patients diagnosed with coronary heart disease (CHD), DCM, or iVT were included in our study. We evaluated genetic profiles and variants in known cardiac risk genes by targeted next-generation sequencing (NGS) using a newly designed custom panel of 96 genes. We hypothesized that shared morphological and phenotypic features among these subgroups may have an overlapping molecular basis. To our knowledge, this was the first deep-sequencing study of 96 targeted cardiac genes in Kazakhstan. The clinical significance of the sequence variants was interpreted according to the guidelines developed by the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) in 2015. The ClinVar and Varsome databases were used to determine the variant classifications. Results Targeted sequencing and stepwise filtering of the annotated variants identified 307 unique variants in 74 genes, with 456 variants in total across the overall study group. We found 168 mutations listed in the Human Genome Mutation Database (HGMD) and another 256 rare/unique variants with elevated pathogenic potential. There was a predominance of high- to intermediate-pathogenicity variants in LAMA2, MYBPC3, MYH6, KCNQ1, GAA, and DSG2 in CHD VT patients. Similar frequencies were observed in DCM VT and iVT patients, pointing to a common molecular disease association. TTN, GAA, LAMA2, and MYBPC3 contained the most variants in the three subgroups, which confirms the impact of these genes on the complex pathogenesis of cardiomyopathies and VT. Classification of the 307 variants according to ACMG guidelines showed that nine (2.9%) were pathogenic, nine (2.9%) were likely pathogenic, 98 (31.9%) were of uncertain significance, 73 (23.8%) were likely benign, and 118 (38.4%) were benign. CHD VT patients carry rare genetic variants with increased pathogenic potential at a frequency comparable to that of DCM VT and iVT patients in genes related to sarcomere function, nuclear function, ion flux, and metabolism. Conclusions In this study we showed that patients with VT secondary to coronary artery disease, DCM, or idiopathic etiology carry multiple rare mutations and clinically significant sequence variants in classic cardiac risk genes associated with cardiac channelopathies and cardiomyopathies, in a similar pattern and at a comparable frequency.
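The ACMG class percentages reported above can be re-derived from the counts. The following is a minimal arithmetic check, not part of the study's pipeline; the dictionary and function names are illustrative assumptions.

```python
# Counts of the 307 unique variants per ACMG class, as reported in the abstract.
acmg_counts = {
    "pathogenic": 9,
    "likely pathogenic": 9,
    "uncertain significance": 98,
    "likely benign": 73,
    "benign": 118,
}


def class_shares(counts):
    """Return each class's share of the total, in percent, rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


total_variants = sum(acmg_counts.values())
shares = class_shares(acmg_counts)

for name, pct in shares.items():
    print(f"{name}: {acmg_counts[name]} of {total_variants} ({pct}%)")
```

The five counts sum to 307, and the rounded shares match the percentages given in the abstract.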
Independent Component Analysis (ICA) is a matrix factorization method for data dimension reduction. ICA has been widely applied to transcriptomic data for blind separation of the biological, environmental, and technical factors affecting gene expression. This study aimed to analyze publicly available esophageal cancer (EC) data using ICA to identify and comprehensively analyze the reproducible signaling pathways and molecular signatures involved in this cancer type. Four independent esophageal cancer transcriptomic datasets from the GEO database were used. The bioinformatics tool "BiODICA - Independent Component Analysis of Big Omics Data" was applied to compute independent components (ICs). Gene Set Enrichment Analysis (GSEA) and ToppGene uncovered the most significantly enriched pathways. Gene networks and graphs were constructed and visualized using Cytoscape and the HPRD database. A correlation graph between the decompositions into 30 ICs was built, retaining absolute correlation values exceeding 0.3. Clusters of components (pseudocliques) were observed in the structure of the correlation graph. The top 1,000 most contributing genes of each IC in the pseudocliques were mapped to the PPI network to construct the associated signaling pathways. Some cliques were composed of densely interconnected nodes and included components common to most cancer types (such as cell cycle and extracellular matrix signals), while others were specific to EC. The results of this investigation may reveal potential biomarkers of esophageal carcinogenesis and functional subsystems dysregulated in tumor cells, and may be helpful in predicting the early development of a tumor.
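The correlation-graph step described above can be sketched as follows. This is a toy illustration using only the Python standard library, not the BIODICA implementation: the component names, gene weights, and helper functions are hypothetical, and only the 0.3 threshold on absolute correlation comes from the abstract.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of gene weights."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_graph(components, threshold=0.3):
    """Return (component_a, component_b, r) edges where |r| exceeds the threshold."""
    edges = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(components.items(), 2):
        r = pearson(vec_a, vec_b)
        if abs(r) > threshold:
            edges.append((name_a, name_b, round(r, 3)))
    return edges


# Hypothetical gene weights over five shared genes for ICs from three datasets.
toy_components = {
    "A_IC1": [2.0, 1.5, 0.1, -0.2, 0.0],
    "B_IC3": [1.8, 1.7, 0.0, -0.1, 0.2],
    "C_IC7": [0.1, -0.1, 1.0, -1.0, 0.0],
}

edges = correlation_graph(toy_components)
print(edges)
```

In this toy input only A_IC1 and B_IC3 are linked, since they weight the same genes heavily. Groups of components that are densely linked in this way across datasets are what the abstract calls pseudocliques.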
Background High-throughput sequencing platforms generate massive amounts of high-dimensional genomic data that are available for analysis. Modern, user-friendly bioinformatics tools for the analysis and interpretation of genomics data have become essential when working with sequencing data. Different standard data types and file formats have been developed to store and analyze sequence and genomics data. Variant Call Format (VCF) is the most widespread genomics file type and the standard format for storing the genomic information and variants of sequenced samples. Results Existing tools for processing VCF files usually lack an intuitive graphical interface and offer only a command-line interface, which may be challenging to use for the broader biomedical community interested in genomics data analysis. re-Searcher solves this problem by pre-processing VCF files in chunks so as not to overload the computer's RAM. The tool can be used as a standalone, user-friendly, multiplatform GUI application as well as a web application (https://nla-lbsb.nu.edu.kz). The software, including source code, tested VCF files, and additional information, is publicly available in the GitHub repository (https://github.com/LabBandSB/re-Searcher).
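The idea of processing a VCF file in chunks, so that only a bounded number of records sits in memory at once, can be sketched as follows. This is a generic illustration and not the re-Searcher code: the function name `iter_vcf_chunks`, the chunk size, and the toy records are assumptions.

```python
import io
from itertools import islice


def iter_vcf_chunks(handle, chunk_size=1000):
    """Yield lists of tab-split VCF records, at most chunk_size records per list.

    Header lines (starting with '#') and blank lines are skipped. The file is
    read lazily, so only one chunk is held in memory at a time.
    """
    records = (
        line.rstrip("\n").split("\t")
        for line in handle
        if line.strip() and not line.startswith("#")
    )
    while True:
        chunk = list(islice(records, chunk_size))
        if not chunk:
            return
        yield chunk


# A tiny in-memory VCF standing in for a large file on disk (hypothetical data).
toy_vcf = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t100\t.\tA\tG\t50\tPASS\t.\n"
    "1\t200\t.\tC\tT\t40\tPASS\t.\n"
    "2\t300\t.\tG\tA\t60\tPASS\t.\n"
)

chunk_sizes = [len(chunk) for chunk in iter_vcf_chunks(toy_vcf, chunk_size=2)]
print(chunk_sizes)
```

With a real file, the same generator would be given an open file handle, for example `with open(path) as handle:`, and each chunk would be filtered or summarized before the next one is read.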
The article deals with the problems of forming the language system of the Kazakh (native) language in preschool age. It shows the importance of studying the regularities and features of how the components of the language system form during the speech ontogenesis of children of early and preschool age. It emphasizes that scientific research should engage as closely as possible with the speech-motor potential of each child, ensuring a tolerant attitude toward children's innovations and speech style while paying maximum attention to the natural process of mastering the native language and to the features of the structure of the speech-language mechanism. The authors identify effective ways to study and explain the ontogenesis of speech, that is, the process by which children learn their native language. Scientific approaches to the consideration of language phenomena in the child's speech and the prospects for conducting ontolinguistic research are also identified.