Biomolecular networks are often assumed to be scale-free hierarchical networks. Weighted gene co-expression network analysis (WGCNA) treats gene co-expression networks as undirected, scale-free, hierarchical, weighted networks. The WGCNA R software package stores a network as an adjacency matrix, then calculates the topological overlap matrix (TOM), and finally identifies modules (sub-networks), where each module is assumed to be associated with a certain biological function. The most time-consuming step of WGCNA is calculating the TOM from the adjacency matrix, which is done in a single thread. In this paper, the single-threaded TOM algorithm has been changed into a multi-threaded algorithm (the parameters are the default values of WGCNA). In the multi-threaded algorithm, Rcpp is used to let R call a C++ function, and the C++ code uses OpenMP to start multiple threads that calculate the TOM from the adjacency matrix. On shared-memory multiprocessor systems, the calculation time decreases as the number of CPU cores increases. The algorithm of this paper can promote the application of WGCNA to large data sets and help other research fields identify sub-networks in undirected, scale-free, hierarchical, weighted networks. The source code and usage instructions are available at https://github.com/do-somethings-haha/multi-threaded_calculate_unsigned_TOM_from_unsigned_or_signed_Adjacency_Matrix_of_WGCNA.
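The multi-threaded step described in this abstract can be illustrated with a minimal sketch. It is not the code from the linked repository: the function name unsigned_tom, the row-major std::vector layout, and the dynamic schedule are assumptions made for illustration, and the Rcpp binding is left out. The formula used is the commonly cited unsigned TOM with the "min" denominator, TOM_ij = (sum over u != i,j of a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij), with k_i the sum of a_iu over u != i and TOM_ii = 1.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of WGCNA or the linked repository.
// adj: symmetric n x n adjacency matrix in row-major order, entries in [0, 1].
// Returns the unsigned TOM as a row-major n x n matrix.
std::vector<double> unsigned_tom(const std::vector<double>& adj, std::size_t n) {
    assert(adj.size() == n * n);

    // Connectivity k_i: sum of adjacencies to every other node (diagonal excluded).
    std::vector<double> k(n, 0.0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t u = 0; u < n; ++u) {
            if (u != i) k[i] += adj[i * n + u];
        }
    }

    std::vector<double> tom(n * n, 0.0);
    const long rows = static_cast<long>(n);

    // Each thread takes whole rows. The pair (i, j) with i < j is written only by
    // the thread that owns row i, so no two threads write the same element.
    // Dynamic scheduling is an assumption: later rows have fewer pairs to compute.
    #pragma omp parallel for schedule(dynamic)
    for (long ii = 0; ii < rows; ++ii) {
        const std::size_t i = static_cast<std::size_t>(ii);
        tom[i * n + i] = 1.0;
        for (std::size_t j = i + 1; j < n; ++j) {
            // Sum over shared neighbours u, excluding i and j themselves.
            double shared = 0.0;
            for (std::size_t u = 0; u < n; ++u) {
                if (u != i && u != j) shared += adj[i * n + u] * adj[u * n + j];
            }
            const double a = adj[i * n + j];
            const double t = (shared + a) / (std::min(k[i], k[j]) + 1.0 - a);
            tom[i * n + j] = t;
            tom[j * n + i] = t;  // TOM is symmetric
        }
    }
    return tom;
}
```

With GCC or Clang the pragma takes effect when the file is compiled with -fopenmp; without that flag it is ignored and the same code runs in a single thread, which makes it easy to compare timings.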
Salvia miltiorrhiza (Labiatae) is an important medicinal plant in traditional Chinese medicine. Tanshinones are one of the main active components of S. miltiorrhiza. The intraspecific variation of S. miltiorrhiza has been found to be relatively large, and the tanshinone content of the roots also differs considerably among varieties. To investigate the molecular mechanisms that are responsible for the differences among these varieties, the tanshinone content was determined and a comparative transcriptomic analysis was carried out during the tanshinone accumulation stage. A total of 52,216 unigenes were obtained from the transcriptome by RNA sequencing, among which 23,369 genes were differentially expressed among different varieties, and 2,016 genes, including 18 diterpenoid biosynthesis-related genes, were differentially expressed during the tanshinone accumulation stage. Functional categorization of the differentially expressed genes (DEGs) among these varieties revealed that the pathways related to photosynthesis, oxidative phosphorylation, secondary metabolite biosynthesis, diterpenoid biosynthesis, terpenoid backbone biosynthesis, and sesquiterpenoid and triterpenoid biosynthesis are the most differentially regulated processes in these varieties. The six tanshinone components in these varieties showed different dynamic changes during the tanshinone accumulation stage. In addition, by combining this with the analysis of the dynamic changes, 277 DEGs (including one dehydrogenase, three CYP450s, and 24 transcription factors belonging to 12 transcription factor families) related to the accumulation of tanshinone components were obtained. Furthermore, KEGG pathway enrichment analysis of these 277 DEGs suggested that there might be an interconnection between primary metabolic processes, signaling processes, and the accumulation of tanshinone components.
This study broadens the understanding of intraspecific variation and of the gene regulation mechanisms of secondary metabolite biosynthesis pathways in medicinal plants from the “omics” perspective.
Motivation: Weighted gene co-expression network analysis (WGCNA) is an R package that can search for highly related gene modules. The most time-consuming step of the whole analysis is calculating the Topological Overlap Matrix (TOM) from the adjacency matrix, which is done in a single thread. This study changes that step to use multiple threads.
Results: This paper uses SQLite to transfer data between R and C++ for the multi-threaded calculation, and uses OpenMP to enable multi-threading and calculate the TOM from the adjacency matrix on a shared-memory multiprocessor (SMP) system, where the calculation time decreases as the number of physical CPU cores increases.
Availability and implementation: The source code is available at https://github.com/do-somethings-haha/fast_calculate_TOM_of_WGCNA
Contact: chenxin@cdutcm.edu.cn