The diverse and growing omics data in public domains provide researchers with a tremendous opportunity to extract hidden knowledge. However, the challenge of providing domain experts with easy access to these big data has resulted in the vast majority of archived data remaining unused. Here, we present MetaOmGraph (MOG), free, open-source, standalone software for exploratory data analysis of massive datasets by scientific researchers. Using MOG, a researcher can interactively visualize and statistically analyze the data in the context of its metadata. Researchers can interactively home in on groups of experiments or genes based on attributes such as expression values, statistical results, metadata terms, and ontology annotations. MOG's statistical tools include coexpression, differential expression, and differential correlation analysis, with permutation test-based options for significance assessment. Multithreading and indexing enable efficient data analysis on a personal computer, with no need to write code. Data can be visualized as line charts, box plots, scatter plots, and volcano plots. A researcher can create new MOG projects from any data or analyze an existing one. An R wrapper lets a researcher select and send smaller data subsets to R for additional analyses. A researcher can save MOG projects with a history of the exploratory progress and later reopen or share them. We illustrate MOG with case studies of large curated datasets: human cancer RNA-Seq data, from which we assembled a list of novel putative biomarker genes in different tumors, and microarray and metabolomics data from A. thaliana.
Singh et al. | bioRχiv | July 11, 2019 | 1-18
in complex biological networks (6, 8, 24-28). Here, we present MetaOmGraph (MOG), software written in Java, to interactively explore and visualize large datasets.
MOG overcomes the challenges posed by the size and complexity of big datasets by handling the data files efficiently, using a combination of data indexing and buffering schemes. Further, by incorporating metadata, MOG adds extra dimensions to the analyses and provides flexibility in data exploration. Researchers can explore their own data on their local machines. At any stage of the analysis, a researcher can save their progress. Saved MOG projects can be shared, reused, and included in publications. MOG is user-centered software, designed for all types of data but specialized for expression data.
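The combination of indexing and buffering mentioned above can be illustrated with a minimal sketch. The text does not describe MOG's actual implementation, so everything here is an assumption made for illustration: the class name `RowIndexSketch`, the methods `buildIndex` and `readRow`, and the choice of a newline-delimited text file with one feature per row are all hypothetical. The idea shown is only the general technique: one buffered pass records the byte offset of each row, after which any single row can be fetched by seeking directly to it instead of loading the whole file into memory.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not MOG's actual code or API.
final class RowIndexSketch {

    /** One buffered pass over the file, recording the byte offset where each line starts. */
    static long[] buildIndex(Path file) throws IOException {
        List<Long> offsets = new ArrayList<>();
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            long pos = 0;
            boolean atLineStart = true;
            int b;
            while ((b = in.read()) != -1) {
                if (atLineStart) {
                    offsets.add(pos);
                    atLineStart = false;
                }
                if (b == '\n') {
                    atLineStart = true;
                }
                pos++;
            }
        }
        long[] index = new long[offsets.size()];
        for (int i = 0; i < index.length; i++) {
            index[i] = offsets.get(i);
        }
        return index;
    }

    /** Seek straight to one row using the index; only that row is read from disk. */
    static String readRow(Path file, long[] index, int row) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long start = index[row];
            long end = (row + 1 < index.length) ? index[row + 1] : raf.length();
            byte[] buf = new byte[(int) (end - start)];
            raf.seek(start);
            raf.readFully(buf);
            // trim() drops the trailing line terminator
            return new String(buf, StandardCharsets.UTF_8).trim();
        }
    }

    public static void main(String[] args) throws IOException {
        // Toy tab-delimited expression table: header row plus two genes.
        Path tmp = Files.createTempFile("expr", ".tsv");
        Files.write(tmp, "gene\ts1\ts2\nG1\t1.0\t2.0\nG2\t3.5\t4.5\n"
                .getBytes(StandardCharsets.UTF_8));
        long[] index = buildIndex(tmp);
        System.out.println(readRow(tmp, index, 2)); // fetch the row for G2 only
        Files.delete(tmp);
    }
}
```

In a sketch like this the index costs one `long` per row, so even a file with hundreds of thousands of rows needs only a few megabytes of index, while the data itself stays on disk until a row is requested.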
MATERIALS AND METHODS
A. Overview. MOG is a standalone program that can run on any operating system capable of running Java (Linux, Mac, and Windows). Access to MOG is easy. MOG runs on the researcher's computer; thus the researcher does not need to rely on internet access and is not slowed down by the latency of transferring big data. Furthermore, the data in a researcher's project is secure, remaining on the researcher's computer, which is particularly important for confidential data such as human RNA-Seq. MOG's modular structure has been carefully designed to enable developers to easily implement new s...