2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing
DOI: 10.1109/pdp.2015.63
Memory-Optimised Parallel Processing of Hi-C Data

Abstract: This paper presents the optimisation efforts on the creation of a graph-based mapping representation of gene adjacency. The method is based on the Hi-C process, starting from Next Generation Sequencing data, and it analyses a huge amount of static data in order to produce maps for one or more genes. Straightforward parallelisation of this scheme does not yield acceptable performance on multicore architectures since the scalability is rather limited due to the memory-bound nature of the problem. This w…

Cited by 3 publications (6 citation statements)
References 12 publications
“…For what it concerns the comparison between the C++ application and the combined R with C++ package, they report substantially similar behaviours: the graph construction execution is strongly affected by datasets size and resolution, that determine the "search space" for the BFS-like graph construction and the overall memory load. Reducing the working set ameliorates execution times and overall scalability with NuChart-II, and clearly helps in obtaining good performance when offloading the graph construction from R to C++ [3]. Figure 4 compares execution time (left) and speedup (right) in the two approaches: Figures 4a and 4b show the performance for constructing a graph at level 1 starting from the KRAB cluster of genes using Dixon's SRR400266 experiment as Hi-C dataset.…”
Section: Methods
confidence: 99%
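The statement above describes a level-wise, BFS-like expansion whose working set grows with dataset size and resolution. A minimal C++ sketch of that kind of construction follows; all identifiers (buildNeighbourhood, contacts) are hypothetical placeholders, and the real NuChart-II derives neighbours from Hi-C contact data rather than from a prebuilt adjacency map.

```cpp
// Level-wise (BFS-like) neighbourhood construction around a root gene.
// Each level adds every contact of the current frontier, which is why the
// "search space" and the memory load grow with dataset size and resolution.
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

using Graph = std::unordered_map<std::string, std::vector<std::string>>;

Graph buildNeighbourhood(const std::string& root, int maxLevel,
                         const Graph& contacts /* hypothetical contact index */) {
    Graph result;
    std::unordered_set<std::string> visited{root};
    std::vector<std::string> frontier{root};

    for (int level = 0; level < maxLevel && !frontier.empty(); ++level) {
        std::vector<std::string> next;
        for (const auto& gene : frontier) {
            auto it = contacts.find(gene);
            if (it == contacts.end()) continue;
            for (const auto& nb : it->second) {
                result[gene].push_back(nb);      // record the edge gene -> nb
                if (visited.insert(nb).second)   // nb seen for the first time
                    next.push_back(nb);          // expand it at the next level
            }
        }
        frontier = std::move(next);
    }
    return result;
}
```

With maxLevel = 1 this reproduces the "graph at level 1" case mentioned in the statement: only the direct contacts of the root gene are added.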
“…NuChart-II has been designed using high-level parallel programming patterns, that facilitate the implementation of the algorithms employed over the graph: this choice permits to boost performances while conducting genome-wide analysis of the DNA. Furthermore, the coupled usage of C++ with advanced techniques of parallel computing (such as lock-free algorithms and memory-affinity) strengthens genomic research, because it makes possible to process much faster, much more data: informative results can be achieved to an unprecedented degree [3].…”
Section: Scientific Background
confidence: 99%
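The lock-free techniques mentioned above are not detailed in this snippet. As a generic illustration only (not the authors' code), the sketch below shows the basic idea: worker threads accumulate into shared per-bin counters with std::atomic instead of serialising on a mutex. The bin count and index ranges are arbitrary.

```cpp
// Generic lock-free accumulation sketch: threads update shared counters with
// std::atomic::fetch_add, so no mutex is needed.
#include <algorithm>
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

int main() {
    constexpr std::size_t kBins = 1024;                   // hypothetical genomic bins
    std::vector<std::atomic<std::uint64_t>> hits(kBins);
    for (auto& h : hits) h.store(0, std::memory_order_relaxed);  // explicit zeroing

    auto worker = [&](std::size_t begin, std::size_t end) {
        for (std::size_t i = begin; i < end; ++i)
            hits[i % kBins].fetch_add(1, std::memory_order_relaxed);  // lock-free update
    };

    std::vector<std::thread> pool;
    const unsigned nw = std::max(1u, std::thread::hardware_concurrency());
    for (unsigned t = 0; t < nw; ++t)
        pool.emplace_back(worker, std::size_t{t} * 100000, std::size_t{t + 1} * 100000);
    for (auto& th : pool) th.join();
}
```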
“…Both of them are suitable for being revisited in the context of loop parallelism, since their kernels can be run concurrently on multiple processors with no data dependencies involved. These phases have been thoroughly explained in our previous works [3,5,6], and we refer to those writings for a thorough explanation. Not much changes when we offload the computation from R to C++: the very same logic is used and the ParallelFor skeleton permits to speed up both phases in a seamless way.…”
Section: NuChart and Rcpp
confidence: 99%
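The loop-parallel pattern described in the statement (independent kernels over the graph, sped up by a ParallelFor skeleton) can be sketched with the C++17 parallel algorithms as a stand-in for the actual skeleton used by NuChart-II; Fragment, expandNode and normaliseEdge are hypothetical placeholders for the two data-independent phases.

```cpp
// Loop parallelism over independent elements: each iteration touches only its
// own Fragment, so both phases can run on all cores without synchronisation.
#include <algorithm>
#include <execution>
#include <vector>

struct Fragment { /* per-element working data */ };

void expandNode(Fragment&)    { /* phase 1 kernel, no cross-element dependencies */ }
void normaliseEdge(Fragment&) { /* phase 2 kernel, likewise independent */ }

void runPhases(std::vector<Fragment>& items) {
    std::for_each(std::execution::par, items.begin(), items.end(), expandNode);
    std::for_each(std::execution::par, items.begin(), items.end(), normaliseEdge);
}
```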