In this report, I discuss the history and current state of GPU HPC systems. Although high-powered GPUs have existed for only a short time, they have seen rapid adoption in deep learning applications. I also discuss an implementation of a commodity-hardware NVIDIA GPU HPC cluster for deep learning research and academic teaching.
HPC and GPU HPC History

High performance computing (HPC) is typically characterized by large amounts of memory and processing power. HPC, sometimes also called supercomputing, has been around since the 1960s with the introduction of the CDC STAR-100, and continues to push the limits of computing power and capabilities for large-scale problems [1,2]. However, the use of graphics processing units (GPUs) in HPC supercomputers only began in the mid-to-late 2000s [3,4]. Although graphics processing chips have been around since the 1970s, GPUs were not widely used for computation until the 2000s. During the early 2000s, GPU clusters began to appear for HPC applications. Most of these clusters were designed to run large calculations requiring vast computing power, and many clusters are still designed for that purpose [5].

GPUs have been increasingly used for computation due to their commodification, which has followed Moore's Law (demonstrated in Figure 1), and their use in specific applications such as neural networks. Although server-grade GPUs can be used in clusters, commodity-grade GPUs are much more cost-effective: a similar amount of computing power can be obtained with commodity hardware for roughly a third of the cost of server-grade hardware. In 2018, NVIDIA abruptly required businesses to replace commodity GPUs with its server-grade GPUs, a move that appeared primarily motivated by a desire to increase earnings but may also have been related to warranty issues [6]. However, commodity hardware still proves useful for GPU clusters [7,8,9], especially in academic settings where the NVIDIA EULA does not appear to apply. Several studies have examined the performance of commodity [10,11] and non-commodity [12] GPU hardware for various calculations, and have generally found commodity hardware suitable for use in GPU clusters.
Although the legal definitions in NVIDIA's EULA are intentionally vague, using commodity NVIDIA GPUs with the associated NVIDIA drivers and software appears to be permitted for smaller academic deployments such as our use case [13].

Although some guidelines exist for GPU clusters [15], and OpenHPC provides "recipes" with instructions for installing SLURM on a CentOS or SUSE cluster, there is no good step-by-step documentation for creating a commodity GPU cluster from scratch using Ubuntu Linux. Ubuntu is currently one of the most widely used Linux distributions for both personal and server use, and it has a vibrant community as well as commercial support, making it a good choice of Linux distribution. One drawback of Ubuntu is that it is frequently updated and may not be as stable as other Linux operating systems such as