A heterogeneous supercomputer model for high-performance parallel computing pedagogy

Wolfer, James

doi:10.1109/educon.2015.7096063

Cited by 17 publications

(6 citation statements)

References 6 publications

“…The StudentParallela was intended to teach the native Epiphany programming model, Pthreads, OpenMP and MPI in a parallel computing elective course. Cu-T-Pi [16], created by James Wolfer, is composed of 1 Nvidia Jetson TK1 head node and 4 Model B+ Raspberry Pi worker nodes with 2 ARM cores each. The Cu-T-Pi was used to teach parallel programming and also to demonstrate benchmarking concepts.…”

Section: Previous and Related Work (mentioning)

confidence: 99%

Teaching HPC Systems and Parallel Programming with Small-Scale Clusters

Alvarez

Ayguadé

Mantovani

2018

2018 IEEE/ACM Workshop on Education for High-Performance Computing (EduHPC)

In recent decades, the continuous proliferation of High-Performance Computing (HPC) systems and data centers has increased the demand for expert HPC system designers, administrators, and programmers. For this reason, most universities have introduced courses on HPC systems and parallel programming in their degrees. However, the laboratory assignments of these courses generally use clusters that are owned, managed, and administered by the university. This methodology has been shown to be effective for teaching parallel programming, but using a remote cluster prevents the students from experimenting with the design, setup, and administration of such systems. This paper presents a methodology and framework to teach HPC systems and parallel programming using a small-scale cluster of single-board computers. These boards are very cheap, their processors are fundamentally very similar to the ones found in HPC, and they are ready to execute Linux out of the box. They therefore represent a perfect laboratory playground for students to experience how to assemble a cluster, set it up, and configure its system software. We also show that these small-scale clusters can be used as evaluation platforms for both introductory and advanced parallel programming assignments.

“…Created by James Wolfer [48] at Indiana University, South Bend (IUSB), this cluster is designed to be highly visible, portable, and have hardware and software architecture consistent with contemporary heterogeneous systems. The cluster consists of four Model B+ Raspberry Pi worker nodes and one Nvidia Jetson Tk-1 head node, connected through a gigabit Ethernet switch, all mounted in a terraced arrangement for instructional visibility.…”

Section: Cu-t-pi (mentioning)

confidence: 99%

“…By using the HPL benchmark adapted for the Raspberry Pi [38,15], we can observe and quantify the impact of asymmetric communication speeds. Details can be found in [48].…”

Section: Cu-t-pi (mentioning)

confidence: 99%

Using Inexpensive Microclusters and Accessible Materials for Cost-Effective Parallel and Distributed Computing Education

Adams

Matthews

Shoop

et al. 2017

JOCSE

Self Cite

With parallel and distributed computing (PDC) now in the core CS curriculum, CS educators are building new pedagogical tools to teach their students about this cutting-edge area of computing. In this paper, we present an innovative approach we call microclusters (personal, portable Beowulf clusters) that provide students with hands-on PDC learning experiences. We present several different microclusters, each built using a different combination of single board computers (SBCs) as its compute nodes, including various ODROID models, Nvidia's Jetson TK1, Adapteva's Parallella, and the Raspberry Pi. We explore different ways that CS educators are using these systems in their teaching, and describe specific courses in which CS educators have used microclusters. Finally, we present an overview of sources of free PDC pedagogical materials that can be used with microclusters.

“…[1] Moreover, TK1 has several advantages, such as the low cost, low power consumption and high applicability, by comparing with other embedded platforms [2, 3], desktop CPUs [1, 4], and desktop GPU cards [4]. For example, Wolfer [3] presented the results that a single TK1 outperforms a Raspberry Pi Model in terms of time and speedup ratios; Paolucci et al [1] proved that the dual-socket node connected by TK1s achieves 14.4 times better than that by SuperMicro server with Intel XEON E5620 CPUs for the power consumption; and Fu et al [4] showed the results that the power efficiency and cost efficiency by a single TK1 are both better than those by an Intel i7-3770 CPU and a NVIDIA GTX 690 GPU card. Hence, it becomes a new research direction to study TK1s in several specific applications, such as the surveillance, bioinformatics, and image processing.…”

Section: Introduction (mentioning)

confidence: 99%

Embedded-Based Graphics Processing Unit Cluster Platform for Multiple Sequence Alignments

Wei

Cheng

Lin

et al. 2017

Evol Bioinform Online

High-end graphics processing units (GPUs), such as NVIDIA Tesla/Fermi/Kepler series cards with thousands of cores per chip, have been widely applied in high-performance computing over the past decade. These desktop GPU cards must be installed in personal computers or servers with desktop CPUs, and the cost and power consumption of constructing a GPU cluster platform are very high. In recent years, NVIDIA released an embedded board, the Jetson Tegra K1 (TK1), which contains 4 ARM Cortex-A15 CPUs and 192 Compute Unified Device Architecture cores (belonging to Kepler GPUs). The Jetson Tegra K1 has several advantages, such as low cost, low power consumption, and high applicability, and it has been applied to several specific applications. In our previous work, a bioinformatics platform with a single TK1 (STK platform) was constructed; that work also showed, by comparing an STK platform with a desktop CPU and GPU, that Web and mobile services can be implemented on the STK platform with a good cost-performance ratio. In this work, an embedded-based GPU cluster platform is constructed with multiple TK1s (MTK platform). Complex system installation and setup are necessary first steps. Then, 2 job assignment modes are designed for the MTK platform to provide services for users. Finally, ClustalW v2.0.11 and ClustalWtk are ported to the MTK platform. The experimental results showed speedup ratios of 5.5 and 4.8 for ClustalW v2.0.11 and ClustalWtk, respectively, when comparing 6 TK1s with a single TK1. The MTK platform is proven to be useful for multiple sequence alignments.
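
The speedup figures in this abstract can be read as parallel efficiency (speedup divided by node count), which comes to roughly 92% and 80% for the two programs. The sketch below is only an illustration of that arithmetic; the function name `parallel_efficiency` and the dictionary layout are assumptions introduced here, not part of the cited work.

```python
def parallel_efficiency(speedup: float, nodes: int) -> float:
    """Return speedup divided by node count (1.0 would be ideal linear scaling)."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return speedup / nodes


# Speedups reported in the abstract for 6 TK1 boards versus a single TK1.
reported = {"ClustalW v2.0.11": 5.5, "ClustalWtk": 4.8}

# Hypothetical helper structure: efficiency per program on the 6-node cluster.
efficiencies = {name: parallel_efficiency(s, 6) for name, s in reported.items()}

for name, eff in efficiencies.items():
    print(f"{name}: {eff:.1%} parallel efficiency")
```

Efficiency below 100% is expected on a cluster like this, since part of each run is spent on communication and job distribution rather than computation.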
