Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis 2019
DOI: 10.1145/3295500.3356149

High performance Monte Carlo simulation of Ising model on TPU clusters

Abstract: Large-scale deep learning benefits from an emerging class of AI accelerators. Some of these accelerators' designs are general enough for compute-intensive applications beyond AI, and Cloud TPU is one such example. In this paper, we demonstrate a novel approach using TensorFlow on Cloud TPU to simulate the two-dimensional Ising model. The TensorFlow and Cloud TPU framework enables simple, readable code to express a complicated distributed algorithm without compromising performance. Our code implementation…

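The abstract names the technique (Metropolis Monte Carlo for the 2-D Ising model on TPU via TensorFlow) without showing it. The paper's distributed, multi-TPU implementation is not reproduced here; the sketch below is a minimal single-device illustration of the standard checkerboard Metropolis update, written against the public TensorFlow 2 API. The lattice size, the inverse temperature beta, and the helper name checkerboard_sweep are illustrative choices, not taken from the paper.

```python
import tensorflow as tf

def checkerboard_sweep(spins, beta, parity):
    """One Metropolis half-sweep over sites of one checkerboard parity.

    spins: [n, n] float32 tensor of +/-1 spins, periodic boundaries.
    """
    n = tf.shape(spins)[0]
    # Sum of the four nearest neighbours via circular shifts (periodic BCs).
    nn = (tf.roll(spins, 1, axis=0) + tf.roll(spins, -1, axis=0) +
          tf.roll(spins, 1, axis=1) + tf.roll(spins, -1, axis=1))
    # Energy change if each spin were flipped: dE = 2 * J * s_i * sum_nn (J = 1).
    dE = 2.0 * spins * nn
    # Metropolis rule: accept a flip with probability min(1, exp(-beta * dE)).
    accept = tf.random.uniform(tf.shape(spins)) < tf.exp(-beta * dE)
    # Restrict updates to one sublattice so no two flipped spins are neighbours.
    idx = tf.range(n)
    mask = (idx[:, None] + idx[None, :]) % 2 == parity
    return tf.where(tf.logical_and(accept, mask), -spins, spins)

# Hot start on a 64 x 64 lattice near the critical coupling beta_c ~ 0.4407.
spins = tf.where(tf.random.uniform([64, 64]) < 0.5, 1.0, -1.0)
for _ in range(100):
    spins = checkerboard_sweep(spins, beta=0.44, parity=0)
    spins = checkerboard_sweep(spins, beta=0.44, parity=1)
print(float(tf.reduce_mean(spins)))  # magnetization per site
```

The checkerboard split is what makes the update vectorizable on accelerator hardware: each half-sweep touches only spins that do not interact with one another, so the whole sublattice can be updated in one batched operation.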
Cited by 37 publications (39 citation statements)
References 16 publications
“…The ability to match DNS with a 10× coarser grid makes the learned interpolation solver much faster. We benchmark our solver on a single core of Google’s Cloud TPU v4, a hardware accelerator designed for accelerating ML models that is also suitable for many scientific computing use cases (45–47). The TPU is designed for high-throughput vectorized operations, with extremely high throughput matrix–matrix multiplication in low precision (bfloat16).…”
Section: Results (mentioning)
confidence: 99%
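To make the excerpt's bfloat16 point concrete, here is a minimal TensorFlow sketch (my own illustration, not code from the cited paper; the matrix sizes are arbitrary) comparing a bfloat16 matrix product against float32. On a TPU the bfloat16 path feeds the matrix units at full rate; run on a CPU it merely demonstrates the precision trade-off.

```python
import tensorflow as tf

a = tf.random.normal([1024, 1024])
b = tf.random.normal([1024, 1024])

# Same product in bfloat16 (8 mantissa bits, float32 exponent range).
c_bf16 = tf.matmul(tf.cast(a, tf.bfloat16), tf.cast(b, tf.bfloat16))
c_f32 = tf.matmul(a, b)

# Expect a relative error around 1e-2 rather than float32's ~1e-7.
rel_err = tf.norm(tf.cast(c_bf16, tf.float32) - c_f32) / tf.norm(c_f32)
print(float(rel_err))
```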
“…As an example of recent work where the two-dimensional Ising model has been simulated on a GPU, see ref. [36].…”
Section: Discussion (mentioning)
confidence: 99%
“…It is plausible for the following four reasons. (1) TPU is an ML application-specific integrated circuit (ASIC) devised for neural networks (NNs). NNs require massive amounts of multiplications and additions between the data and parameters, and TPU can handle these computations as matrix multiplications in a very efficient manner [29]; similarly, the DFT can also be formulated as matrix multiplications between the input data and the Vandermonde matrix. (2) TPU chips are connected directly to each other with dedicated, high-speed, low-latency interconnects, bypassing the host CPU and any networking resources; therefore, a large-scale DFT computation can be distributed among multiple TPUs with minimal communication time and hence very high parallel efficiency. (3) The large capacity of the in-package memory of TPU makes it possible to handle large-scale DFTs efficiently. (4) TPU is programmable with software front ends such as TensorFlow [30] and PyTorch [31], both of which make it straightforward to implement parallel DFT algorithms on TPUs. In fact, all four of these reasons have been verified in the high-performance Monte Carlo simulations on TPUs [32], [33].…”
Section: Introduction (mentioning)
confidence: 64%
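The excerpt's claim that the DFT "can be formulated as matrix multiplications between the input data and the Vandermonde matrix" is easy to check in a few lines. The following is a minimal TensorFlow sketch (my own illustration, not code from [32], [33]); dft_vandermonde is a hypothetical helper name and the transform size 8 is arbitrary.

```python
import numpy as np
import tensorflow as tf

def dft_vandermonde(n):
    # W[j, k] = exp(-2*pi*i*j*k / n): the Vandermonde matrix built from the
    # n-th roots of unity; multiplying a signal by W is exactly the DFT.
    k = np.arange(n)
    w = np.exp(-2j * np.pi * np.outer(k, k) / n)
    return tf.constant(w, dtype=tf.complex64)

x = tf.complex(tf.random.normal([8]), tf.zeros([8]))
x_dft = tf.linalg.matvec(dft_vandermonde(8), x)  # DFT as a dense matmul
x_fft = tf.signal.fft(x)                         # library FFT for reference
print(float(tf.reduce_max(tf.abs(x_dft - x_fft))))  # agreement to ~1e-5
```

The dense O(n²) form trades extra FLOPs for exactly the kind of FLOPs the TPU's matrix units are fastest at, which, together with the fast inter-chip interconnects, is the excerpt's argument for why the formulation can still win at scale.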