Permuting data on random-access block storage

Thonangi, Risi; Yang, Jun

doi:10.14778/2536360.2536371

Cited by 4 publications

(1 citation statement)

References 9 publications

“…Shao et al [10] avoid rotational latency and minimize the access time of neighboring dataset blocks by using a data placement strategy providing efficient semi-sequential accesses along the outer dimensions of a multidimensional array. Thonangi and Yang [11] exploit SSD characteristics such as efficient random access and asymmetry between read and write operation performance to address general data permutations. Since it manages I/O operations explicitly, this solution depends on the SSD characteristics to be efficient, while we exploit the operating system for a better performance portability.…”

Section: Related Work (mentioning)

confidence: 99%

Efficient Out-of-Core and Out-of-Place Rectangular Matrix Transposition and Rotation

Godard; Loechner; Bastoul

2021

IEEE Trans. Comput.


Modern computers keep following the traditional model of addressing memory linearly for their main memory and out-of-core storage. While this model allows efficient row access to row-major 2D matrices, it introduces complexity to perform efficient column access. A common strategy to improve these accesses is to transpose or rotate the matrix beforehand, thus the accessing complexity is centralized in one transformation operation. Further column accesses are performed as row accesses to the transposed matrix therefore they are optimized to the memory model. In this paper, we propose an efficient solution to perform in-memory or out-of-core rectangular matrix transposition and rotation by using an out-of-place strategy, reading a matrix from an input file and writing the transformed matrix to another (output) file. An originality of our processing algorithm is to rely on an optimized use of the page cache mechanism. It is parallel, optimized by several levels of tiling and independent of any disk block size. We evaluate our approach on five common storage configurations: HDD, hybrid HDD-SSD, SSD, software RAID 0 of several SSDs and NVMe. We show that it brings significant performance improvement over a hand-tuned optimized reference implementation developed by the Caldera company and we confront it against the baseline speed of a straight file copy.
