Proceedings of the 17th Annual International Conference on Supercomputing 2003
DOI: 10.1145/782814.782850

Profile-guided I/O partitioning

Abstract: In the field of high-performance computing, there is a growing need to process large, complex datasets. Many of these applications are file-intensive workloads, performing a large number of reads from and writes to a small number of files. When executing these workloads on cluster-based systems, performance cannot scale by simply increasing the number of compute nodes. To effectively exploit parallel resources, we need to parallelize file I/O. The potential impact of exploiting parallel I/O grows as the gap betw…

Cited by 40 publications (23 citation statements)
References 23 publications
“…Instead of using SSD, Wang et al proposed to replicate frequently accessed data chunks at the compute nodes' local disks to reduce data access latency [28]. The frequently accessed data chunks are identified through analysis of I/O traces collected in a profiling run.…”
Section: A Approaches To Handle Unaligned Data Access (mentioning)
confidence: 99%
“…With data replication, some actively used data would have multiple copies on the disk and the copy that is closest to the disk head is accessed. Replication can be carried out within one disk [10], [3], [15] or across disks [33], [30]. The effectiveness of this method relies on two factors: a stable and predictable access pattern to know where to relocate or replicate data; and, a relatively small on-disk working set so that the replication overhead is not excessive.…”
Section: B Rearrangement Of On-disk Data Layout For Greater Spatial (mentioning)
confidence: 99%
“…These I/O characteristics prevent disk bandwidth to be fully utilized and impact I/O performance. Access patterns can be detected both statically (at compile time) [3,7,12,16] and dynamically (at runtime) [17]. In [7,16], Paek et al use Linear Memory Access Descriptors to detect memory array access patterns within loop nests.…”
Section: Introduction (mentioning)
confidence: 99%
“…In [7,16], Paek et al use Linear Memory Access Descriptors to detect memory array access patterns within loop nests. In our previous studies [1,17], we have used profile-directed optimization to improve both memory and disk I/O accesses. In [17], we found that disk I/O accesses exhibit very regular and highly predictable patterns.…”
Section: Introduction (mentioning)
confidence: 99%