Alin Jula scite author profile

Alin Jula

5 Publications

118 Citation Statements Received

55 Citation Statements Given

Affiliations

Mitchell Institute, Texas A&M University

Publications

STAPL: An Adaptive, Generic Parallel C++ Library

Jula, Rus, et al. 2003

Abstract. The Standard Template Adaptive Parallel Library (STAPL) is a parallel library designed as a superset of the ANSI C++ Standard Template Library (STL). It is sequentially consistent for functions with the same name, and executes on uni- or multi-processor systems that utilize shared or distributed memory. STAPL is implemented using simple parallel extensions of C++ that currently provide a SPMD model of parallelism, and supports nested parallelism. The library is intended to be general purpose, but emphasizes irregular programs to allow the exploitation of parallelism for applications which use dynamically linked data structures such as particle transport calculations, molecular dynamics, geometric modeling, and graph algorithms. STAPL provides several different algorithms for some library routines, and selects among them adaptively at runtime. STAPL can replace STL automatically by invoking a preprocessing translation phase. In the applications studied, the performance of translated code was within 5% of the results obtained using STAPL directly. STAPL also provides functionality to allow the user to further optimize the code and achieve additional performance gains. We present results obtained using STAPL for a molecular dynamics code and a particle transport code.

Motivation

In sequential computing, standardized libraries have proven to be valuable tools for simplifying the program development process by providing routines for common operations that allow programmers to concentrate on higher level problems. Similarly, libraries of elementary, generic, parallel algorithms provide important building blocks for parallel applications and specialized libraries [7,6,20]. Due to the added complexity of parallel programming, the potential impact of libraries could be even more profound than for sequential computing. Indeed, we believe parallel libraries are crucial for moving parallel computing into the mainstream since they offer the only viable means for achieving scalable performance across a variety of applications and architectures with programming efforts comparable to those of developing sequential codes. In particular, properly designed parallel libraries could insulate less experienced users

WIQ: Work-Intensive Query Scheduling for In-Memory Database Systems

Kraft, Casale, Jula, et al. 2012

Architectural support for parallel reductions in scalable shared-memory multiprocessors

Garzarán, Prvulovic, Zhang, et al.

Reductions are important and time-consuming operations in many scientific codes. Effective parallelization of reductions is a critical transformation for loop parallelization, especially for sparse, dynamic applications. Unfortunately, conventional reduction parallelization algorithms are not scalable. In this paper, we present new architectural support that significantly speeds up parallel reduction and makes it scalable in shared-memory multiprocessors. The required architectural changes are mostly confined to the directory controllers. Experimental results based on simulations show that the proposed support is very effective. While conventional software-only reduction parallelization delivers average speedups of only 2.7 for 16 processors, our scheme delivers average speedups of 7.6.
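
For readers unfamiliar with software-only reduction parallelization, the following is a minimal C++ sketch of one common form of it: each thread accumulates into a private copy of the reduction array, and the copies are merged afterwards. It is an illustration only, not the scheme or the baseline evaluated in the paper. The function name `parallel_histogram` and its parameters are hypothetical, and the code assumes every index is smaller than `num_bins`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: computes w[indices[i]] += 1 over all i, in parallel.
// Assumption: every value in `indices` is < num_bins (not checked here).
std::vector<long long> parallel_histogram(const std::vector<std::size_t>& indices,
                                          std::size_t num_bins,
                                          std::size_t num_threads) {
    assert(num_threads > 0);

    // One private copy of the reduction array per thread, so threads
    // never write to the same element during the accumulation phase.
    std::vector<std::vector<long long>> priv(
        num_threads, std::vector<long long>(num_bins, 0));

    // Split the iteration space into contiguous chunks (ceiling division).
    const std::size_t chunk = (indices.size() + num_threads - 1) / num_threads;

    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            const std::size_t begin = std::min(t * chunk, indices.size());
            const std::size_t end = std::min(begin + chunk, indices.size());
            for (std::size_t i = begin; i < end; ++i) {
                priv[t][indices[i]] += 1;  // irregular, data-dependent access
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }

    // Merge phase: work grows with num_threads * num_bins, regardless of
    // how few elements each thread actually touched.
    std::vector<long long> result(num_bins, 0);
    for (const auto& copy : priv) {
        for (std::size_t b = 0; b < num_bins; ++b) {
            result[b] += copy[b];
        }
    }
    return result;
}
```

In this sketch the initialization and merge of the private copies cost work proportional to the number of threads times the array size, which is one plausible reason such approaches stop scaling when the array is large and sparsely updated.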

Two memory allocators that use hints to improve locality

Jula, Rauchwerger 2009

Dynamic memory allocators affect an application's performance through their data layout quality. They can use an application's allocation hints to improve the spatial locality of this layout. However, a practical approach needs to be automatic, without user intervention. In this paper we present two locality-improving allocators that use allocation hints provided automatically from the C++ STL library to improve an application's spatial locality. When compared to state-of-the-art allocators on seven real-world applications, our allocators run on average 7% faster than the Lea allocator, and 17% faster than FreeBSD's allocator, with the same memory fragmentation as the Lea allocator, one of the best allocators. While considering locality as an important goal, locality-improving allocators must not abandon the existing constraints of fast allocation speed and low fragmentation. These constraints further challenge their design and implementation. We experimentally show that within a memory allocator, allocation speed, memory fragmentation, and spatial locality compete with each other in a game of rock, paper, scissors: when one tries to improve one trait, the others suffer. We conclude that our allocators achieve a good balance of these traits, and they can easily be adjusted to optimize the most important trait for each application.
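
The standard C++ allocator interface already has a channel through which a hint can reach an allocator: `std::allocator_traits<A>::allocate(a, n, hint)` forwards `hint` to `a.allocate(n, hint)` when the allocator provides that overload. The sketch below shows only that channel. `HintAwareAllocator` is a hypothetical name, it merely records the hint instead of acting on it, and it does not reproduce the allocators described in the abstract or the way they obtain their hints.

```cpp
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical minimal allocator that accepts a locality hint.
// A locality-improving allocator would try to place the new block near
// `hint`; this sketch only remembers the hint it last received.
template <class T>
struct HintAwareAllocator {
    using value_type = T;

    const void* last_hint = nullptr;  // illustration only, not a real policy

    HintAwareAllocator() = default;
    template <class U>
    HintAwareAllocator(const HintAwareAllocator<U>&) noexcept {}

    // Plain allocation without a hint.
    T* allocate(std::size_t n) {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }

    // Hinted allocation: selected by std::allocator_traits::allocate
    // when the caller supplies a hint pointer.
    T* allocate(std::size_t n, const void* hint) {
        last_hint = hint;
        return allocate(n);
    }

    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

template <class T, class U>
bool operator==(const HintAwareAllocator<T>&, const HintAwareAllocator<U>&) {
    return true;
}

template <class T, class U>
bool operator!=(const HintAwareAllocator<T>&, const HintAwareAllocator<U>&) {
    return false;
}
```

A caller that wants a new block placed near an existing one would pass the existing block's address as the third argument of `std::allocator_traits<HintAwareAllocator<int>>::allocate`.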

Custom Memory Allocation for Free

Jula, Rauchwerger
