In this article, we present a novel approach for block-structured adaptive mesh refinement (AMR) that is suitable for extreme-scale parallelism. All data structures are designed such that the size of the meta data in each distributed processor memory remains bounded, independent of the number of processors. In all stages of the AMR process, we use only distributed algorithms. No central resources such as a master process or replicated data are employed, so that scalability is not limited by any global bottleneck. For the dynamic load balancing in particular, we propose to exploit the hierarchical nature of the block-structured domain partitioning by creating a lightweight, temporary copy of the core data structure. This copy acts as a local and fully distributed proxy data structure. It does not contain simulation data, but only provides topological information about the partitioning of the domain into blocks. Ultimately, this approach enables an inexpensive, local, diffusion-based dynamic load balancing scheme.

We demonstrate the excellent performance and the full scalability of our new AMR implementation on two architecturally different petascale supercomputers. Benchmarks on an IBM Blue Gene/Q system with a mesh containing 3.7 trillion unknowns distributed across 458,752 processes confirm the applicability to future extreme-scale parallel machines.

The algorithms proposed in this article operate on the blocks that result from the domain partitioning. This concept and its realization support the storage of arbitrary data. Consequently, the software framework can be used for different simulation methods, including mesh-based and meshless methods. In this article, we demonstrate fluid simulations based on the lattice Boltzmann method.

1.2. Related Work. Software frameworks for block-structured adaptive mesh refinement (SAMR) have been available for the last three decades. A recent survey [22] compares many SAMR codes in terms of their design, capabilities, and limitations.
All codes covered in this survey can run on large-scale parallel systems, are written in C/C++ or Fortran, and are publicly available. Moreover, almost all of these software packages can, among other approaches, make use of space-filling curves (SFCs) during load balancing. Some of the SAMR codes are focused on specific applications and methods, while others are more generic and provide the building blocks for a larger variety of computational models. The codes also differ in the extent to which their underlying data structures require the redundant replication and synchronization of meta data among all processes. Meta data that grows with the size of the simulation is often an issue on large-scale parallel systems, and eliminating this need for global meta data replication is a challenge that all SAMR codes face.

Both BoxLib [9] and Chombo [1], with Chombo being a fork of BoxLib that started in 1998, are general SAMR frameworks that are not tied to a specific application. Both, however, rely on a patc...