This report describes the design of the Impulse architecture and shows how an Impulse memory system can be used in a variety of ways to improve the performance of data-intensive applications. The Impulse design does not require any modification to processor, cache, or bus designs: all novel hardware functionality resides at the memory controller. As a result, Impulse optimizations can be adopted in conventional systems without major system changes. Impulse can be used to: (1) dynamically create superpages cheaply, (2) dynamically recolor physical pages, (3) perform strided fetches, (4) perform gathers and scatters through indirection vectors, and (5) dynamically gather cache lines from randomly dispersed data. Impulse improved the performance of six DARPA Data Intensive System (DIS) program Stressmarks by factors ranging from 1.25X to 16X (or 470X in the case of in-place CornerTurn, which was unusually well-suited for Impulse). Impulse sped up the twenty-two programs in the benchmark suite by a geometric mean of 2.4X on 2002-class hardware, and 3.3X on 2007-class hardware. In addition to its applicability to data-intensive applications, Impulse can also be used by the OS for dynamic superpage creation, which is useful for arbitrary applications.
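The strided-fetch and indirection-vector remappings named above can be illustrated with a small software model. This is only a sketch under stated assumptions: Impulse performs the remapping in memory-controller hardware, and the function names, the flat-list model of memory, and the 4-word cache line used here are hypothetical choices made for illustration, not part of the Impulse design. The point it shows is that a remapped ("shadow") cache line is dense, so every word delivered to the cache is one the application wants.

```python
from typing import List, Sequence

LINE_WORDS = 4  # assumed cache-line size in words (illustrative only)


def strided_line(memory: Sequence[int], base: int, stride: int,
                 line_index: int, line_words: int = LINE_WORDS) -> List[int]:
    """Model a strided remapping: build one dense line from words that
    sit `stride` words apart in physical memory, starting at `base`."""
    first = line_index * line_words
    return [memory[base + (first + i) * stride] for i in range(line_words)]


def gathered_line(memory: Sequence[int], base: int,
                  indirection: Sequence[int], line_index: int,
                  line_words: int = LINE_WORDS) -> List[int]:
    """Model a gather through an indirection vector: element k of the
    remapped structure is memory[base + indirection[k]]."""
    first = line_index * line_words
    return [memory[base + indirection[first + i]] for i in range(line_words)]


# An 8x8 row-major matrix stored in a flat list; element i holds 10*i.
memory = [10 * i for i in range(64)]

# Column 2 of the matrix: base offset 2, stride of one row (8 words).
column_first_half = strided_line(memory, base=2, stride=8, line_index=0)
column_second_half = strided_line(memory, base=2, stride=8, line_index=1)

# Gather four scattered elements through an indirection vector.
picked = gathered_line(memory, base=0, indirection=[5, 0, 7, 3], line_index=0)
```

In this model, reading the matrix column takes two dense lines. A conventional row-major fetch of the same column would touch eight separate lines and use one word from each.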
Executive Summary

The goal of the Impulse project was to develop a "smart" memory controller, and related software, capable of greatly improving the performance of data-intensive applications. To do so, Impulse adds an optional level of address indirection at the memory controller. Applications can use this level of indirection to remap their data structures in memory to control how their data is accessed and cached, thereby improving cache and bus utilization. Of particular importance is that the Impulse design does not require any modification to processor, cache, or bus designs: all novel hardware functionality resides at the memory controller. As a result, Impulse optimizations can be adopted in conventional systems without major system changes.

We describe the design of the Impulse architecture and show how an Impulse memory system can be used in a variety of ways to improve the performance of data-intensive applications. Impulse can be used to dynamically create superpages cheaply, to dynamically recolor physical pages, to perform strided fetches, to perform gathers and scatters through indirection vectors, and to dynamically gather cache lines from randomly dispersed data.

Our performance results demo...