Abstract. The high performance potential of an FPGA is not fully exploited if a design suffers a memory bottleneck. Therefore, a memory hierarchy is needed to reuse data in on-chip buffer memories and minimize the number of accesses to off-chip memory. Buffer memories not only hide the external memory latency, but can also be used to remap data and augment the on-chip bandwidth through parallel access of multiple buffers. This paper discusses the differences and similarities of memory hierarchies on processor-and on FPGA-based systems and presents a step-by-step methodology to construct a memory hierarchy on an FPGA.