Welcome to this special issue of the journal Concurrency and Computation: Practice and Experience on Multicore Cache Hierarchies: Design and Programmability Issues, which contains three original manuscripts. Caches have long played an essential role in the performance of single-core systems because of the gap between processor speed and main memory latency. First-level caches are strongly constrained by their access time, but current processors are able to hide most of their latency using out-of-order execution as well as miss-overlapping techniques. The last levels of the cache hierarchy, on the other hand, depend less on access time and more on locality, since the locality seen by the lower levels is filtered by the upper levels. As requests go down the memory hierarchy, they require a greater number of cycles to be satisfied, so it becomes more difficult to hide the latency of last-level caches. In multicore systems, the importance of these caches is even greater because of the growing number of cores that share the available memory bandwidth. In an attempt to use their caches more efficiently, the memory hierarchies of many chip multiprocessors include last-level caches that can be allocated across threads, so that some parts are private to a thread while other parts are shared by multiple threads. Caching techniques will therefore continue to evolve in the coming years in order to tackle the new challenges imposed by multicore platforms and workloads.
A clear indicator of the research community's current interest in new techniques for optimizing the performance and power consumption of multicore cache hierarchies is the organization, in recent years, of specific sessions devoted to these topics at top international conferences on computer architecture and parallel computing.
This special issue contributes to this promising field with extended and carefully reviewed versions of papers from two sources: first, selected papers from the International Workshop on Multicore Cache Hierarchies: Design and Programmability Issues, which was held as part of the 10th IEEE International Symposium on Parallel and Distributed Processing with Applications (ISPA 2012) in Madrid (Spain); and second, submissions from the wider community, as the Call for Papers was also open to contributions that had not been sent to the workshop.
The first contribution, by O. G. Lorenzo et al. [1], presents a set of three hardware counter (HC)-based tools to characterize the memory accesses of parallel codes in symmetric multiprocessors. This toolkit simplifies accessing and programming HCs, which are included in modern microprocessors. Hardware counters are used to obtain information about the memory accesses of a parallel code at very low cost, and this information is presented to the user in a friendly way. The first tool can be used to automatically monitor the memory accesses of a system and to analyze a code even if the source is not available. The second tool allows the user to insert in a source code, in a simple and transparent way, the instructions n...