The Adapteva Epiphany many-core architecture comprises a scalable 2D mesh Network-on-Chip (NoC) of low-power RISC cores with minimal uncore functionality. Whereas such a processor offers high computational energy efficiency and parallel scalability, developing effective programming models that address the unique architecture features has presented many challenges. We present here a distributed shared memory (DSM) model supported in software transparently using C++ templated metaprogramming techniques. The approach offers an extremely simple parallel programming model well suited for the architecture. Initial results are presented that demonstrate the approach and provide insight into the efficiency of the programming model and also the ability of the NoC to support a DSM without explicit control over data movement and localization.

The development of solutions for performance-portable code remains an open challenge of great interest in computer science as it is applied to high-performance computing. At issue is not the ability to achieve the maximum theoretical performance for every algorithm comprising a given software package, since this will always require heroic efforts and some degree of architecture-specific customization of software. At present, it is proving difficult to achieve even relatively good performance measured against the capabilities of a given parallel architecture. In some cases, non-portable code is required regardless of performance objectives. The Epiphany processor architecture has provided an example of the challenges faced in parallel programmability that must be addressed to support performance-portable code.

The Adapteva Epiphany RISC array architecture [1] is a scalable 2D array of low-power RISC cores with minimal uncore functionality supported by an on-chip 2D mesh network for fast inter-core communication.
The Epiphany-III architecture is scalable to 4,096 cores and represents an example of an architecture designed for power efficiency at extreme on-chip core counts. Processors based on this architecture exhibit good performance/power metrics [2] and scalability via a 2D mesh network [3][4], but require a suitable programming model to fully exploit the architecture. A 16-core Epiphany-III processor [5] has been integrated into the Parallella mini-computer platform [6], where the RISC array is supported by a dual-core ARM CPU and asymmetric shared-memory access to off-chip global memory. We have recently published results for threaded MPI [7], an OpenSHMEM programming model for Epiphany [8][9], a hybrid programming model [10], and other advances in runtime performance and interoperability [11].

RISC array processors, such as those based on the Epiphany architecture, may offer significant computational power efficiency in the near future as requirements for core counts increase, including in long-term plans for exascale platforms. The power efficiency of the Epiphany architecture has been specifically identified as both a guide and prospective architecture for such platforms [12]. The Epi...