Contemporary computer architectures utilize wide multi-core processors, accelerators such as GPUs, and clustering of individual computers into complex large-scale systems. These hardware trends are prevalent across computers of all sizes, from the largest supercomputers down to the smallest mobile phones. While these innovations provide high peak computing performance, software developers find it increasingly difficult to effectively target all the processing resources without expert knowledge in parallelization, heterogeneous computing, communication, synchronization, and so on. To ensure that software can keep up with the development of hardware architectures, advanced high-level programming environments and frameworks are needed to bridge the programmability gap. In addition, as the industry is trending towards increased vertical integration of software development stacks, vendor lock-in presents a risk of coupling software projects to proprietary technologies. Combined with problems of technical debt in large-scale software systems, it is clear that portability and open source are desirable properties of high-level parallel programming environments. One example of a programming framework fulfilling the above criteria is SkePU, a framework for high-level data-parallel pattern programming consisting of a compiler toolchain, programming interface, and run-time system.

The work presented in this thesis proposes a design of the pattern-centric skeleton programming model of the SkePU framework based on modern C++ with variadic template metaprogramming and state-of-the-art compiler technology. The design enables further flexibility, expressivity, and portability and gives rise to several new performance optimization techniques. The focus lies on a strong set of programming abstractions: providing new and extended patterns, improving the data access locality of existing ones, and using both static and dynamic knowledge about program flow.
The work combines novel programming interfaces and implementations with practical evaluation on synthetic and real-world applications. Several contributions are results of international collaborations in application-framework co-design: a single-source parallelization approach for skeleton programs on heterogeneous clusters, an extension mechanism for inserting platform-optimized code variants in high-level skeleton programs, and an integrated abstraction for portable parallel deterministic random number generation. The work places a strong emphasis on programmability aspects to make heterogeneous parallel computing accessible to non-experts, while also providing sufficient performance and interface familiarity for the high-performance computing community.

First and foremost, I want to thank my supervisor at Linköping University (LiU), Professor Christoph Kessler, for continued and tireless efforts, experienced supervision, and caring friendship. Thanks also to my secondary supervisor, José Daniel García Sánchez, professor at University Carlos III of Madrid and member of the ISO C+...