A well-designed high-performance computing (HPC) course not only presents theoretical parallelism concepts but also includes practical work on parallel systems. Today's machine models are diverse, and as a consequence multiple programming models exist. The challenge for HPC course lecturers is to decide what to include and what to exclude. We have experience in teaching HPC in a multi-paradigm style. The practical course parts include message-passing programming using MPI, directive-based shared-memory programming using OpenMP, partitioned global address space (PGAS) programming using Chapel, and domain-specific programming using a high-level framework. If these models were taught in isolation, students would have difficulty assessing the strengths and weaknesses of the approaches presented. We propose a project-based approach which introduces a specific problem to be solved (in our case a stencil computation) and asks students to solve it with each of the programming models introduced. Our course has been taught successfully several times, but a major problem has always been checking the individual student solutions, especially deciding which of the reported performance results can be trusted. To overcome these deficiencies, we have built a pedagogical tool that enhances trust in students' work. In this paper we present the infrastructure and tools that make student experiments easily reproducible by lecturers. We introduce a taxonomy for general benchmark experiments, describe the distributed architecture of our development and analysis environment, and, as a case study, discuss performance experiments when solving a stencil problem in multiple programming models.
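
To make the running example concrete, the following is a minimal sketch of the kind of stencil kernel the course project revolves around, here in C with OpenMP (one of the models taught). The grid size, sweep count, and boundary handling are illustrative assumptions, not the course's actual assignment code.

    /* Minimal 5-point Jacobi stencil in C with OpenMP.
     * Compile with, e.g., cc -O2 -fopenmp stencil.c
     * N, STEPS, and the boundary condition are assumed values. */
    #include <stdio.h>
    #include <stdlib.h>

    #define N      1024   /* interior grid points per dimension (assumed) */
    #define STEPS  100    /* number of sweeps (assumed) */
    #define IDX(i, j) ((i) * (N + 2) + (j))

    int main(void) {
        double *a = calloc((N + 2) * (N + 2), sizeof *a);
        double *b = calloc((N + 2) * (N + 2), sizeof *b);
        if (!a || !b) return EXIT_FAILURE;

        /* Fixed boundary: a constant heat source on the left edge. */
        for (int i = 0; i < N + 2; i++)
            a[IDX(i, 0)] = b[IDX(i, 0)] = 100.0;

        for (int t = 0; t < STEPS; t++) {
            /* Each sweep averages the four neighbours of every interior
             * point; rows are distributed across threads. */
            #pragma omp parallel for
            for (int i = 1; i <= N; i++)
                for (int j = 1; j <= N; j++)
                    b[IDX(i, j)] = 0.25 * (a[IDX(i - 1, j)] + a[IDX(i + 1, j)]
                                         + a[IDX(i, j - 1)] + a[IDX(i, j + 1)]);
            double *tmp = a; a = b; b = tmp;   /* swap read/write grids */
        }

        printf("center value after %d sweeps: %f\n", STEPS, a[IDX(N / 2, N / 2)]);
        free(a);
        free(b);
        return 0;
    }

The same kernel re-expressed in MPI, Chapel, and a domain-specific framework gives students directly comparable performance experiments, which is what makes the reported results worth checking for reproducibility.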