The growing gap between sustained and peak performance for full-scale, complex scientific applications on conventional supercomputers is a major concern in high performance computing (HPC). The problem is expected to worsen by the end of this decade, as mission-critical applications will have computational requirements at least two orders of magnitude larger than current levels. To continue increasing raw computational power while reaping substantial benefits from it, major strides are necessary in hardware architecture, software infrastructure, and application development. The first step toward this goal is the accurate assessment of existing and emerging HPC systems across a comprehensive set of scientific algorithms. In addition, high-fidelity performance modeling is required to understand and predict the complex interactions among hardware, software, and applications, and thereby to influence future design trade-offs. This survey article discusses recent performance evaluations of state-of-the-art ultra-scale systems for a diverse set of scientific applications, as well as scalable compact synthetic benchmarks and architectural probes. Finally, performance models and program characterizations from key scientific areas are described.