This work addresses the limited attention given to verifying and comparing existing fatigue damage accumulation and life prediction models against large, well-documented experimental datasets. A total of 64 constant-amplitude, 54 two-level block loading, and 27 three-level block loading valid experiments were performed to generate an open-access, high-quality dataset that can serve as a benchmark for existing models. In the future, further experiments covering various specimen geometries and loading conditions will be added. The dataset was used in a study comparing five (non)linear fatigue damage and life prediction models. The results show that the performance of several (non)linear damage models depends strongly on the material dataset and loading sequence considered. It is therefore important to verify models against a broad set of independent datasets, as many existing models show significant bias toward certain datasets.