Selecting a limited number of strong ground motion records (SGMRs) is an important challenge in the seismic collapse capacity assessment of structures. The collapse capacity is taken as the value of the ground motion intensity measure at which drift-related dynamic instability occurs in the structural system. The goal of this paper is to select, from a general set of SGMRs, a small number of subsets, each of which can be used to reliably predict the mean collapse capacity of a particular group of structures, i.e. single-degree-of-freedom systems with a typical range of behaviour. To achieve this goal, multivariate statistical analysis is first applied to determine the degree of similarity between each selected small subset and the general set of SGMRs. Principal component analysis is applied to identify the best way to group the structures, so that each proposed subset contains a minimum number of SGMRs. The structures were classified into six groups, and a subset of eight SGMRs is proposed for each group. The methodology was validated by analysing a first-mode-dominated three-storey reinforced concrete structure using both the proposed subsets and the general set of SGMRs. The results of this analysis show that the proposed subsets predict the mean seismic collapse capacity with less dispersion than the recently developed improved approach, which is based on scaling the response spectra of the records to match the conditional mean spectrum.
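
The two statistical ingredients named above can be illustrated with a minimal sketch. It is not the authors' procedure: the collapse capacities are synthetic random numbers, and the function names `pca_scores` and `subset_mean_error`, the matrix layout (records in rows, structures in columns), the use of log capacities, and the arbitrary choice of the first eight records as a candidate subset are all assumptions made for illustration only.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """PCA via SVD: scores of the rows of X and explained variance ratios.

    Hypothetical helper, not taken from the paper.
    """
    Xc = X - X.mean(axis=0)                      # centre each column
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    var = s**2 / (X.shape[0] - 1)                # variance along each component
    ratio = var / var.sum()
    scores = U[:, :n_components] * s[:n_components]
    return scores, ratio[:n_components]


def subset_mean_error(capacity, subset_idx):
    """Relative error of the subset mean capacity against the full-set mean,
    evaluated separately for each structure (column).

    Hypothetical helper, not taken from the paper.
    """
    full_mean = capacity.mean(axis=0)
    sub_mean = capacity[subset_idx].mean(axis=0)
    return np.abs(sub_mean - full_mean) / full_mean


# Synthetic data (assumption): 40 records x 12 SDOF structures,
# lognormally distributed collapse capacities.
rng = np.random.default_rng(0)
n_records, n_structures = 40, 12
capacity = rng.lognormal(mean=0.0, sigma=0.4, size=(n_records, n_structures))

# Describe each structure by its log capacities under all records, then
# project the structures onto the first two principal components.
# Structures that lie close together in this plane are candidates for one group.
scores, ratio = pca_scores(np.log(capacity).T, n_components=2)

# Check how well an arbitrary eight-record subset reproduces the mean
# collapse capacity of every structure.
subset_idx = np.arange(8)
err = subset_mean_error(capacity, subset_idx)
```

In a real application the capacity matrix would come from incremental dynamic analyses of the structures under the general set of SGMRs, and the candidate subsets would be searched rather than fixed in advance.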