Aggregating the preferences of a group of experts is a recurring problem in several fields, including engineering design. In a nutshell, each expert formulates an ordinal ranking of a set of alternatives, and the resulting rankings must be aggregated into a single collective ranking. Many aggregation models have been proposed in the literature, each with its own strengths and weaknesses, in line with the implications of Arrow's impossibility theorem. Furthermore, the coherence of the collective ranking with respect to the expert rankings may vary depending on (i) the expert rankings themselves and (ii) the aggregation model adopted. This paper assesses this coherence for a variety of aggregation models, using a recent test based on Kendall's coefficient of concordance (W), and studies the characteristics of the models that are most likely to achieve higher coherence. Interestingly, the so-called Borda count model often provides the best coherence, with some exceptions when the collective ranking contains ties. The description is supported by practical examples.
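As a minimal illustration of the two ingredients named above, the following sketch aggregates a few expert rankings with a Borda count and computes Kendall's W. It assumes complete rankings without ties, with rank 1 denoting the most preferred alternative. The function names, the data, and the final comparison of W with and without the collective ranking are illustrative assumptions, not necessarily the test used in this paper.

```python
def borda_scores(rankings):
    """Sum Borda points per alternative: rank r among n alternatives earns n - r."""
    n = len(rankings[0])
    scores = {alt: 0 for alt in rankings[0]}
    for ranking in rankings:
        for alt, rank in ranking.items():
            scores[alt] += n - rank
    return scores


def collective_ranking(scores):
    """Order alternatives by decreasing score.

    Simplification (assumption): equal scores are NOT treated as ties;
    they are separated by their original order.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {alt: pos for pos, alt in enumerate(ordered, start=1)}


def kendall_w(rankings):
    """Kendall's W for m complete rankings of n alternatives, no ties."""
    m = len(rankings)
    n = len(rankings[0])
    rank_sums = [sum(r[alt] for r in rankings) for alt in rankings[0]]
    mean_rank_sum = m * (n + 1) / 2
    s = sum((x - mean_rank_sum) ** 2 for x in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical data: three experts ranking four alternatives (1 = best).
experts = [
    {"A": 1, "B": 2, "C": 3, "D": 4},
    {"A": 2, "B": 1, "C": 3, "D": 4},
    {"A": 1, "B": 3, "C": 2, "D": 4},
]

scores = borda_scores(experts)
collective = collective_ranking(scores)

w_experts = kendall_w(experts)
w_with_collective = kendall_w(experts + [collective])
print(collective, round(w_experts, 3), round(w_with_collective, 3))
```

In this toy case, appending the Borda ranking to the expert rankings raises W, which is one informal way to read the collective ranking as coherent with the experts. Handling ties in the collective ranking would require a tie-corrected form of W, which this sketch omits.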