We argue in this paper that benchmarking should be complemented by direct measurement of parallelisation overheads when evaluating parallel state-space exploration algorithms. This poses several challenges that have so far not been addressed in the literature: what exactly are those overheads, how can they be measured (and how can they not), and how should system models be selected in order to expose the causes of parallelisation (in)efficiencies? We discuss and answer these questions based on our experience with parallelising Saturation, a symbolic algorithm for generating the state spaces of asynchronous system models, on a shared-memory architecture. Doing so will hopefully spare newcomers to the growing PDMC community from having to learn these lessons the hard way, as we did over a painful period of almost three years.