Holter systems record the electrocardiogram (ECG), which is used to identify beat families according to their origin and severity. Many systems have been proposed that combine signal conditioning with machine learning (ML) classification algorithms for beat family recognition. However, the design stage of these systems does not always consider the impact that tuning the intermediate blocks has on beat family classification and overall accuracy. We propose a new index, the differential beat accuracy (DBA), which uses confusion matrices and bootstrap resampling to summarize the global performance across all beat families; it is obtained as the total number of beats correctly classified in each class minus the total number of beats incorrectly classified. We assessed the sensitivity of the different subblocks when building a simple beat family classifier consisting of signal preprocessing blocks and a k-Nearest Neighbors classifier. The MIT-BIH Arrhythmia database was used for this purpose, following the existing literature in the field. We benchmarked two implementations, one for biclass classification (supraventricular vs. non-supraventricular origin) and another for multiclass beat labeling. The usual preprocessing stages, such as signal detrending and filtering, beat balancing, and inter-beat distance, were scrutinized with the DBA to evaluate their impact on the quality of the complete ML system. With the support of the DBA, our methodology detected significant differences among some of the algorithm design options. For instance, balancing the number of beats in each class for training significantly improved the classification accuracy of the minority classes by 3.22% for the multiclass dataset but not for the biclass dataset.
Also, accuracy improved significantly by about 6% for the biclass regrouping without data normalization, whereas overall accuracy improved significantly by about 7% for the multiclass regrouping with data normalization. In addition, the analysis of the statistical dispersion of the confusion matrices showed that this database should be used with caution when training ML-based beat family classifiers. We conclude that the proposed DBA provides statistically principled criteria for designing ML-based classifiers and for reducing their bias in strongly unbalanced beat family datasets.
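
As an illustration only, the DBA described above can be sketched in a few lines of Python with NumPy. The sketch follows one literal reading of the one-sentence definition (correctly classified beats minus incorrectly classified beats, taken from the confusion matrix) and estimates its dispersion by resampling beats with replacement. The function names, the integer label encoding, the toy labels, and the percentile interval are assumptions made for this example; they are not taken from the paper, whose exact formulation and normalization may differ.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes (integer labels)."""
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)  # accumulate counts, repeated pairs included
    return cm

def dba(cm):
    """Assumed reading of the DBA: correct beats minus incorrect beats."""
    correct = np.trace(cm)           # diagonal: correctly classified beats
    incorrect = cm.sum() - correct   # off-diagonal: misclassified beats
    return int(correct - incorrect)

def bootstrap_dba(y_true, y_pred, n_classes, n_boot=1000, seed=0):
    """DBA values over bootstrap resamples of the beats (with replacement)."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)
    n = len(y_true)
    samples = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample beat indices
        samples[b] = dba(confusion_matrix(y_true[idx], y_pred[idx], n_classes))
    return samples

# Hypothetical toy labels for three beat classes (not MIT-BIH data).
y_true = np.array([0, 0, 0, 1, 1, 2])
y_pred = np.array([0, 0, 1, 1, 1, 0])

cm = confusion_matrix(y_true, y_pred, n_classes=3)
point = dba(cm)  # 4 correct - 2 incorrect = 2
samples = bootstrap_dba(y_true, y_pred, n_classes=3, n_boot=200)
low, high = np.percentile(samples, [2.5, 97.5])  # assumed 95% percentile interval
print(point, low, high)
```

Under this reading, two design options (for example, with and without class balancing) could be compared by computing the bootstrap DBA of each configuration on the same test beats and checking whether the interval of their difference excludes zero; that decision rule is likewise an assumption of this sketch.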