“…The commonly adopted approach requires each target compressed system of the desired size to be constructed individually, for example, in [14,15,17] for Conformer models, and similarly for SSL foundation models such as DistilHuBERT [23], FitHuBERT [24], DPHuBERT [31], PARP [20], and LightHuBERT [30] (no more than 3 systems of varying complexity were built). 2) limited scope of system complexity attributes, covering only a small subset of architecture hyper-parameters based on either network depth or width alone [8,9,11,35,36], or both [10,13,14,37], while leaving out low-bit quantization, or vice versa [15][16][17][18][19][32][33][34]. This is particularly the case with recent HuBERT model distillation research [23][24][25][28][29][30][31], which focuses on architectural compression alone.…”