“…Our study covers seven widely used machine learning models for learning software performance, i.e., Decision Tree (DT) [46] (used by [4,8,25,41]), k-Nearest Neighbours (kNN) [21] (used by [35]), Kernel Ridge Regression (KRR) [52] (used by [35]), Linear Regression (LR) [23] (used by [4,8,49]), Neural Network (NN) [53] (used by [20,26]), Random Forest (RF) [30] (used by [45,50]), and Support Vector Regression (SVR) [17] (used by [4,50]), together with five popular real-world software systems from prior work [15,16,41,44], covering a wide spectrum of characteristics and domains. Naturally, the first research question (RQ) we ask is: RQ1: Is it practical to examine all encoding methods for finding the best one under every system?…”
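
To make the scale behind RQ1 concrete, the following is a minimal sketch of how an exhaustive examination could be enumerated. The seven model abbreviations come from the excerpt. The system names, the encoding names, the number of encodings, and the `repeats` parameter are hypothetical placeholders, since the excerpt does not state them.

```python
from itertools import product

# The seven model families named in the excerpt.
MODELS = ["DT", "kNN", "KRR", "LR", "NN", "RF", "SVR"]

# Assumption: the excerpt mentions five systems but does not name them,
# so generic placeholders are used here.
SYSTEMS = [f"system_{i}" for i in range(1, 6)]

# Assumption: the excerpt does not list the encoding methods; these
# three names are purely illustrative.
ENCODINGS = ["encoding_A", "encoding_B", "encoding_C"]


def exhaustive_grid(models, systems, encodings):
    """Return every (system, model, encoding) combination to be trained."""
    return list(product(systems, models, encodings))


def total_training_runs(models, systems, encodings, repeats=1):
    """Count training runs when each combination is repeated `repeats` times."""
    return len(exhaustive_grid(models, systems, encodings)) * repeats


if __name__ == "__main__":
    grid = exhaustive_grid(MODELS, SYSTEMS, ENCODINGS)
    print("combinations:", len(grid))
    print("runs with 10 repeats:", total_training_runs(MODELS, SYSTEMS, ENCODINGS, repeats=10))
```

The count grows multiplicatively with each added model, system, encoding, or repeat, which is why the practicality of examining every encoding under every system is a question worth asking before committing to it.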