Multi-environment trials (METs) of potato breeding clones and cultivars allow their performance to be determined precisely across testing sites and years. However, these METs may be affected by genotype × environment interaction (GEI), as noted for tuber yield. Furthermore, trials are replicated several times to optimize the predictive value of the data collected because knowledge of the spatial and temporal variability of testing environments is often lacking. Hence, the objectives of this research were to use variance components from METs to estimate broad-sense heritability (H2) based on best linear unbiased predictors, and to use these estimates to determine the optimum number of sites, years, and replications for testing potato breeding clones along with cultivars. The data were taken from METs in southern and northern Sweden comprising up to 256 breeding clones and cultivars that were tested using a simple lattice design with 10-plant plots across three sites over 2 years. Percentage starch in the tuber flesh had the largest H2, both in each testing environment (0.850–0.976) and across testing environments (0.905–0.921). Total tuber weight per plot also exhibited high H2 in each testing environment (0.720–0.919) and across them (0.726–0.852), despite a significant GEI. Reducing sugar content in the tuber flesh had the lowest, but still medium, H2 (0.426–0.883 in each testing environment; 0.718–0.818 across testing environments). The H2 estimates were smaller when their variance components were disaggregated by year and site instead of being lumped as environments. Simulating H2 with genetic, site, year, site × year, genetic × site, genetic × year, genetic × site × year, and residual variance components established that two replicates at each of two sites in 2-year trials will suffice for testing tuber yield, starch, and reducing sugars.
This article provides a methodology to optimize the number of testing sites and years for METs of potato breeding materials, as well as tabulated information for choosing the appropriate number of trials in the same target population of environments.
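The simulation of H2 over different numbers of sites, years, and replicates can be illustrated with a minimal sketch. It assumes the standard entry-mean formula for broad-sense heritability, in which the site, year, and site × year main effects are common to all genotypes and therefore do not enter the phenotypic variance of a genotype mean; the article's exact estimator may differ. The function names and the variance component values below are hypothetical and are not taken from the trials.

```python
from itertools import product


def entry_mean_h2(var_g, var_gs, var_gy, var_gsy, var_e,
                  n_sites, n_years, n_reps):
    """Broad-sense heritability on an entry-mean basis (assumed formula).

    H2 = var_g / (var_g + var_gs/s + var_gy/y + var_gsy/(s*y) + var_e/(s*y*r))
    """
    if min(n_sites, n_years, n_reps) < 1:
        raise ValueError("sites, years and replicates must all be >= 1")
    phenotypic = (
        var_g
        + var_gs / n_sites
        + var_gy / n_years
        + var_gsy / (n_sites * n_years)
        + var_e / (n_sites * n_years * n_reps)
    )
    return var_g / phenotypic


def h2_grid(components, sites, years, reps):
    """Return {(sites, years, reps): H2} for every combination requested."""
    return {
        (s, y, r): entry_mean_h2(**components,
                                 n_sites=s, n_years=y, n_reps=r)
        for s, y, r in product(sites, years, reps)
    }


# Hypothetical variance components, for illustration only.
example_components = {
    "var_g": 4.0,    # genetic
    "var_gs": 1.0,   # genetic x site
    "var_gy": 1.0,   # genetic x year
    "var_gsy": 2.0,  # genetic x site x year
    "var_e": 8.0,    # residual
}

if __name__ == "__main__":
    grid = h2_grid(example_components,
                   sites=(1, 2, 3), years=(1, 2), reps=(1, 2, 3))
    for (s, y, r), h2 in sorted(grid.items()):
        print(f"sites={s} years={y} reps={r} H2={h2:.3f}")
```

Tabulating H2 over such a grid shows where additional sites, years, or replicates stop giving a worthwhile gain, which is the kind of comparison behind the recommendation of two replicates at two sites over 2 years.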