Experimental evolution (EE) is a powerful research framework for gaining insights into many biological questions, including the evolution of reproductive systems. We designed a long-term and highly replicated EE project using the nematode C. elegans, with the main aim of investigating the impact of the reproductive system on adaptation and diversification under environmental challenge. From the laboratory-adapted strain N2, we derived isogenic lines and introgressed the fog-2(q71) mutation, which changes the reproductive system from nearly exclusive selfing to obligatory outcrossing, independently into 3 of them. In this way, we obtained 3 pairs of isogenic ancestral populations differing in reproductive system; from these, we derived replicate EE populations and let them evolve in either novel (increased temperature) or control conditions for over 100 generations. Subsequently, the fitness of both EE and ancestral populations was assayed under the increased temperature conditions. Importantly, each population was assayed in 2–4 independent blocks, allowing us to gain insight into the reproducibility of fitness scores. We expected to find upward fitness divergence, compared to ancestors, in populations that had evolved at increased temperature, particularly in the outcrossing ones due to the benefits of genetic shuffling. However, our data did not support these predictions. The first major finding was a very strong effect of replicate block on populations’ fitness scores. This indicates that, despite standardization procedures, some important environmental effects varied among blocks and were possibly compounded by epigenetic inheritance. Our second key finding was that patterns of EE populations’ divergence from ancestors differed among the ancestral isolines, suggesting that research conclusions derived for any particular genetic background should never be generalized without sampling a wider set of backgrounds.
Overall, our results support the calls to pay more attention to biological variability when designing studies and interpreting their results, and to avoid over-generalizations of outcomes obtained for specific genetic and/or environmental conditions.