Background: Assessing the stability of a drug or its metabolites in biological matrices is an essential part of bioanalytical method validation, but the sample size (number of replicates) used for it is insufficiently justified. International guidelines differ in the sample size they recommend for stability studies, ranging from no recommendation to at least three quality control samples. Testing only three samples may yield results biased by a single outlier. We aimed to determine the optimal sample size for stability testing based on 90% confidence intervals. Methods: We conducted experimental, retrospective (264 confidence intervals for the stability of nine drugs during regulatory bioanalytical method validation), and theoretical (mathematical) studies. We generated experimental stability data (40 confidence intervals) for two analytes, tramadol and its major metabolite O-desmethyl-tramadol, at two concentrations, under two storage conditions, and with five sample sizes (n = 3, 4, 5, 6, or 8). Results: The 90% confidence intervals were wider for low than for high concentrations in 18 out of 20 cases. For n = 5, every stability test passed, and the width of the confidence intervals was below 20%. The results of the retrospective study and the theoretical analysis supported the experimental observation that five or six replicates ensure that confidence intervals fall within the 85–115% acceptance criteria. Conclusions: Five replicates are optimal for the assessment of analyte stability. We hope to initiate discussion and stimulate further research on the sample size for stability testing.
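As an illustration of the kind of check described above, the following minimal sketch computes a two-sided 90% confidence interval for the mean stability and compares it with the 85–115% acceptance range. It rests on assumptions that the abstract does not state: each replicate is expressed as a percentage of the reference concentration, and the interval is a plain Student's t interval on the mean. The function names (`stability_ci90`, `passes_acceptance`) and the replicate values are hypothetical and are not taken from the study.

```python
import math
import statistics

# Upper 95% quantiles of Student's t distribution (giving a two-sided 90%
# interval), keyed by degrees of freedom. Only the sample sizes named in the
# abstract (n = 3, 4, 5, 6, 8) are tabulated here.
T_CRIT_90 = {2: 2.920, 3: 2.353, 4: 2.132, 5: 2.015, 7: 1.895}


def stability_ci90(percent_values):
    """Return (lower, upper) of a two-sided 90% t interval for the mean.

    Assumption: each value is a replicate's stability in percent of the
    reference concentration.
    """
    n = len(percent_values)
    df = n - 1
    if df not in T_CRIT_90:
        raise ValueError("sample size not covered by the tabulated t values")
    mean = statistics.mean(percent_values)
    sd = statistics.stdev(percent_values)  # sample standard deviation
    half_width = T_CRIT_90[df] * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


def passes_acceptance(percent_values, low=85.0, high=115.0):
    """True if the whole 90% interval lies inside the acceptance range."""
    lower, upper = stability_ci90(percent_values)
    return low <= lower and upper <= high


# Hypothetical replicates (percent of reference), including one low outlier.
three_replicates = [98.0, 101.0, 84.0]
five_replicates = [98.0, 101.0, 84.0, 99.0, 102.0]

for label, data in (("n = 3", three_replicates), ("n = 5", five_replicates)):
    lower, upper = stability_ci90(data)
    print(f"{label}: 90% CI = {lower:.1f}-{upper:.1f}%, "
          f"passes = {passes_acceptance(data)}")
```

Under these assumptions, the half-width multiplier t/√n falls from about 1.69 at n = 3 to about 0.95 at n = 5, which is why a single outlier widens the interval far more in a three-sample test than in a five-sample test.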