This work contributes three concepts to quantisation audio watermarking in the wavelet domain: an improved model, a two-stage Lagrange principle, and minimum-energy scaling optimisation. First, the model connects discrete wavelet transform (DWT) multi-coefficient quantisation, which applies arbitrary scaling to the lowest DWT coefficients, with the group-based signal-to-noise ratio (SNR) of these coefficients. Then, the two-stage Lagrange principle and the minimum-energy approach are used together to obtain the optimal scaling factors. With the proposed scheme, the best fidelity and robustness of the embedded audio can be attained. A perceptual evaluation of audio quality (PEAQ) test is also performed, together with an illustration of the relationship between SNR and PEAQ. Simulation results show that audio watermarked by the proposed method attains a high SNR, a good PEAQ score, and a low bit error rate (BER). The SNR of most watermarked audio signals is above 35, or even above 40, and the corresponding subjective difference grade of PEAQ is close to 0. In the BER comparison, most BER values are 2% or less, indicating better robustness against many attacks, such as re-sampling, amplitude scaling, and MP3 compression.
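For reference, the two numerical metrics quoted above can be computed as in the following minimal sketch. It is not the proposed scheme: it assumes the conventional definitions, with SNR expressed in decibels (the abstract does not state the unit) and BER taken as the fraction of mismatched watermark bits. The function names `snr_db` and `bit_error_rate` are illustrative only.

```python
import math


def snr_db(original, watermarked):
    """SNR in dB (assumed unit): 10*log10(sum(x^2) / sum((x - y)^2))."""
    if len(original) != len(watermarked):
        raise ValueError("signals must have the same length")
    signal_energy = sum(x * x for x in original)
    noise_energy = sum((x - y) ** 2 for x, y in zip(original, watermarked))
    if noise_energy == 0:
        return math.inf  # identical signals: no embedding distortion
    return 10.0 * math.log10(signal_energy / noise_energy)


def bit_error_rate(embedded_bits, extracted_bits):
    """Fraction of watermark bits that differ after extraction."""
    if len(embedded_bits) != len(extracted_bits) or not embedded_bits:
        raise ValueError("bit sequences must be non-empty and equal in length")
    errors = sum(a != b for a, b in zip(embedded_bits, extracted_bits))
    return errors / len(embedded_bits)
```

Under these assumed definitions, a higher SNR means less embedding distortion, and a BER of 0.02 corresponds to the 2% figure quoted above.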