Denoising is a perennial topic, and various denoisers have been proposed for fault diagnosis of industrial systems. However, it remains difficult to evaluate their performance quantitatively in terms of mean square error (MSE), and hence to achieve their maximum gains, because the MSE cannot be computed in engineering practice without the real feature signals. Therefore, leveraging Stein's unbiased risk estimator (SURE) theory, a bi-level nested sparse optimization framework (BiNSOF) is proposed to jointly optimize a parameterized sparse denoiser and its regularization parameter, thereby obtaining near-optimal fault features with minimum MSE. The inner level of BiNSOF uses an ℓ1-regularized sparse denoiser to describe the intrinsic sparse structure of the feature information, and this problem can be solved effectively by popular primal-dual splitting schemes. The core of the outer level is a SURE-based unbiased estimator of the MSE; the minimum-MSE search is transformed into a quadratic optimization problem that can be solved quickly by the classic golden section search. The proposed BiNSOF can closely approximate the oracle MSE without any real feature information, and it provides a reliable way to obtain the optimal hyper-parameter set for the maximum performance gain of the sparse denoiser. The computational complexity of the proposed approach is also investigated. Moreover, its feasibility and performance are evaluated through a set of comprehensive numerical studies. Lastly, two bearing fault detection cases confirm the applicability and superiority of the proposed framework.

INDEX TERMS Sparse optimization, Stein unbiased risk estimator (SURE), fault feature detection, primal-dual splitting, bi-level nested optimization, adaptive parameter selection.
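To make the bi-level idea concrete, the following minimal Python sketch tunes the regularization parameter of an ℓ1 denoiser by minimizing a SURE estimate of the MSE with a golden section search. It is not the BiNSOF implementation: it assumes an identity dictionary, so the inner ℓ1 problem reduces to closed-form soft thresholding instead of a primal-dual splitting iteration, and it assumes white Gaussian noise with known standard deviation `sigma`. All function names (`soft_threshold`, `sure_soft`, `golden_section`, `demo`) and the synthetic test signal are hypothetical and chosen for illustration only.

```python
import math
import random


def soft_threshold(y, lam):
    """Inner level (assumed identity dictionary): closed-form l1 denoiser."""
    return [math.copysign(max(abs(v) - lam, 0.0), v) for v in y]


def sure_soft(y, lam, sigma):
    """SURE estimate of the per-sample MSE of soft thresholding.

    Uses only the noisy data y and the (assumed known) noise level sigma.
    """
    n = len(y)
    # ||x_hat - y||^2: each residual is y_i clipped to [-lam, lam].
    residual = sum(min(abs(v), lam) ** 2 for v in y)
    # Divergence of the denoiser: number of samples that survive the threshold.
    divergence = sum(1 for v in y if abs(v) > lam)
    return (residual + 2.0 * sigma ** 2 * divergence) / n - sigma ** 2


def golden_section(f, lo, hi, tol=1e-4):
    """Outer level: golden section search on [lo, hi].

    Returns a local minimiser when f is not strictly unimodal.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            # Minimum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # Minimum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)


def demo(seed=0, n=2000, sigma=1.0):
    """Synthetic sparse signal (illustrative only): 5% spikes of amplitude 5."""
    rng = random.Random(seed)
    x = [5.0 if rng.random() < 0.05 else 0.0 for _ in range(n)]
    y = [v + rng.gauss(0.0, sigma) for v in x]
    lam = golden_section(lambda t: sure_soft(y, t, sigma),
                         0.0, max(abs(v) for v in y))
    x_hat = soft_threshold(y, lam)
    # The oracle MSE needs the clean signal x; SURE does not.
    true_mse = sum((a - b) ** 2 for a, b in zip(x_hat, x)) / n
    return lam, sure_soft(y, lam, sigma), true_mse


if __name__ == "__main__":
    lam, sure_mse, true_mse = demo()
    print(f"lambda={lam:.3f}  SURE={sure_mse:.3f}  oracle MSE={true_mse:.3f}")
```

In this simplified setting the SURE value is computed from the noisy observations alone, while the oracle MSE is evaluated only to check the estimate. With a general dictionary the divergence term has no such simple closed form, and the inner problem would need an iterative solver.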