Cooling by conditional measurement shows a clear advantage over its unconditional counterpart in the average-population-reduction rate. This advantage, however, is offset by a success probability of only a few percent for finding the detector system in the measured state. In this work, we propose an optimized architecture for cooling a target resonator, initialized in a thermal state, that interpolates between the conditional and unconditional measurement strategies. In analogy with the conditional measurement, an optimal measurement interval $\tau_{\mathrm{opt}}^{u}$ for the unconditional (nonselective) measurement is found analytically for the first time. It is inversely proportional to the collective dominant Rabi frequency $\Omega_d$, which is a function of the resonator's population at the end of the previous round. A cooling algorithm globally optimized by reinforcement learning maximizes the cooperative cooling performance, an indicator function that quantifies the overall cooling efficiency of an arbitrary cooling-by-measurement architecture. In particular, the average population of the target resonator can be reduced by over four orders of magnitude in only 16 rounds of measurements, with a success probability of about 30%.