Most methods used today for handling local stress constraints in topology optimization fail to directly address the non-self-adjointness of the stress-constrained topology optimization problem. This can drastically raise the computational cost of an already large-scale problem. These problems involve both the equilibrium equations resulting from finite element analysis (FEA) in each iteration and the adjoint equations from the sensitivity analysis of the stress constraints. In this work, we present a paradigm for large-scale stress-constrained topology optimization problems in which we build a multi-grid approach using an on-the-fly Reduced Order Model (ROM) and the p-norm aggregation function, with the discrete reduced-order basis functions (modes) adaptively constructed for the adjoint problems. In addition to the computational savings due to the ROM, we also address the computational cost of the ROM learning and updating phases. Both reduced-order bases are enriched according to the residual threshold of the corresponding linear systems, and the grid resolution is adaptively selected based on the relative error in approximating the objective function and constraint values during the iteration. Tests on 2D and 3D benchmark problems demonstrate improved performance with acceptable errors in the objective and in the constraint violation. Finally, we thoroughly investigate the influence of relevant stress constraint parameters such as the p-norm factor, the stress penalty factor, and the allowable stress value.
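As a brief illustration of the p-norm aggregation mentioned above, the following minimal sketch collapses a set of local stress measures into one global value and forms a single normalized constraint from it. It assumes the common unnormalized form (sum of sigma_i^p)^(1/p) with non-negative stress measures (e.g. von Mises); the exact form used in this work may differ. The function names `p_norm_aggregate` and `stress_constraint` and the default `p = 8` are hypothetical choices made for this example only.

```python
def p_norm_aggregate(stresses, p=8.0):
    """Aggregate non-negative local stress measures: (sum s_i^p)^(1/p).

    Assumed (unnormalized) form, for illustration only.
    """
    if not stresses:
        raise ValueError("stresses must be non-empty")
    if any(s < 0 for s in stresses):
        raise ValueError("stress measures must be non-negative")
    s_max = max(stresses)
    if s_max == 0.0:
        return 0.0
    # Factor out the maximum so that large p does not overflow.
    total = sum((s / s_max) ** p for s in stresses)
    return s_max * total ** (1.0 / p)


def stress_constraint(stresses, sigma_allow, p=8.0):
    """Single global constraint g <= 0, with g = aggregate / sigma_allow - 1."""
    return p_norm_aggregate(stresses, p) / sigma_allow - 1.0


if __name__ == "__main__":
    local = [120.0, 180.0, 240.0]  # hypothetical element stress values
    for p in (4.0, 8.0, 32.0):
        print(p, p_norm_aggregate(local, p), stress_constraint(local, 250.0, p))
```

In this form the aggregate never falls below the true maximum stress and approaches it as p grows, which is one reason the p-norm factor is a parameter worth studying: larger values track the peak stress more closely but make the aggregated function less smooth.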