In the past couple of years, there is a proliferation in the use of machine learning approaches to represent subgrid scale processes in geophysical flows with an aim to improve the forecasting capability and to accelerate numerical simulations of these flows. Despite its success for different types of flow, the online deployment of a data-driven closure model can cause instabilities and biases in modeling the overall effect of subgrid scale processes, which in turn leads to inaccurate prediction. To tackle this issue, we exploit the data assimilation technique to correct the physics-based model coupled with the neural network as a surrogate for unresolved flow dynamics in multiscale systems. In particular, we use a set of neural network architectures to learn the correlation between resolved flow variables and the parameterizations of unresolved flow dynamics and formulate a data assimilation approach to correct the hybrid model during their online deployment. We illustrate our framework in a application of the multiscale Lorenz 96 system for which the parameterization model for unresolved scales is exactly known. Our analysis, therefore, comprises a predictive dynamical core empowered by (i) a data-driven closure model for subgrid scale processes, (ii) a data assimilation approach for forecast error correction, and (iii) both data-driven closure and data assimilation procedures. We show significant improvement in the long-term perdition of the underlying chaotic dynamics with our framework compared to using only neural network parameterizations for future prediction. Moreover, we demonstrate that these data-driven parameterization models can handle the non-Gaussian statistics of subgrid scale processes, and effectively improve the accuracy of outer data assimilation workflow loops in a modular non-intrusive way.