Most cyber-physical systems (CPS) encounter large volumes of dynamic data that arrive gradually in real time rather than being available all at once in advance. Consequently, neither traditional supervised (or unsupervised) learning nor typical model-based control approaches can provide optimal solutions with performance guarantees. In this article, we provide a theoretical framework at the intersection of control theory and learning that yields optimal control strategies. In the proposed framework, we use the actual CPS, i.e., the "true" CPS that we seek to control optimally online, in parallel with an available model of the CPS. We introduce an information state, defined as the conditional joint probability distribution of the states of the model and the actual CPS given all data available up to each instant of time. Using this information state along with the CPS model, we derive separated control strategies offline. Since the strategies are derived offline, the state of the actual CPS is not known; the model cannot capture the dynamics of the actual CPS due to the complexity of the system, and thus the optimal strategy of the model is parameterized with respect to the state of the actual CPS. However, the control strategy and the process of estimating the information state are separated, so we can learn the information state online while operating both the model and the actual CPS. We show that once the information state becomes known online through learning, the separated control strategy of the model derived offline is optimal for the actual CPS. We illustrate the proposed framework in a dynamic system consisting of two subsystems with a delayed-sharing information structure.
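To make the central object concrete, the following is a minimal LaTeX sketch of the information state and the separation it enables; the symbols $X_t$, $\hat{X}_t$, $U_t$, $\Pi_t$, $g_t$, and $\mathcal{D}_t$ are illustrative notation assumed here, not fixed by the abstract.

% Illustrative notation (assumed, not from the source): X_t is the state
% of the actual CPS, \hat{X}_t the state of the model, U_t the control
% input, and \mathcal{D}_t the data available up to time t.
\begin{equation*}
  \Pi_t(x,\hat{x}) \;:=\;
  \mathbb{P}\bigl(X_t = x,\ \hat{X}_t = \hat{x} \,\big|\, \mathcal{D}_t\bigr).
\end{equation*}
% Under one hedged reading of "separated," the control law depends on the
% data only through this information state,
\begin{equation*}
  U_t \;=\; g_t(\Pi_t),
\end{equation*}
% so the map g_t can be derived offline while \Pi_t is learned online.

Under this reading, separation means that the estimator producing $\Pi_t$ and the control law $g_t$ can be designed independently, which is what allows the offline-derived strategy to remain optimal once $\Pi_t$ becomes known through online learning.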