Seasonal hypoxia is a recurring threat to ecosystems and fisheries in the Chesapeake Bay. Hypoxia forecasts based on coupled hydrodynamic and biogeochemical models have proven useful to many stakeholders. These models excel at accounting for the effects of physical forcing on oxygen supply, but may fall short in replicating the more complex biogeochemical processes that govern oxygen consumption. Satellite-derived reflectances could be used to indicate the presence of surface organic matter over the Bay. However, separating the contributions of atmospheric and aquatic constituents to the signal received by the satellite is not straightforward. As a result, it is difficult to derive surface concentrations of organic matter from satellite data robustly. A potential solution is to use deep learning to build end-to-end applications that do not require precisely attributing the satellite signal to the atmosphere, the water, phytoplankton blooms, or sediment plumes. Training a deep neural network on a broad suite of variables that could affect oxygen in the water column may improve short-term (daily) hypoxia forecasts. Here we predict oxygen concentrations using inputs that account for both physical and biogeochemical factors. The physical inputs include reanalysis wind velocity, together with 3D outputs from an estuarine hydrodynamic model, including current velocity, water temperature, and salinity. Satellite-derived spectral reflectance data serve as a surrogate for the biogeochemical factors. Each input field is a time series of weekly statistics calculated from daily data, starting 8 weeks before each oxygen observation was collected. To accommodate this input structure, we adopted a long short-term memory (LSTM) network architecture with 8 time steps. At each time step, a set of convolutional neural networks is used to extract information from the inputs. 
Ablation and cross-validation tests suggest that, among all input features, the strongest predictor is the 3D temperature field, with which the new model can outperform the state of the art by ∼20% in terms of median absolute error. Our approach represents a novel application of deep learning to address a complex water management challenge.
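
The input structure described above (weekly statistics computed from daily data over the 8 weeks preceding an oxygen observation, giving 8 time steps) can be illustrated with a minimal sketch. It is not the authors' preprocessing code. The function name `weekly_stats`, the choice of mean and standard deviation as the weekly statistics, the use of a single scalar series in place of a full 2D or 3D field, and the assumption that the daily series ends at the observation date are all hypothetical choices made for illustration.

```python
from statistics import fmean, pstdev

DAYS_PER_WEEK = 7


def weekly_stats(daily, n_weeks=8):
    """Collapse the last n_weeks * 7 daily values into per-week (mean, std) pairs.

    `daily` is ordered oldest to newest and is assumed (hypothetically) to end
    at the date of the oxygen observation. Mean and population standard
    deviation are illustrative choices; the source does not name the statistics.
    """
    n_days = n_weeks * DAYS_PER_WEEK
    if len(daily) < n_days:
        raise ValueError(f"need at least {n_days} daily values, got {len(daily)}")

    # Keep only the 8-week window that precedes the observation.
    window = daily[-n_days:]

    steps = []
    for w in range(n_weeks):
        week = window[w * DAYS_PER_WEEK:(w + 1) * DAYS_PER_WEEK]
        steps.append((fmean(week), pstdev(week)))
    return steps  # one entry per time step, oldest week first


# Synthetic daily series for one grid cell (made-up values, not real data).
daily_temperature = [10.0 + 0.1 * d for d in range(60)]
steps = weekly_stats(daily_temperature)
print(len(steps))  # number of time steps fed to the recurrent model
```

In a full pipeline, each of the 8 entries would be a stack of gridded fields rather than a pair of scalars, so that a convolutional feature extractor can process each time step before the recurrent layer.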