Dissolved oxygen (DO) concentration is a crucial factor in maintaining aquatic ecosystem health. In this research, two data-driven modelling (DDM) techniques, multiple linear regression (MLR) and artificial neural networks (ANN), were developed, implemented and compared to predict DO in the hypolimnetic layer of Seymareh Reservoir in Iran. Low DO in this reservoir led to a fish kill event, and the reservoir is therefore of interest to water managers in the region. Water quality monitoring data from the reservoir and an upstream river were used to train the models. In addition, two input variable selection methods, linear correlation analysis and combined neural pathway strength analysis (CNPSA, a nonlinear variable selection method), were developed and compared to determine the most significant inputs for predicting hypolimnetic DO. A systematic method for selecting the optimum network architecture is proposed and tested. Although input variable selection and network architecture selection have each been investigated previously, this research focuses on a systematic approach to combining these two sources of uncertainty in data-driven models. Additionally, the performance of CNPSA has not previously been compared to that of linear variable selection techniques. This research demonstrates the importance of systematic input selection and network design for improved DO prediction in a large reservoir. Model performance was quantified using the Nash-Sutcliffe efficiency and the root mean squared error, both of which showed that the ANN approach outperformed the MLR model. The results show that combining systematic input variable selection with an optimised network architecture achieves high-performance DO prediction from easily measured upstream input data.
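For readers unfamiliar with the two performance metrics named above, the following is a minimal sketch of their standard definitions. It is not the authors' code; the function names and the sample DO values are hypothetical and serve only as illustration.

```python
import math


def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of squared model error
    to the variance of the observations about their mean.
    1.0 is a perfect fit; 0.0 is no better than predicting the observed mean."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / spread


def rmse(observed, predicted):
    """Root mean squared error, in the same units as the observations."""
    squared_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared_error / len(observed))


# Hypothetical DO values in mg/L, for illustration only (not data from the study).
observed_do = [2.0, 4.0, 6.0]
predicted_do = [2.5, 3.5, 6.0]

nse_value = nash_sutcliffe(observed_do, predicted_do)
rmse_value = rmse(observed_do, predicted_do)
```

A higher Nash-Sutcliffe efficiency and a lower root mean squared error both indicate better agreement between predicted and observed DO, which is the sense in which the two model types are compared.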