Quantitative appraisal of different operating areas and assessment of uncertainty due to reservoir heterogeneities are crucial elements in optimizing production and development strategies in oil sands operations. Although detailed compositional simulators are available for evaluating SAGD recovery performance, the simulation process is usually deterministic and computationally demanding, and it is not practical for real-time decision-making and forecasting. Data mining and machine learning algorithms provide efficient modeling alternatives, particularly when the underlying physical relationships between system variables are highly complex, non-linear, and possibly uncertain.
In this study, a comprehensive training set of SAGD field data compiled from numerous publicly available sources is studied. Exploratory data analysis is carried out to interpret and extract relevant attributes describing characteristics associated with reservoir heterogeneities and operating constraints. Because of their ease of implementation and computational efficiency, knowledge-based techniques including artificial neural networks (ANN) are employed to facilitate SAGD production performance prediction. Predictor (input) variables are formulated, including porosity, net-to-gross ratio, saturation, gross pay, normalized shale barrier thickness and distance to the well pair, and initial production rate. Measures such as cumulative production over discrete time intervals are considered as prediction (output) variables. Data records comprising both input and output variables are assembled, and the network is trained on this data set to identify significant patterns and relationships between the input and output variables. The model is subsequently validated using a cross-verification procedure, in which records excluded at the training stage are presented to the model.
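The workflow described above (assemble input/output records, train a network, then present withheld records to it) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the configuration used in the study: the records are synthetic, the response function is invented rather than physical, and the feature names, hidden-layer size, learning rate, and 150/50 split are all hypothetical choices.

```python
import math
import random

# Hypothetical feature names mirroring the input variables listed in the text.
FEATURES = ["porosity", "net_to_gross", "saturation", "gross_pay",
            "shale_thickness_norm", "shale_distance_norm", "initial_rate"]


def make_synthetic_records(n, rng):
    """Build n (inputs, output) records. Inputs are scaled to [0, 1].
    The output is an INVENTED smooth function, not a SAGD model or field data."""
    records = []
    for _ in range(n):
        x = [rng.random() for _ in FEATURES]
        y = (0.5 * x[0] * x[1] * x[2] + 0.3 * x[3]
             - 0.2 * x[4] * (1.0 - x[5]) + 0.2 * x[6])
        records.append((x, y))
    return records


class TinyANN:
    """One tanh hidden layer and a linear output, trained by plain SGD."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        return h, sum(w * hj for w, hj in zip(self.w2, h)) + self.b2

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, y, lr):
        """One gradient step on the squared error of a single record."""
        h, y_hat = self.forward(x)
        err = y_hat - y
        for j, hj in enumerate(h):
            # Gradient through tanh, using the output weight before its update.
            grad_h = err * self.w2[j] * (1.0 - hj * hj)
            self.w2[j] -= lr * err * hj
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_h * xi
            self.b1[j] -= lr * grad_h
        self.b2 -= lr * err


def mse(model, records):
    return sum((model.predict(x) - y) ** 2 for x, y in records) / len(records)


rng = random.Random(0)
records = make_synthetic_records(200, rng)
# Withhold 50 records from training, analogous to the cross-verification step.
train, holdout = records[:150], records[150:]

model = TinyANN(len(FEATURES), n_hidden=6, rng=rng)
mse_before = mse(model, holdout)
for _ in range(100):
    rng.shuffle(train)
    for x, y in train:
        model.train_step(x, y, lr=0.05)
mse_after = mse(model, holdout)

print(f"holdout MSE before training: {mse_before:.5f}")
print(f"holdout MSE after training:  {mse_after:.5f}")
```

Comparing the error on the withheld records before and after training is the simplest check that the network has learned a relationship that generalizes beyond the training set, rather than memorizing it.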
This paper demonstrates that knowledge-based techniques can be implemented in a practical manner to analyze large amounts of competitor data efficiently. The approach can be integrated directly into most existing reservoir management routines, and it can be readily updated as new information becomes available. Given that robust reservoir management and real-time decision-making are major challenges faced by the industry, the data-driven models presented in this paper have strong potential for application in other recovery projects such as solvent-aided steam injection.