We present a new algorithm to track the amplitude and phase of rotating MHD modes in tokamak plasmas using high-speed imaging cameras and deep learning. The algorithm uses a convolutional neural network (CNN) to predict the amplitudes of the n=1 sine and cosine mode components from optical measurements alone, taken with one or more cameras. The model was trained and tested on an experimental dataset of camera frames paired with magnetics-based mode measurements from the High Beta Tokamak – Extended Pulse (HBT-EP) device, and it outperformed other, more conventional algorithms given identical image inputs. We also explore how different input data streams affect the accuracy of the model's predictions, including a temporal stack of frames and images from two cameras viewing different toroidal regions.
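To make the regression targets concrete: the magnetics-based labels correspond to decomposing a toroidal array of magnetic signals into n=1 sine and cosine components, from which mode amplitude and phase follow directly. The sketch below is an illustrative least-squares fit under assumed probe angles and a synthetic signal; it is not HBT-EP's actual diagnostic layout or the authors' processing pipeline.

```python
import numpy as np

def n1_components(phi, b_theta):
    """Least-squares fit of b(phi) = A_c cos(phi) + A_s sin(phi).

    phi     : toroidal angles of the (hypothetical) magnetic probes
    b_theta : signal measured at each probe
    Returns the n=1 cosine and sine amplitudes (A_c, A_s).
    """
    basis = np.column_stack([np.cos(phi), np.sin(phi)])
    (a_c, a_s), *_ = np.linalg.lstsq(basis, b_theta, rcond=None)
    return a_c, a_s

# Synthetic example: 16 evenly spaced probes, known mode content.
phi = np.linspace(0.0, 2.0 * np.pi, 16, endpoint=False)
b = 2.0 * np.cos(phi) - 0.5 * np.sin(phi)

a_c, a_s = n1_components(phi, b)
amplitude = np.hypot(a_c, a_s)   # mode amplitude
phase = np.arctan2(a_s, a_c)     # mode phase
```

A CNN trained on camera frames would regress (A_c, A_s) directly, and the amplitude/phase conversion above applies unchanged to its outputs.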