Opto-acoustic imaging systems detect acoustic waves produced by optical absorption to visualize molecular contrast in biological tissue. This permits non-invasive vascular assessment of benign and malignant tumors. In this article, we describe a framework that iteratively estimates the motion of an opto-acoustic probe during a minimization-based image reconstruction process. The probe emits light and uses an ultrasonic transducer array to acquire data for cross-sectional slices of tissue. To improve visibility, our technique combines multiple 2D slices to perform 3D volumetric reconstruction. Our model includes wavelength-specific optical absorption, position-dependent illumination, and a realistic transducer element geometry. We investigate this technique using simulated, experimentally collected, and clinically acquired data. By performing 3D image reconstruction on a digital phantom, we demonstrate estimation of elevational probe motion without external sensors. We compare images of a benign lesion from a clinical breast imaging study and observe significant artifact reduction and an improved contrast-to-background ratio with our technique. The approach has the potential to improve opto-acoustic image visibility for assessment of breast cancer or other diseases.
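The idea of estimating probe motion inside a minimization-based reconstruction can be illustrated with a deliberately simplified toy problem. The sketch below is not the model described in this article: it has no acoustic wave propagation, no wavelength-specific absorption, and no transducer geometry. It assumes instead that each 2D frame is a Gaussian-weighted average of a volume along the elevational axis, and it alternates between a regularized least-squares update of the volume and a grid search over each frame's elevational position. All names (`beam_weights`, `reconstruct`, `simulate`) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def beam_weights(positions, grid_z, width):
    """Assumed Gaussian elevational sensitivity: one row per frame, one column per voxel."""
    positions = np.asarray(positions, dtype=float)
    w = np.exp(-0.5 * ((grid_z[None, :] - positions[:, None]) / width) ** 2)
    return w / w.sum(axis=1, keepdims=True)

def objective(volume, positions, frames, grid_z, width, lam):
    """Data misfit plus Tikhonov penalty on the volume."""
    resid = beam_weights(positions, grid_z, width) @ volume - frames
    return float(np.sum(resid ** 2) + lam * np.sum(volume ** 2))

def reconstruct(frames, init_positions, candidates, grid_z,
                width=0.6, lam=1e-3, n_iter=10):
    """Alternate between a volume update and a per-frame position update."""
    positions = np.array(init_positions, dtype=float)
    volume = np.zeros((grid_z.size, frames.shape[1]))
    history = [objective(volume, positions, frames, grid_z, width, lam)]
    cand_w = beam_weights(candidates, grid_z, width)
    for _ in range(n_iter):
        # Volume step: exact minimizer of the regularized least-squares problem.
        W = beam_weights(positions, grid_z, width)
        volume = np.linalg.solve(W.T @ W + lam * np.eye(grid_z.size), W.T @ frames)
        # Motion step: for each frame, pick the candidate position with the lowest misfit.
        pred = cand_w @ volume  # shape (n_candidates, n_lateral)
        cost = ((pred[None, :, :] - frames[:, None, :]) ** 2).sum(axis=2)
        positions = candidates[np.argmin(cost, axis=1)]
        history.append(objective(volume, positions, frames, grid_z, width, lam))
    return volume, positions, history

def simulate(seed=0):
    """Synthetic frames from a random volume with jittered elevational positions."""
    rng = np.random.default_rng(seed)
    grid_z = np.linspace(0.0, 10.0, 40)          # elevational voxel centers
    candidates = np.linspace(0.0, 10.0, 201)     # allowed probe positions
    init_positions = candidates[40:161:10]       # 13 frames, uniform-motion guess
    true_positions = init_positions + rng.uniform(-0.3, 0.3, init_positions.size)
    true_volume = rng.random((grid_z.size, 16))  # 16 lateral pixels per slice
    frames = beam_weights(true_positions, grid_z, 0.6) @ true_volume
    frames += 0.01 * rng.standard_normal(frames.shape)
    return frames, init_positions, candidates, grid_z

frames, init_positions, candidates, grid_z = simulate()
volume_est, positions_est, history = reconstruct(frames, init_positions, candidates, grid_z)
print("objective before and after:", history[0], history[-1])
```

Because each step minimizes the same objective over one block of unknowns, and the initial positions lie on the candidate grid, the objective cannot increase from one iteration to the next. The toy problem is not guaranteed to recover the true positions: a common shift of all frames is only weakly constrained, and the alternation can stall in a local minimum.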