Cone-beam CT (CBCT) for musculoskeletal imaging would benefit from a method to reduce the effects of involuntary patient motion. In particular, the continuing improvement in spatial resolution of CBCT may enable tasks such as quantitative assessment of bone microarchitecture (0.1 to 0.2 mm detail size), where even subtle, sub-mm motion blur might be detrimental. We propose a purely image-based motion compensation method that requires no fiducials, tracking hardware, or prior images. A stochastic optimization algorithm, the covariance matrix adaptation evolution strategy (CMA-ES), is used to estimate a motion trajectory that optimizes an objective function consisting of an image sharpness criterion augmented by a regularization term that encourages smooth motion trajectories. The objective function is evaluated within a volume of interest (VOI, e.g. a single bone and the surrounding area) where the motion can be assumed to be rigid. More complex motions can be addressed by using multiple VOIs. Gradient variance was found to be a suitable sharpness metric for this application. The performance of the compensation algorithm was evaluated on simulated and experimental CBCT data, and on a clinical dataset. Motion-induced artifacts and blurring were significantly reduced across a broad range of motion amplitudes, from 0.5 mm to 10 mm. In the simulation studies, the Structural Similarity Index (SSIM) against a static volume was used to quantify the performance of the motion compensation. In studies with translational motion, the SSIM improved from 0.86 before compensation to 0.97 after compensation for 0.5 mm motion, from 0.80 to 0.94 for 2 mm motion, and from 0.52 to 0.87 for 10 mm motion (~70% increase). A similar reduction of artifacts was observed in a benchtop experiment with controlled translational motion of an anthropomorphic hand phantom, where SSIM (against a reconstruction of the static phantom) improved from 0.3 to 0.8 for 10 mm motion.
Application to a clinical dataset of a lower extremity showed a marked reduction of streaks and improved delineation of tissue boundaries and trabecular structures throughout the volume. The proposed method will support new applications of extremity CBCT in areas where patient motion may not be sufficiently managed by immobilization, such as imaging under load and quantitative assessment of subchondral bone architecture.
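
To make the structure of the objective function concrete, the following is a minimal sketch, not the implementation used in this work. It assumes that gradient variance is the variance of the finite-difference gradient magnitude, that the regularizer is a sum of squared second differences of a single motion parameter over projections, and that the two are combined with a hypothetical weight `beta`. For brevity it operates on a 2D slice stored as nested lists and uses only the Python standard library. The function names `gradient_variance`, `roughness`, and `objective` are illustrative.

```python
from statistics import pvariance


def gradient_variance(image):
    """Variance of the forward-difference gradient magnitude of a 2D image.

    Assumed definition for illustration; sharper edges give a more
    heterogeneous gradient magnitude and therefore a larger value.
    """
    rows, cols = len(image), len(image[0])
    magnitudes = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            gx = image[r][c + 1] - image[r][c]  # horizontal difference
            gy = image[r + 1][c] - image[r][c]  # vertical difference
            magnitudes.append((gx * gx + gy * gy) ** 0.5)
    return pvariance(magnitudes)


def roughness(trajectory):
    """Sum of squared second differences of one motion parameter.

    Assumed form of the smoothness regularizer; it is zero for a
    trajectory that changes linearly from projection to projection.
    """
    return sum(
        (trajectory[i - 1] - 2 * trajectory[i] + trajectory[i + 1]) ** 2
        for i in range(1, len(trajectory) - 1)
    )


def objective(image, trajectory, beta=0.1):
    """Sharpness minus weighted roughness (to be maximized).

    `beta` is a hypothetical regularization weight.
    """
    return gradient_variance(image) - beta * roughness(trajectory)


# Toy data: a sharp step edge and a blurred (ramp) version of it.
sharp = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
blurred = [[0.0, 1 / 3, 2 / 3, 1.0] for _ in range(4)]

smooth_motion = [0.0, 1.0, 2.0, 3.0]  # linear drift
jittery_motion = [0.0, 1.0, 0.0, 1.0]  # abrupt oscillation
```

In a full pipeline, a derivative-free optimizer such as CMA-ES would propose candidate trajectories, each candidate would be used to reconstruct the VOI with motion-corrected projection geometry, and the reconstructed VOI would be scored with an objective of this kind. The reconstruction step is omitted here.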