Purpose: To describe an unsupervised three-dimensional cardiac motion estimation network (CarMEN) for deformable motion estimation from two-dimensional cine MR images.
Materials and Methods: A function was implemented using CarMEN, a convolutional neural network that takes two three-dimensional input volumes and outputs a motion field. A smoothness constraint was imposed on the field by regularizing the Frobenius norm of its Jacobian matrix. CarMEN was trained and tested with data from 150 cardiac patients who underwent MRI examinations and was validated on synthetic (n = 100) and pediatric (n = 33) datasets. CarMEN was compared with five state-of-the-art nonrigid body registration methods by using several performance metrics, including Dice similarity coefficient (DSC) and end-point error.
Results: On the synthetic dataset, CarMEN achieved a median DSC of 0.85, which was higher than that of all five other methods (minimum-maximum median [MMM], 0.67-0.84; P < .001), and a median end-point error of 1.7, which was lower than (MMM, 2.1-2.7; P < .001) or similar to (MMM, 1.6-1.7; P > .05) that of all other techniques. On the real datasets, CarMEN achieved a median DSC of 0.73 for Automated Cardiac Diagnosis Challenge data, which was higher than (MMM, 0.33; P < .0001) or similar to (MMM, 0.72-0.75; P > .05) that of all other methods, and a median DSC of 0.77 for pediatric data, which was higher than (MMM, 0.71-0.76; P < .0001) or similar to (MMM, 0.77-0.78; P > .05) that of all other methods. All P values were derived from pairwise testing. For all other metrics, CarMEN achieved better accuracy on all datasets than all other techniques except for one, which had the worst motion estimation accuracy.
Conclusion: The proposed deep learning-based approach for three-dimensional cardiac motion estimation allowed the derivation of a motion model that balances motion characterization and image registration accuracy and achieved motion estimation accuracy comparable to or better than that of several state-of-the-art image registration algorithms.
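The smoothness constraint and the two named metrics can be illustrated with a minimal NumPy sketch. This is not the CarMEN implementation: the function names, the (D, H, W, 3) field layout, the finite-difference Jacobian, and the averaging over voxels are all assumptions made for illustration, and the abstract does not specify how the regularizer is weighted or how end-point error is aggregated per subject.

```python
import numpy as np


def jacobian_frobenius_penalty(field, spacing=(1.0, 1.0, 1.0)):
    """Mean squared Frobenius norm of the Jacobian of a 3D motion field.

    field: array of shape (D, H, W, 3), one displacement vector per voxel
    (assumed layout). The Jacobian is approximated with finite differences.
    """
    penalty = 0.0
    for comp in range(3):  # each displacement component in turn
        # np.gradient returns one partial-derivative array per spatial axis
        for partial in np.gradient(field[..., comp], *spacing):
            penalty += np.mean(partial ** 2)
    return float(penalty)


def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return float(2.0 * np.logical_and(a, b).sum() / denom)


def end_point_error(estimated, reference):
    """Mean Euclidean distance between estimated and reference motion vectors."""
    diff = np.asarray(estimated, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.mean(np.linalg.norm(diff, axis=-1)))
```

Under these assumptions, a spatially constant motion field incurs zero penalty, while a field whose first component grows linearly along one axis is penalized by the square of its slope. In training, such a penalty would typically be added to an image-similarity term with a weighting factor, which is not given in the abstract.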