Mode-division multiplexing (MDM) can increase the capacity of direct-detection short-reach systems in proportion to the number of modes employed. MDM requires compensation of modal crosstalk at the transmitter or receiver using multiple-input multiple-output (MIMO) signal processing. We show that the channel estimation required for MIMO processing in a basis of modes can be expressed as a phase retrieval problem. We propose three techniques for this estimation: sparse training sequences, convex optimization (CO), and alternating minimization. We demonstrate that the CO technique outperforms the other two.
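
To illustrate why the estimation takes the form of phase retrieval, the following minimal sketch assumes a noise-free model in which one receiver output observes only the intensity |h^T x|^2 of a training vector x passed through an unknown complex channel row h. It then recovers h, up to a global phase, from training vectors with at most two nonzero entries. The model, the function names `detect` and `estimate_row`, and the particular probing pattern are assumptions made for illustration; they are not necessarily the sparse training sequences proposed in this work.

```python
import cmath


def detect(h, x):
    """Hypothetical direct-detection model: only the intensity |h^T x|^2 is seen."""
    return abs(sum(hj * xj for hj, xj in zip(h, x))) ** 2


def estimate_row(measure, n):
    """Recover n complex coefficients, up to a global phase, from intensities.

    Assumes noise-free measurements and a nonzero first coefficient, which
    serves as the phase reference. Uses 3n - 2 sparse training vectors.
    """
    def probe(i, value):
        # Training vector with mode 0 switched on and mode i set to `value`.
        x = [0j] * n
        x[0] = 1
        x[i] = value
        return x

    def single(i):
        # Training vector that excites mode i alone.
        x = [0j] * n
        x[i] = 1
        return x

    # One-sparse vectors give the magnitudes |h_i|^2.
    power = [measure(single(i)) for i in range(n)]
    ref = power[0] ** 0.5
    est = [complex(ref, 0.0)]

    for i in range(1, n):
        p_sum = measure(probe(i, 1))    # |h_0 + h_i|^2
        p_quad = measure(probe(i, 1j))  # |h_0 + j h_i|^2
        # Real and imaginary parts of conj(h_0) * h_i from the cross terms.
        re = (p_sum - power[0] - power[i]) / 2
        im = -(p_quad - power[0] - power[i]) / 2
        est.append(complex(re, im) / ref)
    return est


# Example with an arbitrary, made-up channel row.
h_true = [0.8 * cmath.exp(0.3j), 0.5 * cmath.exp(-1.1j), 0.2 * cmath.exp(2.0j)]
h_est = estimate_row(lambda x: detect(h_true, x), len(h_true))
```

In this sketch `h_est` equals `h_true` rotated so that its first entry is real and positive, which reflects the global phase ambiguity inherent to intensity-only measurements. With noise, or when the reference coefficient is weak, such closed-form recovery degrades, which is one motivation for optimization-based estimators.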