Coherent optical orthogonal frequency division multiplexing (CO-OFDM) has attracted considerable interest in optical fiber communications due to its simplified digital signal processing (DSP) units, high spectral efficiency, flexibility, and tolerance to linear impairments. However, CO-OFDM's high peak-to-average power ratio makes it highly vulnerable to fiber-induced nonlinearities. DSP-based machine learning has been considered a promising approach for fiber nonlinearity compensation without incurring excessive computational complexity. In this paper, we present the existing machine learning approaches for CO-OFDM in a common framework and review the progress in this area, with a focus on practical aspects and comparison with benchmark DSP solutions.

Future Internet 2019, 11, 2

between the spatial modes, compared to the SMFs. On the other hand, the drive towards higher-order modulation formats, such as 16-QAM, and spectrally efficient techniques, such as orthogonal frequency division multiplexing (OFDM), leads to greater transmission impairments, reducing the maximum distance over which increased capacity can be provided. More specifically, denser constellation diagrams render higher-order modulation formats more susceptible to circularly symmetric Gaussian noise, as generated by erbium-doped fiber amplifiers (EDFAs) along the transmission link [6]. Even though the launch power per wavelength channel can be increased to improve the signal-to-noise ratio (SNR) at the receiver, transmission is limited by nonlinear distortions due to the Kerr effect, which have a more severe impact on higher-order modulation formats and spectrally efficient modulation schemes [5,7].

Moreover, the transmission of more than two signal wavelengths (wavelength-division multiplexing, WDM) through an optical fiber generates four-wave mixing (FWM), a process caused by the power dependence of the refractive index of the optical fiber [3].
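
As a rough illustration of how FWM produces new spectral components, the sketch below enumerates mixing products using the standard textbook relation f_ijk = f_i + f_j - f_k (with k different from both i and j). This relation is an assumption brought in here and is not stated in the paper's text. The channel offsets (0, 50, and 100 GHz relative to an arbitrary reference) and the function name `fwm_products` are hypothetical and chosen only for illustration.

```python
from itertools import product


def fwm_products(channels_ghz):
    """List FWM product frequencies f_i + f_j - f_k for k != i and k != j.

    Pairs (i, j) and (j, i) give the same product, so only i <= j is kept.
    Frequencies are offsets in GHz relative to an arbitrary reference.
    """
    n = len(channels_ghz)
    products = []
    for i, j, k in product(range(n), repeat=3):
        if i <= j and k != i and k != j:
            products.append(channels_ghz[i] + channels_ghz[j] - channels_ghz[k])
    return products


if __name__ == "__main__":
    # Hypothetical equally spaced three-channel grid (integer GHz offsets).
    channels = [0, 50, 100]
    mixing = fwm_products(channels)
    in_band = [f for f in mixing if f in set(channels)]
    print("number of FWM products:", len(mixing))
    print("products landing on existing channels:", sorted(in_band))
```

Under these assumptions, the number of products grows as N^2 (N - 1) / 2 for N channels, and on an equally spaced grid some of them coincide exactly with existing channels, where they cannot be removed by optical filtering.
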
FWM is related to fiber nonlinearity and gives rise to new wavelengths, which significantly degrade the signal quality, especially at high optical powers and when signals are spectrally close to each other. FWM is one of the most dominant nonlinear effects in optical networks and a primary cause of the capacity crunch [3]. Since nonlinear noise such as FWM is highly correlated with the signals themselves, nonlinearity can be mitigated by performing special treatment of the signals or by conducting post-transmission digital signal processing (DSP) on the received signals [5,7].

On the other hand, coherent optical OFDM (CO-OFDM) [8] has attracted considerable interest in optical fiber communications due to its simplified DSP units, high spectral efficiency, flexibility, and tolerance to linear impairments. However, CO-OFDM's high peak-to-average power ratio (PAPR) makes it highly vulnerable to fiber-induced nonlinearities [8]. Attempts to combat nonlinearities in CO-OFDM have relied on deterministic nonlinearity compensators, which take advantage of the fact that light scattering within a fiber is a deterministic process. Key te...