Six recent Langmuir turbulence parameterization schemes and five traditional schemes are implemented in a common single-column modeling framework and compared consistently. The schemes are tested in scenarios compared against matched large eddy simulations, across the globe with realistic forcing (JRA55-do, WAVEWATCH-III simulated waves) and initial conditions (Argo), and under realistic conditions observed at ocean moorings. Traditional non-Langmuir schemes systematically underpredict the vertical mixing found in large eddy simulations under weak convective forcing, while Langmuir schemes vary in accuracy. Under global, realistic forcing, Langmuir schemes produce monthly mean mixed layer depths that are 6% (−1% to 14% for 90% confidence) or 5.2 m (−0.2 m to 17.4 m for 90% confidence) deeper than those of their non-Langmuir counterparts, with the greatest differences in extratropical regions, especially the Southern Ocean in austral summer. Discrepancies among Langmuir schemes are large (15% in the ratio of mixed layer depth standard deviation to the mean). They are largest under wave-driven turbulence with stabilizing buoyancy forcing and next largest under strongly wave-driven conditions with weak buoyancy forcing, whereas the schemes agree under strong convective forcing. Non-Langmuir schemes disagree with each other to a lesser extent, with a similar ordering across forcing conditions. The discrepancies among Langmuir schemes obscure a cross-scheme estimate of the magnitude of the Langmuir effect under realistic forcing, highlighting limited understanding and numerical deficiencies. Maps of the regions and seasons where the greatest discrepancies occur are provided to guide further studies and observations.
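The two summary measures quoted above, the relative deepening of the mixed layer and the standard deviation over the mean across schemes, can be illustrated with a minimal sketch. The function names, the use of the population standard deviation (`ddof=0`), and the input values are assumptions made for illustration only; the numbers are arbitrary and are not results from the study.

```python
import numpy as np


def scheme_spread(mld_by_scheme):
    """Standard deviation over the mean of mixed layer depth across schemes.

    Assumption: population standard deviation (ddof=0), taken over the
    first axis, which indexes the schemes.
    """
    mld = np.asarray(mld_by_scheme, dtype=float)
    return mld.std(axis=0) / mld.mean(axis=0)


def relative_deepening(mld_langmuir, mld_non_langmuir):
    """Fractional deepening of a Langmuir scheme relative to its counterpart."""
    return (mld_langmuir - mld_non_langmuir) / mld_non_langmuir


# Arbitrary illustrative depths in metres (hypothetical, not study data).
example_mld = np.array([40.0, 50.0, 60.0])
spread = scheme_spread(example_mld)
deepening = relative_deepening(53.0, 50.0)

print(f"spread = {spread:.3f}, deepening = {deepening:.3f}")
```

Passing a two-dimensional array with schemes along the first axis and grid points or months along the second would yield one spread value per grid point or month, which is the kind of quantity a discrepancy map would display.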