This work presents the results of a benchmark study on aero-servo-hydro-elastic codes for offshore wind turbine dynamic simulation. The codes verified herein account for the coupled dynamic systems, including the wind inflow, aerodynamics, elasticity and controls of the turbine, along with the incident waves, sea current, hydrodynamics and foundation dynamics of the support structure. A large set of time series simulation results, such as turbine operational characteristics, external conditions, and load and displacement outputs, was compared and interpreted. Load cases were defined and run with increasing complexity so that differences in simulation results could be traced back to the underlying error sources. This led to a deeper understanding of the underlying physical systems. Four successive phases, dealing with a 5-MW turbine on a monopile with a fixed foundation, a monopile with a flexible foundation, a tripod and a floating spar buoy, cover the latest support structure developments in the offshore wind energy industry, and an adaptation of the codes to those developments was initiated. In general, the results of the codes agreed quite well. The differences that existed among the predictions were traced back to differences in model fidelity, aerodynamic implementation and hydrodynamic load discretization, and to numerical difficulties within the codes. The comparisons resulted in a more thorough understanding of the modeling techniques and better knowledge of when various approximations are not valid. More importantly, the lessons learned from this exercise have been used to further develop and improve the participants' codes and to increase confidence in the codes' accuracy and the correctness of the results, hence improving the standard of offshore wind turbine modeling and simulation. One purpose of this paper is to summarize the lessons learned and to present results against which code developers can compare their own.
The set of benchmark load cases defined and simulated during the course of this project, which forms the raw data for this paper, is available to the offshore wind turbine simulation community and is already being used for testing newly developed software tools. Although no measurements are included, the large number of participants and the generally very good level of agreement indicate highly trustworthy results within the physical assumptions of the codes and the simulation cases chosen. Other cases, such as flexible blades with large prebend, large wind shear, large yaw error or transient maneuvers, may not show the same level of agreement. These cases were deliberately left out because the focus is on the specific offshore application. Furthermore, in contrast to several previous benchmark studies, this benchmark study identifies the participating codes and organizations by name, which gives the reader a chance to find the results from one particular code of interest.