Purpose
Data on police officer misconduct in larger policing agencies are scarce, so research has typically relied on case studies and offered limited insight into the factors associated with serious misconduct. This paper seeks to contribute to the emerging knowledge base on police misconduct through analysis of 28,429 complaints among 3,830 officers in the New York Police Department between 2000 and 2019.

Design/methodology/approach
This study utilized a data set consisting of officer and complainant demographics and officer complaint records. Machine learning analytics, specifically random forest, were employed to identify which variables were most associated with serious misconduct among officers who committed misconduct. Partial dependence plots were then produced for the variables identified as important, to examine the points at which misconduct was most and least likely to occur.

Findings
Prior instances of serious misconduct were particularly associated with further instances of serious misconduct, while remedial action did not appear to prevent further misconduct. Inexperience, both in rank and age, was associated with misconduct. Specific prior complaints, such as minor use of force, did not appear to be particularly associated with instances of serious misconduct. The characteristics of the complainant held more importance than the characteristics of the officer.

Originality/value
The ability to analyze a data set of this size is unusual and important to progressing knowledge of police misconduct. This study contributes to the growing use of machine learning in understanding the police misconduct environment and in more accurately tailoring misconduct prevention policy and practice.
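
As an illustration of the partial dependence step described under Design/methodology/approach, the following is a minimal sketch of how a partial dependence curve is computed for one variable. It is not the study's code. The function `toy_predict` is a hypothetical stand-in for a fitted model's predicted probability (in the study this role would be played by the random forest), and the variable names `prior_serious` and `years_service`, the four rows, and every number are invented for illustration only; none of them are results from the NYPD data.

```python
from statistics import mean


def partial_dependence(predict, rows, feature, grid):
    """Return (value, average prediction) pairs for one feature.

    For each grid value, the feature is overwritten in every row while
    all other columns keep their observed values; the predictions are
    then averaged across rows.
    """
    curve = []
    for value in grid:
        modified = [{**row, feature: value} for row in rows]
        curve.append((value, mean(predict(r) for r in modified)))
    return curve


def toy_predict(row):
    # Hypothetical scoring rule standing in for a fitted classifier.
    # The thresholds and weights are made up, not estimated from data.
    score = 0.1
    if row["prior_serious"] >= 1:
        score += 0.4
    if row["years_service"] < 5:
        score += 0.2
    return score


# Invented example records (assumption: one dict per officer-complaint row).
rows = [
    {"prior_serious": 0, "years_service": 2},
    {"prior_serious": 1, "years_service": 10},
    {"prior_serious": 0, "years_service": 7},
    {"prior_serious": 2, "years_service": 3},
]

pd_curve = partial_dependence(toy_predict, rows, "years_service", [1, 5, 10])
```

Plotting the pairs in `pd_curve` (grid value on the horizontal axis, average prediction on the vertical axis) gives a partial dependence plot. Because only the chosen variable is varied while the others stay as observed, the curve shows where along that variable the predicted likelihood rises or falls, which is how such plots indicate the points at which an outcome is most and least likely.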