© The Author(s) 2018. Published by Oxford University Press. Remotely activated cameras are increasingly used worldwide to investigate the distribution, abundance and behaviour of animals. However, fewer studies have used remote cameras in urban ecosystems than in other ecosystems. Currently, the time and effort required to classify images is the main constraint of this monitoring technique. To determine whether citizen science might help overcome this constraint, we investigated the engagement, accuracy and efficiency of citizen scientists providing crowd-sourced classifications of animal images recorded by remote cameras in Wellington, New Zealand. Classifications from individual citizen scientists showed 84.2% agreement with those of professional ecologists. Aggregating the classifications from three citizen scientists per image, and excluding false triggers and classifications marked as unclassifiable, increased overall accuracy to 97.6%. Classifications by citizen scientists also improved if animal movement was highlighted in the images. The likelihood of citizen scientists correctly classifying images was influenced by their previous accuracy, their self-assessed confidence, and the species reported. Weighting each citizen scientist's classifications by their ability to correctly identify animals reduced, from three to two, the number of classifications required per sequence to classify >95% of the photographs containing cats. Citizen science is an accurate and efficient approach for classifying remote camera data from urban areas, where most of the animals are familiar to the participants. We demonstrated how appropriate tools, together with accounting for the accuracy of citizen scientists, allow project managers to make the most of the effort of citizen scientists while ensuring high-quality data.
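The aggregation and weighting described above can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify how classifications were combined or weighted, so this assumes a simple weighted plurality vote in which each volunteer's vote counts in proportion to a hypothetical accuracy score. The function name `aggregate`, the label strings and the weight values are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical labels to drop before voting (assumed wording, not from the study).
EXCLUDED = ("false trigger", "unclassifiable")


def aggregate(classifications, weights=None, excluded=EXCLUDED):
    """Combine (volunteer, label) pairs for one image into a single label.

    weights maps a volunteer id to a score such as past accuracy; volunteers
    without a score, or all volunteers when weights is None, count as 1.0.
    Returns None when every classification was excluded.
    """
    totals = defaultdict(float)
    for volunteer, label in classifications:
        if label in excluded:
            continue
        weight = 1.0 if weights is None else weights.get(volunteer, 1.0)
        totals[label] += weight
    if not totals:
        return None
    # Ties go to the label that was seen first (dict insertion order).
    return max(totals, key=totals.get)


# Unweighted vote across three volunteers.
print(aggregate([("u1", "cat"), ("u2", "hedgehog"), ("u3", "cat")]))

# Two volunteers disagree; the one with the higher assumed accuracy prevails.
print(aggregate([("u1", "rat"), ("u2", "cat")], weights={"u1": 0.6, "u2": 0.95}))
```

With equal weights this reduces to a plain majority vote. With accuracy-based weights, two classifications can settle an image that would otherwise need a third to break the tie, which is the kind of saving the abstract reports for photographs containing cats.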