A recurring difficulty in citizen science projects is processing and analyzing the data collected by participants in order to draw conclusions. The Sons al Balcó project started with the aim of studying the effect of the COVID-19 lockdown on noise perception in Catalonia, asking citizens to evaluate the soundscape from their homes. In one of the project activities, citizens collaborated by sending short videos recorded with a mobile phone, together with a subjective questionnaire about the soundscape heard from their home balcony or window. To this end, the samples submitted by citizens must be automatically analyzed in terms of acoustic event detection, so that the objective data in the videos can be compared with the subjective impressions collected in the questionnaires. As a first step towards automatic acoustic event classification, this paper details and compares the acoustic samples of the project's two collection campaigns: the 2020 campaign obtained 365 videos, and the 2021 campaign obtained 237. A convolutional neural network was then trained to automatically detect and classify acoustic events, even when they occur simultaneously. The findings indicate that detection rates are not uniform across categories, with the prevalence of an event in the dataset and its foreground-to-background ratio being key factors in detection performance.
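Since the network must flag several simultaneous events in one clip, the classification is multi-label rather than multi-class: each category gets an independent sigmoid score and its own threshold, instead of a single softmax choice. The sketch below illustrates this decision stage only; the class names, logit values, and threshold are illustrative assumptions, not the taxonomy or model used in the paper.

```python
import math

# Hypothetical event categories; the project's actual taxonomy may differ.
CLASSES = ["traffic", "birds", "voices", "music"]

def sigmoid(x: float) -> float:
    """Standard logistic function, mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def detect_events(logits, threshold=0.5):
    """Multi-label decision: each class is scored by an independent
    sigmoid, so several events can be declared active in the same clip
    (unlike softmax, which forces the classes to compete)."""
    probs = [sigmoid(z) for z in logits]
    return [c for c, p in zip(CLASSES, probs) if p >= threshold]

# Example logits that a CNN might output for one clip with road noise
# and birdsong present at the same time:
print(detect_events([2.0, 1.5, -1.0, -3.0]))  # → ['traffic', 'birds']
```

In practice the per-class threshold is often tuned on a validation set, which is one way the uneven prevalence of categories in the dataset can be partially compensated.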