“…Additionally, some audio datasets were collected by their authors themselves in real environments or situations. Examples include CHiME-Home [7], a dataset of gunshot audio [8], one focused on motor sounds [9], and resources for specific spoken-language tasks, such as AudioMNIST [10] and STOP [11]. There are also audio datasets created by cutting, modifying, and transforming existing datasets, such as SARdB [12] for audio scenes and Shrutilipi [13] for automatic speech recognition.…”