Audio recording in classrooms is a common practice in educational research, with applications ranging from detecting classroom activities to analyzing student behavior. Previous research has employed neural networks for classroom activity detection and speaker role identification. However, these recordings are often affected by background noise that can hinder further analysis, and the literature has addressed this noise only with general-purpose filters rather than filters designed specifically for classrooms. Although high-end microphones and environmental monitoring can mitigate this problem, these solutions can be costly and potentially disruptive to the natural classroom environment. In this context, we propose a novel neural network model that detects and filters out problematic audio sections in classroom recordings. The model is particularly effective in reducing transcription errors, achieving up to a 96% success rate in filtering out segments that could lead to incorrect automated transcriptions. The novelty of our work lies in its targeted approach to low-budget, acoustically complex environments such as classrooms, where multiple speakers are present. By allowing the use of lower-quality recordings without compromising analysis capability, our model facilitates data collection in natural educational settings and reduces the dependency on expensive recording equipment. This work demonstrates the practical application of specialized neural network filters in challenging acoustic environments and opens new avenues for audio analysis in educational research and beyond.