Passive acoustic detectors are increasingly used for monitoring biodiversity, particularly for echolocating bat species (Microchiroptera). However, identification of calls collected at large scales is hindered by substantial variation within and between species, and by the considerable time investment needed to manually identify acoustic data. We use acoustic data from 14 species of echolocating bats occurring in temperate forests and woodlands of southeastern Australia to build a supervised classification model that identifies species from large acoustic datasets. Calls from hand‐released (39,567) and free‐flying (8,851) bats were used to build a predictive model, which was then validated using field‐collected calls (149,097) from the same region. We maximized the model fit per species by validating the associated confidence scores against manually identified presence and absence values. This allowed us to model the identification success of each species as a function of the confidence score. From this relationship, we set species‐specific thresholds for accepting an identification, enabling more accurate classification of calls and identification of multiple bat species within a single acoustic recording. Including calls from manually identified free‐flying bats improved overall identification accuracy, including a 60% improvement for bats that navigate in open spaces. Assigning species‐specific thresholds achieved substantial improvements in overall model confidence, with functionally meaningful changes in the identification of species exhibiting considerable acoustic overlap in time and frequency measures. Research into the ecological requirements of species is hampered by problems with identification. Our research illustrates that internal train–test validation overestimates model accuracy, particularly for species in low abundance and for uncommon species that are acoustically similar to more common ones.
Recognizing this, we set specific thresholds per species below which identifications were not accepted. Our method is particularly relevant in locations with high overlap in species' call parameters, where the common practice of assigning one species per acoustic recording can produce false negatives for species that are hard to identify, in favor of those that are easier to identify. This research proposes a cautious method to substantially reduce the burden of manual identification of large acoustic datasets.
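The idea of species‐specific acceptance thresholds can be illustrated with a minimal sketch. It is not the model described above: instead of fitting identification success as a function of the confidence score, it uses a simpler empirical stand‐in, taking for each species the lowest confidence cutoff at which the share of correct identifications among accepted calls still meets a target. The function names, the record layout `(species, confidence, is_correct)`, and the `target_precision` parameter are all assumptions made for illustration.

```python
from collections import defaultdict


def species_thresholds(records, target_precision=0.9):
    """Return a per-species confidence cutoff (hypothetical helper).

    records: iterable of (species, confidence, is_correct), where is_correct
    comes from manual validation of the classifier's identification.
    A species maps to None when no cutoff reaches the target.
    """
    by_species = defaultdict(list)
    for species, confidence, is_correct in records:
        by_species[species].append((confidence, bool(is_correct)))

    thresholds = {}
    for species, rows in by_species.items():
        rows.sort(key=lambda r: r[0], reverse=True)  # highest confidence first
        correct = 0
        best = None
        for i, (confidence, ok) in enumerate(rows, start=1):
            correct += ok
            # Evaluate a cutoff only after the last call sharing this score.
            if i < len(rows) and rows[i][0] == confidence:
                continue
            # Precision of all calls accepted at this cutoff.
            if correct / i >= target_precision:
                best = confidence
        thresholds[species] = best
    return thresholds


def accept(predictions, thresholds):
    """Keep every species in one recording whose score clears its own cutoff.

    predictions: list of (species, confidence) for a single recording, so
    more than one species can be accepted per recording.
    """
    return [
        species
        for species, confidence in predictions
        if thresholds.get(species) is not None
        and confidence >= thresholds[species]
    ]
```

Because each species is compared against its own cutoff, a recording can retain several species at once rather than only the single highest‐scoring one.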