Auscultation is an important diagnostic method for lung diseases. However, it is a subjective modality and requires a high degree of expertise. To overcome this constraint, artificial intelligence models are being developed. However, these models require performance improvements and do not reflect the actual clinical situation. We aimed to develop an improved deep-learning model learning to detect wheezing in children, based on data from real clinical practice. In this prospective study, pediatric pulmonologists recorded and verified respiratory sounds in 76 pediatric patients who visited a university hospital in South Korea. In addition, structured data, such as sex, age, and auscultation location, were collected. Using our dataset, we implemented an optimal model by transforming it based on the convolutional neural network model. Finally, we proposed a model using a 34-layer residual network with the convolutional block attention module for audio data and multilayer perceptron layers for tabular data. The proposed model had an accuracy of 91.2%, area under the curve of 89.1%, precision of 94.4%, recall of 81%, and F1-score of 87.2%. The deep-learning model proposed had a high accuracy for detecting wheeze sounds. This high-performance model will be helpful for the accurate diagnosis of respiratory diseases in actual clinical practice.