As the circuit complexity of chip designs increases, wafers become more prone to defects during production. Because different defect types have different causes, post-production defect analysis is important for wafer manufacturing. However, the defect classes are imbalanced in sample size, which limits the effectiveness of classical residual networks. This article uses a Split-Attention network (ResNeSt), in which the feature maps are divided into k cardinal groups, each consisting of r splits. Thanks to its large receptive field, the network extracts image features more effectively. We also add channel attention, which assigns higher weights to the informative feature channels, to improve network performance. Compared with the classical residual network, the new network improves by 43.06%, 32.73%, and 10.86% on Scratch, Near-full, and Local defects, respectively.
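
To make the split-attention idea concrete, the following is a minimal NumPy sketch of the channel-wise attention applied within a single group of r splits. It is an illustration under stated assumptions, not the network used in this article: the function names, the weight matrices `w1` and `w2`, the hidden size `d`, and the random inputs are all hypothetical, and the convolution and normalization layers of a real ResNeSt block are omitted.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def split_attention(splits, w1, w2):
    """Fuse the r splits of one group with channel-wise attention.

    splits: array of shape (r, C, H, W), one feature map per split.
    w1:     array of shape (C, d), hypothetical first dense layer.
    w2:     array of shape (d, r * C), hypothetical second dense layer.
    Returns the fused map of shape (C, H, W) and the weights of shape (r, C).
    """
    r, c, _, _ = splits.shape
    fused = splits.sum(axis=0)                # element-wise sum over splits: (C, H, W)
    pooled = fused.mean(axis=(1, 2))          # global average pooling: (C,)
    hidden = np.maximum(pooled @ w1, 0.0)     # dense layer + ReLU: (d,)
    logits = (hidden @ w2).reshape(r, c)      # one logit per split and channel
    attn = softmax(logits, axis=0)            # weights compete across the r splits
    out = (attn[:, :, None, None] * splits).sum(axis=0)  # weighted sum: (C, H, W)
    return out, attn


def demo():
    """Run the sketch on random data (assumed sizes: r=2, C=4, H=W=8, d=3)."""
    rng = np.random.default_rng(0)
    r, c, h, w, d = 2, 4, 8, 8, 3
    splits = rng.standard_normal((r, c, h, w))
    w1 = rng.standard_normal((c, d))
    w2 = rng.standard_normal((d, r * c))
    return split_attention(splits, w1, w2)
```

For each channel, the attention weights sum to one across the r splits, so channels that carry more useful information in a given split contribute more to the fused output. In a full network this step would be repeated for each of the k groups and the results concatenated along the channel axis.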