Abstract: Fine-grained image classification distinguishes subclasses within a single base category (e.g., bird species). Because the differences between subclasses are subtle and mostly concentrated in several local regions, the task is very challenging. Moreover, some fine-grained networks tend to focus on a single region when predicting the target category and therefore miss the features of other, auxiliary regions. To address this, an enhanced and complementary fine-grained image classification network is designed with Inception V3 as the backbone. Reinforcement learning is used to extract more detailed fine-grained image features, while the complementary network uses attention erasure to locate complementary discriminative regions of the target, increasing the network's perception of the target as a whole. Finally, experiments are conducted on three public datasets: CUB-200-2011, FGVC Aircraft, and Stanford Dogs. The experimental results show that the proposed model has better performance.
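To make the attention-erasure idea concrete, the following is a minimal NumPy sketch, not the network described above. It assumes the attention map is the channel-wise mean of a feature map and that the most strongly attended positions are zeroed with a fixed threshold; the function name `attention_erase`, the `threshold` parameter, and the toy feature map are all hypothetical choices made for illustration.

```python
import numpy as np

def attention_erase(features, threshold=0.5):
    """Zero out the most attended positions of a (C, H, W) feature map.

    Assumption: attention is the channel-wise mean, min-max normalized
    to [0, 1]; positions at or above `threshold` are erased.
    Returns (erased_features, keep_mask).
    """
    attention = features.mean(axis=0)                    # (H, W)
    lo, hi = attention.min(), attention.max()
    attention = (attention - lo) / (hi - lo + 1e-8)      # normalize to [0, 1]
    keep_mask = (attention < threshold).astype(features.dtype)
    return features * keep_mask, keep_mask               # mask broadcasts over C

# Toy feature map: one dominant region and one weaker, secondary region.
features = np.zeros((2, 4, 4), dtype=np.float32)
features[:, 1, 1] = 10.0   # dominant discriminative region
features[:, 2, 3] = 3.0    # weaker complementary region

erased, keep_mask = attention_erase(features)
# After erasure, the strongest remaining response is the secondary region.
peak = np.unravel_index(erased.mean(axis=0).argmax(), erased.shape[1:])
```

In this toy case the dominant position is masked out, so a branch that sees only `erased` would have to rely on the secondary region, which is the intuition behind using erasure to find complementary discriminative areas.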