Existing deep learning-based facial attribute recognition (FAR) methods rely heavily on large-scale labeled training data. In many real-world applications, however, only limited labeled data are available, which degrades the performance of these methods. To address this issue, we propose a novel spatial-semantic patch learning network (SPL-Net) for attribute classification with limited labeled data. SPL-Net consists of a multi-branch shared subnetwork (MSS), three auxiliary task subnetworks (AT-S), and an FAR subnetwork. To account for the diversity of facial attributes, MSS includes a task-shared branch and four region branches, each of which contains cascaded dual cross attention modules to extract region-specific features. SPL-Net