Mobile accessibility is essential to the millions of smartphone users with visual impairments who rely daily on the mobile applications distributed through Google Play and the App Store. Most application icons lack natural-language labels, so visually impaired users struggle to interact with their phones through the screen readers built into mobile operating systems. COALA is a pilot work that addresses this issue by automatically generating textual labels from icon images. However, real-world icon datasets have imbalanced distributions: only a few categories have abundant labeled samples, while the remaining majority have very limited samples. To address the data imbalance problem in the icon label generation task, we propose an interconnected dual-language model with mean teacher learning, which learns a generalized feature representation from divergent data distributions. Extensive experiments demonstrate the superiority of our dual-language model over previous single-language models on different low-resource datasets. Further experiments show that our method outperforms COALA by a large margin in label generation evaluations.
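Since the abstract names mean teacher learning as the core training technique, the following is a minimal sketch of that component, assuming the standard formulation (a teacher whose weights are an exponential moving average of the student's, plus a consistency loss on unlabeled data). It is not the paper's dual-language architecture; the model, the `alpha` decay, and the consistency weight are all illustrative assumptions.

```python
# Minimal mean teacher sketch in PyTorch. IconLabeler is a hypothetical
# stand-in for the icon-to-label model; all hyperparameters are assumed.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class IconLabeler(nn.Module):
    """Toy stand-in for an icon classification/labeling model."""
    def __init__(self, num_classes=32):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, num_classes))

    def forward(self, x):
        return self.net(x)

student = IconLabeler()
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)  # the teacher is updated only via EMA, never by gradients

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
alpha = 0.99  # EMA decay (assumed value)

def ema_update(student, teacher, alpha):
    """teacher <- alpha * teacher + (1 - alpha) * student, parameter-wise."""
    with torch.no_grad():
        for ps, pt in zip(student.parameters(), teacher.parameters()):
            pt.mul_(alpha).add_(ps, alpha=1 - alpha)

# One hypothetical training step: a labeled batch (x, y) and an unlabeled
# batch seen under two different augmentations (x_u1 for the teacher,
# x_u2 for the student); random tensors stand in for real icon images.
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 32, (8,))
x_u1, x_u2 = torch.randn(8, 3, 32, 32), torch.randn(8, 3, 32, 32)

sup_loss = F.cross_entropy(student(x), y)          # supervised loss on labeled icons
with torch.no_grad():
    teacher_probs = F.softmax(teacher(x_u1), dim=-1)
cons_loss = F.mse_loss(F.softmax(student(x_u2), dim=-1), teacher_probs)

loss = sup_loss + 1.0 * cons_loss  # consistency weight is a tunable hyperparameter
opt.zero_grad()
loss.backward()
opt.step()
ema_update(student, teacher, alpha)
```

The consistency term is what lets the low-resource categories benefit from unlabeled icons: the student is pushed to match the smoother teacher predictions even where labels are scarce.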