A growing amount of research use pre‐trained language models to address few/zero‐shot text classification problems. Most of these studies neglect the semantic information hidden implicitly beneath the natural language names of class labels and develop a meta learner from the input texts solely. In this work, we demonstrate how label information can be utilized to extract enhanced feature representation of the input text from a Transformer‐based pre‐trained language model such as AraBERT. In addition, how this approach can improve performance when the data resources are scarce like in the Arabic language and the input text is short with little semantic information as is the case using tweets. The work also applies zero‐shot text classification to predict new classes with no training examples across different domains including sarcasm detection and sentiment analysis using the information in the last layer of a trained classifier in a transfer learning setting. Experiments show that our approach has a better performance for the few‐shot sentiment classification compared to baseline models and models trained without augmenting label information. Moreover, the zero‐shot implementation achieved an accuracy up to 0.874 in Arabic sarcasm detection from a model trained on a sentiment analysis task.