Recommending hashtags for micro-videos is a challenging task for two reasons: 1) a micro-video is a unity of multiple modalities, including the visual, acoustic, and textual modalities, so effectively extracting features from these modalities and using them to represent the micro-video is of great significance; 2) micro-videos usually convey moods and feelings, which may provide crucial cues for recommending proper hashtags. However, most existing works have not considered the sentiment of media data for hashtag recommendation. In this paper, the senTiment enhanced multi-mOdal Attentive haShtag recommendaTion (TOAST) model is proposed for micro-video hashtag recommendation. Unlike previous hashtag recommendation models, which consider only content features, TOAST further incorporates the sentiment features of the modalities to improve the recommendation performance for sentiment hashtags (e.g., #funny, #sad). Specifically, the multi-modal content features and the multi-modal sentiment features are modeled by a content common space learning branch based on self-attention and a sentiment common space learning branch, respectively. Furthermore, the varying importance of the multi-modal sentiment and content features is dynamically captured by an attention neural network according to their consistency with the hashtag semantic embedding. Extensive experiments on a real-world dataset demonstrate the effectiveness of the proposed method compared with the baseline methods. Meanwhile, the findings from the experiments may provide new insight for future developments of micro-video hashtag recommendation.

INDEX TERMS Hashtag recommendation, micro-videos, multiple modalities, self-attention mechanism, sentiment features.

I. INTRODUCTION

Nowadays, watching micro-videos for leisure and entertainment has gained tremendous user enthusiasm.
Taking China as an example, the number of micro-video users has risen from 501 million in 2018 to 627 million in 2019, and is predicted to grow to 722 million in 2020 according to reports from iiMedia. 1 Micro-video platforms and apps, such as Vine, 2 Snapchat, 3 Kuaishou, 4 and Douyin, 5 have also experienced unprecedented growth in recent years.

The associate editor coordinating the review of this manuscript and approving it for publication was Yin Zhang.

How to facilitate users to quickly and accurately find their desired