With the rapid development of artificial intelligence technology, commercial robots have gradually entered daily life. To promote products, commerce platforms now offer shopping guide robots, a new service option that uses tag recommendation systems to identify users' intentions. Many applications combine users' historical tagging information with the multi‐round dialogue ability of shopping guide robots to help users efficiently search for and retrieve products of interest. Recently, tensor decomposition methods have become a common approach for modelling entity interaction relationships in tag recommendation systems. However, owing to data sparsity, these methods consider only low‐order information about entities, which makes it difficult to capture the higher‐order collaborative signals among them. Autoencoder‐based recommendation methods can effectively extract abstract feature representations, but they focus only on the two‐dimensional relationship between users and items and ignore the interactions among users, items and tags that arise in real, complex recommendation scenarios. The authors focus on modelling the similarity relationship among entities and propose deep feature fusion tag (DFFT), a method based on the deep feature fusion of stacked denoising autoencoders. This method can extract high‐order information with different embedding dimensions and fuse it in a unified framework. To extract robust feature representations, the authors inject random noise (mask‐out/drop‐out noise) into the tag information corresponding to users and items to generate corrupted input data, and then utilise autoencoders to encode the interaction relationship among entities. To further capture interaction relationships of different dimensions, different encoding layers are stacked and combined into an expanded model in which the layers can reinforce one another. Finally, a decoding component is used to reconstruct the original input data. Experimental results on two common datasets show that the proposed DFFT method outperforms the baselines in terms of the F1@N, NDCG@N and Recall@N evaluation metrics.
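
The pipeline described above (corrupt the tag input, encode it with stacked layers of different dimensions, fuse the layer outputs, decode) can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the abstract does not specify the layer sizes, the noise rate, the fusion operator, or the loss, so the sketch assumes concatenation as the fusion step, sigmoid layers, a squared reconstruction error, and random untrained weights. All names (`mask_out`, `dense`, `init_layer`, `encode_fused`, `tag_profile`) are hypothetical.

```python
import math
import random


def mask_out(vector, rate, rng):
    """Mask-out noise: zero each entry independently with probability `rate`."""
    return [0.0 if rng.random() < rate else v for v in vector]


def dense(vector, weights, bias):
    """Sigmoid layer; `weights` holds one row of input weights per output unit."""
    return [
        1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, vector)) + b)))
        for row, b in zip(weights, bias)
    ]


def init_layer(n_in, n_out, rng):
    """Random, untrained placeholder weights (an assumption of this sketch)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def encode_fused(vector, encoders):
    """Run the stacked encoders and concatenate every layer's output.

    Concatenation is an assumed fusion operator, not taken from the paper.
    """
    fused, hidden = [], vector
    for weights, bias in encoders:
        hidden = dense(hidden, weights, bias)
        fused.extend(hidden)
    return fused


rng = random.Random(0)
n_tags, hidden_dims = 10, [8, 4]  # illustrative sizes only

# Hypothetical binary tag profile of one user (1.0 = the user applied the tag).
tag_profile = [1.0 if rng.random() < 0.3 else 0.0 for _ in range(n_tags)]
corrupted = mask_out(tag_profile, rate=0.2, rng=rng)

dims = [n_tags] + hidden_dims
encoders = [init_layer(a, b, rng) for a, b in zip(dims[:-1], dims[1:])]
decoder = init_layer(sum(hidden_dims), n_tags, rng)

fused = encode_fused(corrupted, encoders)  # embeddings of 8 and 4 dimensions, fused
scores = dense(fused, *decoder)            # reconstructed per-tag scores

# Reconstruction error against the clean profile; training would minimise this.
loss = sum((s - t) ** 2 for s, t in zip(scores, tag_profile)) / n_tags
```

In a trained model of this kind, the reconstructed scores for tags absent from the clean profile would be ranked to produce top-N recommendations. Here the weights are random, so the scores carry no meaning beyond showing the data flow.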