Conversational assistants (CAs), and task-oriented ones in particular, are designed to interact with users in natural language, assisting them in completing specific tasks or providing relevant information. These systems employ advanced natural language understanding (NLU) and dialogue management techniques to comprehend user inputs, infer their intentions, and generate appropriate responses or actions. Over time, CAs have gradually diversified and today reach fields such as e-commerce, healthcare, tourism, fashion, travel, and many other sectors. NLU is fundamental to the natural language processing (NLP) field. Identifying user intents from natural language utterances is a sub-task of NLU that is crucial for conversational systems, and the diversity of user utterances makes intent detection (ID) a particularly challenging problem. Recently, with the emergence of deep neural networks, new state-of-the-art (SOA) results have been achieved for various NLP tasks. Recurrent neural networks (RNNs) and Transformer architectures are two major players in those improvements. RNNs have contributed significantly to sequence modelling across various application areas. In contrast, Transformer models represent a newer architecture that leverages attention mechanisms, extensive training data sets, and computational power. This review paper begins with a detailed exploration of RNN and Transformer models. Subsequently, it conducts a comparative analysis of their performance in intent recognition for task-oriented CAs. Finally, it concludes by addressing the main challenges and outlining future research directions.
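
To make the ID task concrete, the sketch below shows how a Transformer-based classifier might map a single utterance to an intent label. It is a minimal illustration only: the checkpoint name `intent-model`, the `detect_intent` helper, and the label set are hypothetical placeholders, not a model or API from the works reviewed here.

```python
# Minimal sketch of Transformer-based intent detection (illustrative only).
# Assumes a hypothetical fine-tuned checkpoint "intent-model" and a toy label set.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

INTENTS = ["book_flight", "check_weather", "play_music"]  # hypothetical intents

tokenizer = AutoTokenizer.from_pretrained("intent-model")  # placeholder checkpoint
model = AutoModelForSequenceClassification.from_pretrained(
    "intent-model", num_labels=len(INTENTS)
)

def detect_intent(utterance: str) -> str:
    """Map a user utterance to the most probable intent label."""
    inputs = tokenizer(utterance, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, num_labels)
    return INTENTS[logits.argmax(dim=-1).item()]

print(detect_intent("I need a ticket to Paris next Monday"))  # e.g. "book_flight"
```

The challenge noted above is visible even in this toy setting: paraphrases such as "get me on a plane to Paris" or "any flights Monday?" must all resolve to the same intent despite sharing little surface form.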