Augmented Reality (AR) research has proposed several types of interaction techniques, including 3D interaction, natural interaction, tangible interaction, spatial-awareness interaction and multimodal interaction. Interaction in AR is usually unimodal, allowing the user to interact with AR content through only one modality, such as gesture, speech or click. Combining more than one modality is called multimodal interaction. Multimodal interaction can make human-computer interaction more efficient and improve the user experience, because unimodal interaction techniques in AR environments suffer from a number of known issues, such as the fat-finger problem. Recent research has explored multimodal interfaces (MMI) in AR environments and applied them in various domains. This paper presents an empirical study of some of the key aspects and issues in multimodal interaction for augmented reality, touching on interaction techniques and system frameworks. We review two questions: which interaction techniques have been used to perform multimodal interaction in AR environments, and which integrated components are applied in multimodal interaction AR frameworks. Analysing these two questions to identify trends in the multimodal field is the main contribution of this paper. We found that gesture, speech and touch are frequently used to manipulate virtual objects. Most discussions of the integrated components in MMI AR frameworks cover only the concept of the framework components or the information-centred design between the components. Finally, we conclude by providing ideas for future work in this field.