Dialog act identification plays an important role in understanding conversations. It has been widely applied in many fields such as dialogue systems, automatic machine translation, automatic speech recognition, and especially useful in systems with human-computer natural language dialogue interfaces such as virtual assistants and chatbots. The first step of identifying dialog act is identifying the boundary of the dialog act in utterances. In this paper, we focus on segmenting the utterance according to the dialog act boundaries, i.e. functional segments identification, for Vietnamese utterances. We investigate carefully functional segment identification in two approaches: (1) machine learning approach using maximum entropy (ME) and conditional random fields (CRFs); (2) deep learning approach using bidirectional Long Short-Term Memory (LSTM) with a CRF layer (Bi-LSTM-CRF) on two different conversational datasets: (1) Facebook messages (Message data); (2) transcription from phone conversations (Phone data). To the best of our knowledge, this is the first work that applies deep learning based approach to dialog act segmentation. As the results show, deep learning approach performs appreciably better as to compare with traditional machine learning approaches. Moreover, it is also the first study that tackles dialog act and functional segment identification for Vietnamese.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.