“…The prediction model was based on conditional random field [16], support vector machines [24], and neural networks [25]. A recent approach is to use recurrent neural networks such as long shortterm memory (LSTM), which can handle long-range context of the input sequence, and it achieved higher accuracy than conventional methods [15,26,19,27,28,29,30]. However, the performance is still low in natural conversations.…”