“…To make the VA user-adaptive, the authors of (Saha et al, 2020c,d, 2018) proposed using a sentiment-based reward function for learning a dialogue policy in a task-oriented conversation. The authors of (Saha et al, 2020a) demonstrated how reinforcement learning may be used to generate meaningful responses while training generation frameworks. In (Saha et al, 2020b, 2021a), the authors show how subtleties in human communication, such as … Apart from these, several other works (Wei et al, 2019; Ide and Kawahara, 2021; Huo et al, 2020) suggest using sentiment and/or emotion as an additional input in generation frameworks, either during decoding or as a reward, to guide the models toward generating responses aligned with the user's mood or feelings.…”