“…Besides existing models in ConvLab-2 (Zhu et al., 2020b), we integrate new transformer-based models supporting the unified data format, including SetSUMBT (van Niekerk et al., 2021) and TripPy (Heck et al., 2020) for dialogue state tracking (DST), DDPT (Geishauser et al., 2022) and LAVA (Lubis et al., 2020) for policy learning, SC-GPT for natural language generation (NLG), and SOLOIST with T5 as backbone model (Peng et al., 2022) for end-to-end modeling (End2End). We also integrate multiple powerful data-driven user simulators (US): TUS (Lin et al., 2021a) that outputs user dialogue acts, GenTUS (Lin et al., 2022) that outputs both user dialogue acts and response, and EmoUS (Lin et al., 2023) that additionally outputs emotions.…”