“…Triplet loss has been used successfully in various approaches for emotion detection in images (Georgescu et al, 2022; Haider et al, 2023), audio data (Ren et al, 2019; Kumar et al, 2021), and multi-modal data. Chudasama et al (2022), for example, propose M2FNet: a multi-modal fusion network for emotion detection in conversations.…”