“…The qualitative analysis of the results points to five general topics: the type of environment (real life or online), online data contamination, the type of platform, the timeframes commonly selected, and language dependence. Regarding the type of environment, the 11 papers selected for this scoping review can be split between in-real-life (IRL) speech (Matthew Gentzkow & Shapiro, 2010; Matthew Gentzkow et al., 2019; Sloman et al., 2021) and online discourse (Cantini et al., 2020; Esteve Del Valle et al., 2021; Jiang et al., 2020; Kursuncu et al., 2019; Makrehchi, 2016; Serrano-Contreras et al., 2020). Four papers use US Congress speech, with congressional transcripts making up the majority of the IRL text data, and three works use Twitter data.…”