How can big data help us understand the migration phenomenon? In this paper, we address this question by analysing the various phases of migration, comparing traditional and novel data sources and models at each one. We concentrate on three phases, describing for each the state of the art, recent developments, and emerging ideas. The first phase covers the journey: we study migration flows and stocks, providing examples where big data can have an impact. The second phase concerns the stay, i.e. migrant integration in the destination country: we explore the data sets and models that can be used to quantify and understand integration, with the final aim of laying the groundwork for a novel multi-level integration index. The last phase relates to the effects of migration on the source countries and to the return of migrants.
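Since the abstract distinguishes migration flows from stocks, a one-line accounting identity may help fix the terminology: the stock in the next period equals the current stock plus inflows minus outflows. The sketch below is purely illustrative, with invented numbers; real stock estimates would also adjust for births, deaths, and naturalisations.

```python
def update_stock(stock, inflow, outflow):
    """Demographic accounting identity linking stocks and flows:
    next-period migrant stock = current stock + inflow - outflow.
    (Toy version; ignores births, deaths, and naturalisations.)"""
    return stock + inflow - outflow

# Invented numbers: 1.20M migrants, 150k arrivals, 40k departures.
print(update_stock(1_200_000, 150_000, 40_000))  # -> 1310000
```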
While sentiment analysis has received significant attention in recent years, problems remain when tools are applied to microblogging content. This is because the text to be analysed typically consists of very short messages lacking structure and semantic context. At the same time, the amount of text produced by online platforms is enormous, so simple, fast, and effective methods are needed to study sentiment in these data efficiently. Lexicon-based methods, which use a predefined dictionary of terms tagged with sentiment valences to evaluate the sentiment of longer sentences, are a viable approach. Here we present a method based on epidemic spreading that automatically extends the dictionary used in lexicon-based sentiment analysis, starting from a reduced dictionary and large amounts of Twitter data. The resulting dictionary is shown to contain valences that correlate well with human-annotated sentiment, and to produce tweet sentiment classifications comparable to those of the original dictionary, with the advantage of tagging more tweets than the original. The method is easily extensible to other languages and applicable to large amounts of data.
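As a rough illustration of the idea, the sketch below propagates valences from a small seed dictionary across a word co-occurrence graph built from tweets. The propagation rule (a damped, co-occurrence-weighted neighbourhood mean applied in rounds) and all names and numbers are illustrative assumptions, not the paper's exact formulation.

```python
from collections import defaultdict
from itertools import combinations

def build_cooccurrence(tweets):
    """Count how often two words appear in the same tweet."""
    cooc = defaultdict(lambda: defaultdict(int))
    for tweet in tweets:
        for w1, w2 in combinations(sorted(set(tweet.lower().split())), 2):
            cooc[w1][w2] += 1
            cooc[w2][w1] += 1
    return cooc

def extend_dictionary(seed, cooc, n_rounds=5, damping=0.5):
    """Epidemic-style spreading: each round, an unlabelled word
    'catches' the damped, weighted mean valence of its already-
    labelled co-occurrence neighbours."""
    valence = dict(seed)
    for _ in range(n_rounds):
        new = {}
        for word, nbrs in cooc.items():
            if word in valence:
                continue
            weighted = sum(c * valence[n] for n, c in nbrs.items() if n in valence)
            total = sum(c for n, c in nbrs.items() if n in valence)
            if total > 0:
                new[word] = damping * weighted / total
        if not new:          # nothing left to infect
            break
        valence.update(new)
    return valence

def tweet_sentiment(tweet, valence):
    """Mean valence of the tweet's known words (None if none known)."""
    scores = [valence[w] for w in tweet.lower().split() if w in valence]
    return sum(scores) / len(scores) if scores else None

# Toy usage with an invented seed dictionary and corpus.
seed = {"love": 1.0, "great": 0.8, "hate": -1.0, "awful": -0.9}
tweets = ["love this great sunny morning",
          "awful delays again hate mondays",
          "sunny morning no delays"]
ext = extend_dictionary(seed, build_cooccurrence(tweets))
print(tweet_sentiment("sunny mondays", ext))
```

The payoff mirrors the abstract's claim: "sunny" and "mondays" are absent from the seed dictionary, so the original lexicon could not tag that tweet at all, while the extended one can.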
In a digital environment, the term echo chamber refers to an alarming phenomenon in which beliefs are amplified or reinforced by repeated communication inside a closed system insulated from rebuttal. To date, a formal definition, as well as a platform-independent approach for its detection, is still lacking. This paper proposes a general framework for identifying echo chambers on online social networks, built on top of the features they commonly share. Our approach is based on a four-step pipeline that involves (i) the identification of a controversial issue; (ii) the inference of users’ ideology on the controversy; (iii) the construction of the users’ debate network; and (iv) the detection of homogeneous meso-scale communities. We then apply the framework in a detailed case study on Reddit covering the first two and a half years of Donald Trump’s presidency. Our main purpose is to assess the existence of Pro-Trump and Anti-Trump echo chambers across three sociopolitical issues, and to analyse their stability and consistency over time. Although users appear strongly polarized with respect to their ideology, most do not insulate themselves in echo chambers; however, the polarized communities we did find proved to be highly stable over time.
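To make the pipeline concrete, here is a minimal sketch of steps (iii) and (iv), assuming the ideology labels from step (ii) are already available. It uses networkx's greedy modularity community detection and a simple majority-share homogeneity score; both the detection method and the 0.8 threshold are illustrative choices, not the paper's exact procedure.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def debate_network(interactions):
    """Step (iii): users are nodes; an edge links two users who
    replied to each other in a thread on the controversial issue."""
    G = nx.Graph()
    G.add_edges_from(interactions)
    return G

def echo_chambers(G, ideology, threshold=0.8):
    """Step (iv): detect meso-scale communities, then flag those
    whose members overwhelmingly share one ideology label."""
    flagged = []
    for community in greedy_modularity_communities(G):
        labels = [ideology[u] for u in community if u in ideology]
        if not labels:
            continue
        majority = max(set(labels), key=labels.count)
        homogeneity = labels.count(majority) / len(labels)
        if homogeneity >= threshold:
            flagged.append((majority, homogeneity, community))
    return flagged

# Toy usage: invented reply pairs and pre-inferred Pro/Anti labels.
pairs = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
labels = {"a": "Pro", "b": "Pro", "c": "Pro",
          "d": "Anti", "e": "Anti", "f": "Anti"}
for side, h, users in echo_chambers(debate_network(pairs), labels):
    print(side, round(h, 2), sorted(users))
```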