The emergence of social media has greatly eased the way people receive information and participate in public discussions. However, in countries with strict regulations on public discussion, social media is no exception. To limit dissent or inhibit the spread of "harmful" information, a common approach is to impose censorship on social media. In this paper, we study censorship on Weibo, the counterpart of Twitter in China. Specifically, we 1) create a web-scraping pipeline and collect a large dataset focused solely on reposts from Weibo; 2) characterize users whose reposts contain censored information in terms of gender, location, device, and account type; and 3) conduct a thematic analysis by extracting and analyzing topic information. Although the original posts are no longer visible, we can use the comments users wrote when reposting them to infer the topic of the original content. We find that this approach can recover the discussions around social events that triggered massive discussions but were later muted. Further, we show how the inferred topics vary across user groups and time frames.
CCS CONCEPTS
• Social and professional topics → User characteristics; • Applied computing → Sociology.