We examine the dimensions of Internet use based on a representative sample of the population of the United Kingdom, making three important contributions. First, we clarify theoretical dimensions of Internet use that have been conflated in prior work. We argue that the property space of Internet use has three main dimensions: amount of use, variety of different uses, and types of use. Second, the Oxford Internet Survey 2011 dataset contains a comprehensive set of 48 activities ranging from email to online banking to gambling. Using principal components analysis, we identify ten distinctive types of Internet activities. This is the first typology of Internet uses to be based on such a comprehensive set of activities. We use regression analyses to validate the three dimensions and to identify the characteristics of the users of each type. Each type has a distinctive and different kind of user. The Internet is an extremely diverse medium. We cannot discuss "Internet use" as a general phenomenon; instead, researchers must specify what kind of use they examine.
The use of online surveys has grown rapidly in social science and policy research, surpassing more established methods. We argue that a better understanding is needed, especially of the strengths and weaknesses of non-probability online surveys, which can be conducted relatively quickly and cheaply. We describe two common approaches to non-probability online surveys, river and panel sampling, and theorize their inherent selection biases: namely, topical self-selection and economic self-selection. We conduct an empirical comparison of two river samples (Facebook and web-based samples) and one panel sample (from a major survey research company) with benchmark data grounded in a comprehensive population registry. The river samples diverge from the benchmark on demographic variables and yield much higher frequencies on non-demographic variables, even after demographic adjustments; we attribute this to topical self-selection. The panel sample is closer to the benchmark. When examining the characteristics of a non-demographic subpopulation, we detect no differences between the river and panel samples. We conclude that non-probability online surveys do not replace probability surveys, but augment the researcher's toolkit with new digital practices, such as exploratory studies of small and emerging non-demographic subpopulations.
Hundreds of papers have been published using Twitter data, but few previous papers report the digital divide among Twitter users. British Twitter users are younger, wealthier, and better educated than other Internet users, who in turn are younger, wealthier, and better educated than the off-line British population. American Twitter users are also younger and wealthier than the rest of the population, but they are not better educated. Twitter users are disproportionately members of elites in both countries. Twitter users also differ from other groups in their online activities and their attitudes. These biases and differences have important implications for research based on Twitter data. The unrepresentative characteristics of Twitter users suggest that Twitter data are not suitable for research where representativeness is important, such as forecasting elections or gaining insight into attitudes, sentiments, or activities of large populations. In general, Twitter data seem to be more suitable for corporate use than for social science research.
Sociological studies show that Internet access, skills, uses and outcomes vary between different population segments. However, we lack differentiated statistical evidence of the social characteristics of users of distinct social media platforms. We address this issue using a representative survey of Great Britain and investigate the social characteristics of six major social media platforms. We find that age and socioeconomic status are driving forces of several, but not all, of these platforms. The findings suggest that no social media platform is representative of the general population. The unrepresentativeness has major implications for research that uses social media as a data source. Social media data cannot be used to generalize to any population other than themselves.