Background: This is a case study from an HIV prevention project among young Black men who have sex with men (YBMSM). Individual-level prevention interventions have had limited success among YBMSM, a population that is disproportionately affected by HIV; peer network-based interventions are a promising alternative. Facebook is an attractive digital platform because it allows social networks to be characterized broadly. There are, however, several challenges in using Facebook data for peer interventions, including the large size of Facebook networks, difficulty in assessing appropriate methods to identify candidate peer change agents (PCAs), boundary specification issues, and partial observation of social network data.
Objective: We aim to explore methodological challenges in using Facebook social networks to design peer network-based interventions for HIV prevention, and to present techniques to overcome these challenges.
Methods: Our sample contained 298 "uConnect" study respondents who answered a biobehavioral survey in person and whose Facebook friend lists were downloaded (2013-2014). The study participants had over 180,000 total Facebook friends who were not involved in the study ("nonrespondents"); we did not observe friendships between these nonrespondents. Given the large number of nonrespondents whose networks were partially observed, a relational boundary was specified to select nonrespondents who were "well connected" to the study respondents and who may be more likely to influence the health behaviors of YBMSM. A stochastic model-based imputation technique, derived from exponential random graph models (ERGMs), was applied to simulate 100 networks in which unobserved friendships between nonrespondents were imputed. To identify PCAs, the eigenvector centrality and keyplayer positive algorithms were used; both algorithms are well suited to identifying individuals in key network positions for information diffusion.
For both algorithms, we assessed the sensitivity of identified PCAs to the imputation model, the stability of identified PCAs across the imputed networks, and the effect of the boundary specification on the identification of PCAs.
Results: All respondents and 79% of nonrespondents selected as PCAs by eigenvector centrality on the imputed networks were also selected as PCAs on the observed networks. For keyplayer, the agreement was much lower: 43% of respondent PCAs and 33% of nonrespondent PCAs selected on the imputed networks were also selected on the observed network. Eigenvector centrality also produced a stable set of PCAs across the 100 imputed networks and was much less sensitive to the specified relational boundary.
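The stability assessment for eigenvector centrality can be illustrated with a minimal Python sketch. It is not the study's code and rests on several assumptions: the toy network, the function names, and the parameters `k`, `p`, and `n_sims` are hypothetical, and unobserved ties are imputed with independent Bernoulli draws as a deliberately simplified stand-in for the ERGM-based imputation. The keyplayer algorithm is not shown.

```python
import random
from itertools import combinations


def eigenvector_centrality(adj, max_iter=500, tol=1e-10):
    """Power iteration on (A + I); adj maps each node to a set of neighbours."""
    nodes = sorted(adj)
    x = {v: 1.0 for v in nodes}
    for _ in range(max_iter):
        # Adding x[v] itself (the +I shift) avoids oscillation on bipartite graphs.
        new = {v: x[v] + sum(x[u] for u in adj[v]) for v in nodes}
        norm = sum(val * val for val in new.values()) ** 0.5
        new = {v: val / norm for v, val in new.items()}
        if sum(abs(new[v] - x[v]) for v in nodes) < tol:
            return new
        x = new
    return x


def top_k(adj, k):
    """Return the k nodes with the highest eigenvector centrality (candidate PCAs)."""
    scores = eigenvector_centrality(adj)
    ranked = sorted(scores, key=lambda v: (-scores[v], v))
    return set(ranked[:k])


def impute(observed, unobserved_pairs, p, rng):
    """Assumption: each unobserved tie exists independently with probability p.
    This replaces the ERGM-based imputation purely for illustration."""
    adj = {v: set(nbrs) for v, nbrs in observed.items()}
    for u, v in unobserved_pairs:
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def selection_frequency(observed, unobserved_pairs, k, n_sims=100, p=0.1, seed=0):
    """Fraction of imputed networks in which each node is among the top-k."""
    rng = random.Random(seed)
    counts = {v: 0 for v in observed}
    for _ in range(n_sims):
        for v in top_k(impute(observed, unobserved_pairs, p, rng), k):
            counts[v] += 1
    return {v: c / n_sims for v, c in counts.items()}


# Hypothetical toy data: r1 and r2 are respondents; a-d are nonrespondents
# whose ties to one another are unobserved.
edges = [("r1", "r2"), ("r1", "a"), ("r1", "b"), ("r1", "c"),
         ("r2", "a"), ("r2", "b"), ("r2", "d")]
observed = {v: set() for edge in edges for v in edge}
for u, v in edges:
    observed[u].add(v)
    observed[v].add(u)
unobserved_pairs = list(combinations(["a", "b", "c", "d"], 2))

freq = selection_frequency(observed, unobserved_pairs, k=2, n_sims=100, p=0.3)
print(sorted(freq.items(), key=lambda item: -item[1]))
```

Nodes with a selection frequency close to 1 are chosen consistently across imputed networks, which is the sense of "stable" used here; comparing `top_k(observed, k)` with the high-frequency nodes gives a rough analogue of the agreement between observed and imputed networks.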
Conclusion: While we do not have a gold standard that tells us which algorithm produces the optimal set of PCAs, the lower sensitivity of eigenvector centrality to key assumptions leads us to conclude that it may be preferable. The methods we employed to address the challenges in using Facebook networks may prove timely given the rapidly increasing interest in using online social networks to impr...