Long-standing questions in marine viral ecology are centered on understanding how viral assemblages change along gradients in space and time. However, investigating these fundamental ecological questions has been challenging due to incomplete representation of naturally occurring viral diversity in single gene- or morphology-based studies and an inability to identify up to 90% of reads in viral metagenomes (viromes). Although protein clustering techniques provide a significant advance by helping organize this unknown metagenomic sequence space, they typically use only ∼75% of the data and rely on assembly methods not yet tuned for naturally occurring sequence variation. Here, we introduce an annotation- and assembly-free strategy for comparative metagenomics that combines shared k-mer and social network analyses (regression modeling). This robust statistical framework enables visualization of complex sample networks and determination of ecological factors driving community structure. Application to 32 viromes from the Pacific Ocean Virome dataset identified clusters of samples broadly delineated by photic zone and revealed that geographic region, depth, and proximity to shore were significant predictors of community structure. Within subsets of this dataset, depth, season, and oxygen concentration were significant drivers of viral community structure at a single open ocean station, whereas variability along onshore-offshore transects was driven by oxygen concentration in an area with an oxygen minimum zone and not depth or proximity to shore, as might be expected.
Together these results demonstrate that this highly scalable approach using complete metagenomic network-based comparisons can both test and generate hypotheses for ecological investigation of viral and microbial communities in nature.

virus | microbial ecology | Bayesian network

Microorganisms drive global biogeochemical cycles (1), with abundances and taxonomic composition tuned to spatiotemporally varying environmental conditions (2-5). Viruses then modulate these biogeochemical processes through mortality, horizontal gene transfer, and metabolic reprogramming (6). However, our understanding of how viral communities change in response to biological, physical, and chemical factors and host availability has been limited by technical challenges.

Most viruses in the ocean lack both cultivated representatives [85% of 1,100 sequenced phage genomes derive from only 3 of 45 bacterial phyla (7)] and a universally conserved marker gene (8); thus, metagenomics is commonly applied to characterize the ecology and evolution of viral assemblages. Problematically, however, our ability to investigate these assemblages via metagenomics remains limited by the lack of known viruses and viral proteins in biological sequence databases. The first viral metagenome (virome) used thousands of Sanger reads and found that 65% of sequences were unknown [i.e., no database match for reads >600 bp (9)]. Adoption of next-generation sequencing (NGS) technologies then generated hundreds...