Hepatitis C virus (HCV) replication in infected patients produces large and diverse viral populations, which give rise to drug-resistant and immune escape variants. Here, we analyzed HCV populations during transmission and diversification in longitudinal and cross-sectional samples using 454/Roche pyrosequencing, in total analyzing 174,185 sequence reads. To sample diversity, four locations in the HCV genome were analyzed, ranging from high diversity (the envelope hypervariable region 1 [HVR1]) to almost no diversity (the 5′ untranslated region [UTR]). For three longitudinal samples for which early time points were available, we found that only 1 to 4 viral variants were present, suggesting that productive infection was initiated by a very small number of HCV particles. Sequence diversity accumulated subsequently, with the 5′ UTR showing almost no diversification while the envelope HVR1 showed >100 variants in some subjects. Calculation of the transmission probability for only a single variant, taking into account the measured population structure within patients, confirmed initial infection by one or a few viral particles. These findings provide the most detailed sequence-based analysis of HCV transmission bottlenecks to date. The analytical methods described here are broadly applicable to studies of viral diversity using deep sequencing.

Hepatitis C virus (HCV) is a positive-strand enveloped RNA virus of the flavivirus family. HCV infects ~170 million people worldwide with a high rate of persistence (1, 2) and is a major etiological agent of chronic hepatitis, liver cirrhosis, and hepatocellular carcinoma. The current standard of therapy is the combined use of pegylated alpha interferon (IFN-α) and ribavirin (9), although there are substantial limitations due to toxicity and resistance profiles (47).
Recent development of various small-molecule inhibitors that specifically target HCV offers some promise (13), but challenges remain because the size and diversity of viral populations promote rapid development of drug resistance (28, 42). In an infected individual, serum HCV RNA levels can reach 10 to 100 million IU/ml (40). The viral RNA polymerase is estimated to make 1 error per 10,000 to 100,000 bp copied (22); with a viral genome of only 9,600 bases, this corresponds to roughly 0.1 to 1 error per genome copied, resulting in diversification of the viral population, so that most viral genomes differ in sequence from the population consensus (16, 20, 21). Thus, when antiviral pressure is exerted on a viral population, sequence variants with reduced sensitivity may expand (30, 41) and cause resistance (37). Consistent with this, differential sequence diversity in HCV populations has been linked to clinical outcome (7, 8).

The size and complexity of HCV populations have made their analysis challenging. However, new deep-sequencing and bioinformatics methods are well suited to addressing this problem. Using the 454/Roche technology, it is possible to generate more than 10^8 bases of DNA sequence in a single 1-day run, albeit in fragments...