Hepatitis C virus (HCV) is a leading worldwide cause of liver disease. Here, we use a new model of HCV spread to investigate the epidemic behavior of the virus and to estimate its basic reproductive number from gene sequence data. We find significant differences in epidemic behavior among HCV subtypes and suggest that these differences are largely the result of subtype-specific transmission patterns. Our model builds a bridge between the disciplines of population genetics and mathematical epidemiology by using pathogen gene sequences to infer the population dynamic history of an infectious disease.

An estimated 170 million people worldwide are at risk of liver cirrhosis and liver cancer due to chronic infection with HCV (1). The virus is responsible for 10,000 deaths per year in the United States, and this rate is expected to increase substantially in the next two decades (2). HCV is a rapidly evolving single-stranded positive-sense RNA virus that exhibits enormous genetic diversity. It is classified into six types (labeled 1 through 6) and numerous subtypes (labeled 1a, 1b, etc.), which differ in diversity, geographical distribution, and transmission route (3). Subtypes appear to differ in treatment response, although their role in variation of disease progression is unclear (2, 4). Any successful HCV vaccination or control strategy, therefore, requires an understanding of the nature and variability of epidemic behavior among subtypes.

HCV was first isolated in 1989, and knowledge of its long-term epidemiology before that date is limited. Highly divergent strains have been found in restricted geographic areas such as West Africa and Southeast Asia, suggesting a long period of infection in these regions.
In contrast, several globally prevalent subtypes are much less divergent, indicating a recent worldwide spread of these strains (5-7).

We investigate HCV epidemiology using coalescent theory, a population genetic model that describes how the demographic history of a population determines the ancestral relationships of individuals sampled from it (8, 9). Phylogenies reconstructed from contemporary HCV gene sequences contain information about past population dynamics and can, therefore, be used to infer viral epidemic behavior (10). We also demonstrate one way in which the fundamental epidemiological quantity R0 (the basic reproductive number of a pathogen) can be estimated from gene sequences. R0 represents the average number of secondary infections generated by one primary case in a susceptible population and can be used to estimate the level of immunization or behavioral change required to control an epidemic (11).

The framework of coalescent theory allows us to estimate N(t), a continuous function that represents the effective number of infections at time t. Time t is zero at the present and increases into the past, so N(0) is the effective number of infections at the present. N(t) can be considered as the inbreeding effective population size of the viral epidemic (12). Previous viral coalescent studies have ...
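
To illustrate why a phylogeny carries information about N(t), the following minimal sketch uses the simplest coalescent setting, a constant effective size, which is an assumption made here for illustration and not the time-varying model estimated in this study. Under the standard coalescent, while k lineages remain, the waiting time to the next coalescent event (looking backward from the present) is exponentially distributed with rate k(k-1)/2 divided by N, with time measured in units where a single pair of lineages coalesces at rate 1/N. The function names and parameter values below are hypothetical.

```python
import random


def simulate_coalescent_intervals(n_samples, n_eff, rng):
    """Simulate inter-coalescent waiting times for a constant effective size.

    Returns a list of (k, t_k) pairs, where t_k is the time during which
    k lineages exist, going backward from the present.
    Assumption: a pair of lineages coalesces at rate 1 / n_eff.
    """
    intervals = []
    for k in range(n_samples, 1, -1):
        pairs = k * (k - 1) / 2            # number of lineage pairs
        rate = pairs / n_eff               # total coalescent rate with k lineages
        intervals.append((k, rng.expovariate(rate)))
    return intervals


def estimate_constant_size(intervals):
    """Maximum-likelihood estimate of a constant effective size.

    With c_k = k(k-1)/2, the log-likelihood is
    -(n-1) * log(N) - sum(c_k * t_k) / N, which is maximized at
    N = sum(c_k * t_k) / (n - 1).
    """
    total = sum(k * (k - 1) / 2 * t for k, t in intervals)
    return total / len(intervals)


rng = random.Random(1)                     # fixed seed for reproducibility
intervals = simulate_coalescent_intervals(200, 1000.0, rng)
n_hat = estimate_constant_size(intervals)
print(f"estimated effective size: {n_hat:.1f}")
```

Short intervals between coalescent events point to a small effective number of infections, and long intervals to a large one. Allowing N to change with t extends the same reasoning to epidemic histories such as exponential growth.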