The statistical framework of rarefaction curves and asymptotic estimators allows for an effective standardization of biodiversity measures. However, most statistical analyses still consist of point comparisons of diversity estimators for a particular sampling level. We introduce new randomization methods that incorporate sampling variability encompassing the entire length of the rarefaction curve and allow for statistical comparison of i ≥ 2 individual-based, sample-based, or coverage-based rarefaction curves. These methods distinguish between two distinct null hypotheses: the ecological null hypothesis (H0eco) and the biogeographical null hypothesis (H0biog). H0eco states that the i samples were drawn from a single assemblage, and any differences among them in species richness, composition, or relative abundance reflect only sampling effects. H0biog states that the i samples were drawn from assemblages that differ in their species composition but share similar species richness and species abundance distributions. To test H0eco, we created a composite rarefaction curve by summing the abundances of all species from the i samples. We then calculated a test statistic Zeco, the (cumulative) summed areas of difference between each of the i individual curves and the composite curve. For H0biog, the test statistic Zbiog was calculated by summing the area of difference between all possible pairs of the i individual curves. Bootstrap sampling from the composite curve (H0eco) or random sampling from different simulated assemblages using alternative abundance distributions (H0biog) was used to create the null distribution of Z, and to provide a frequentist test of Z | H0. Rejection of H0eco does not pinpoint whether the samples differ in species richness, species composition, and/or relative abundance. In benchmark comparisons, both tests performed satisfactorily against artificial data sets randomly drawn from a single assemblage (low Type I error).
In benchmark comparisons with different species abundance distributions and richness, the tests had adequate power to detect differences among curves (low Type II error), although power diminished at small sample sizes and for small differences among underlying species rank abundances.