To understand the state and trends of biodiversity beyond the scope of monitoring programs, biodiversity indicators must be comparable across inventories. Species richness (SR) is one of the most widely used biodiversity indicators. However, because SR increases with the size of the area sampled, SR values from inventories that use different plot sizes cannot be compared directly. This study aims to develop a methodological framework that enables SR comparisons across plot‐based inventories with differing plot sizes. We used National Forest Inventory (NFI) data from Norway, Slovakia, Spain, and Switzerland to build sample‐based rarefaction curves, which represent the relationship between SR and sampled area, by aggregating plots incrementally in random order. Because aggregated plots can be far apart and subject to different environmental conditions, we estimated the amount of environmental heterogeneity (EH) introduced by the aggregation process. By correcting for this EH, we produced adjusted rarefaction curves that mimic the sampling of environmentally homogeneous forest stands, thus reducing the effect of plot size and enabling reliable SR comparisons between inventories. Models were built using the Conway–Maxwell–Poisson distribution to account for underdispersion in the SR data. Our method successfully corrected for the EH introduced during the aggregation process in all countries, with better performance in Norway and Switzerland. We further found that SR comparisons across countries based on the country‐specific NFI plot sizes are misleading, and that our approach offers an opportunity to harmonize pan‐European SR monitoring. Our method thus provides reliable and comparable SR estimates for inventories that use different plot sizes. It can be applied to any plot‐based inventory and to count data other than SR, allowing a more comprehensive assessment of biodiversity across scales and ecosystems.
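
The aggregation step behind a sample‐based rarefaction curve can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `rarefaction_curve`, the toy species lists, and the 250 m² plot area are assumptions made purely for illustration, and neither the EH correction nor the Conway–Maxwell–Poisson modeling is included.

```python
import random


def rarefaction_curve(plots, plot_area, n_perm=100, seed=0):
    """Sample-based rarefaction by random incremental plot aggregation.

    plots     : list of sets, each holding the species recorded on one plot
    plot_area : area of a single plot (all plots assumed equal in size)
    n_perm    : number of random plot orderings to average over
    Returns a list of (cumulative_area, mean_cumulative_richness) pairs.
    """
    rng = random.Random(seed)
    n = len(plots)
    totals = [0] * n  # summed cumulative richness at each aggregation step

    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)           # random order of aggregation
        seen = set()
        for step, idx in enumerate(order):
            seen |= plots[idx]       # add the species of the next plot
            totals[step] += len(seen)

    return [((k + 1) * plot_area, totals[k] / n_perm) for k in range(n)]


# Hypothetical toy inventory: four plots of 250 m² each (illustrative values).
plots = [
    {"Picea abies", "Betula pendula"},
    {"Picea abies"},
    {"Pinus sylvestris", "Betula pendula"},
    {"Fagus sylvatica"},
]
curve = rarefaction_curve(plots, plot_area=250.0)
for area, richness in curve:
    print(f"{area:7.1f} m2  mean SR = {richness:.2f}")
```

Averaging over many random orderings smooths the curve; its last point always equals the total number of species in the pooled plots, while the intermediate points show how expected SR grows with sampled area.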