Genetic diversity is frequently described using heterozygosity, particularly in a conservation context. Heterozygosity is often estimated from single nucleotide polymorphisms (SNPs); however, estimates calculated from SNPs can be biased by both study design and filtering parameters. Although solutions have been proposed to address these issues, we have found them to be inadequate in some circumstances. Here, we aimed to improve the reliability and comparability of heterozygosity estimates, specifically by investigating how sample size and missing data thresholds influence the calculation of autosomal heterozygosity (heterozygosity calculated across the genome, i.e. from both fixed and variable sites). We also explored how the standard practice of excluding tri‐ and tetra‐allelic sites could bias heterozygosity estimates and influence eventual conclusions about genetic diversity. Across three distinct taxa (a frog, Litoria rubella; a tree, Eucalyptus microcarpa; and a grasshopper, Keyacris scurra), we found heterozygosity estimates to be meaningfully affected by sample size and missing data thresholds, partly because of the exclusion of tri‐ and tetra‐allelic sites. These biases were inconsistent both between species and between populations, with estimates for more diverse populations tending to be more severely affected, and so have the potential to dramatically alter interpretations of genetic diversity. We propose a modified framework for calculating heterozygosity that reduces bias and improves the utility of heterozygosity as a measure of genetic diversity, and we highlight the need to adjust existing population genetic pipelines so that tri‐ and tetra‐allelic sites are included in calculations.
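To illustrate the kind of calculation discussed above, the following minimal Python sketch computes autosomal heterozygosity for one individual as the proportion of heterozygous genotypes among all called sites, fixed and variable alike. It is not the framework proposed here: the genotype encoding (a pair of allele indices, with `None` for a missing call), the function names, the `max_alleles` parameter and the four-site data set are all hypothetical and chosen only for illustration.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical encoding: a diploid genotype is a pair of allele indices;
# None marks a missing call.
Genotype = Optional[Tuple[int, int]]


def site_allele_count(site: Sequence[Genotype]) -> int:
    """Count the distinct alleles observed at one site across all individuals."""
    return len({allele for gt in site if gt is not None for allele in gt})


def autosomal_heterozygosity(
    sites: Sequence[Sequence[Genotype]],
    individual: int,
    max_alleles: Optional[int] = None,
) -> float:
    """Proportion of heterozygous calls among all called sites for one individual.

    Invariant sites count towards the denominator. If max_alleles is given,
    sites with more observed alleles are skipped (max_alleles=2 mimics the
    exclusion of tri- and tetra-allelic sites).
    """
    called = 0
    heterozygous = 0
    for site in sites:
        if max_alleles is not None and site_allele_count(site) > max_alleles:
            continue  # site excluded by the allele-count filter
        gt = site[individual]
        if gt is None:
            continue  # missing call: contributes to neither count
        called += 1
        if gt[0] != gt[1]:
            heterozygous += 1
    return heterozygous / called if called else float("nan")


# Toy data (invented): rows are sites, columns are three individuals.
sites = [
    [(0, 0), (0, 0), (0, 0)],  # invariant site
    [(0, 1), (0, 0), (1, 1)],  # bi-allelic site
    [(0, 1), (1, 2), (0, 2)],  # tri-allelic site
    [(0, 0), None, (0, 0)],    # invariant site with one missing call
]

all_sites = autosomal_heterozygosity(sites, individual=0)
biallelic_only = autosomal_heterozygosity(sites, individual=0, max_alleles=2)
print(all_sites, biallelic_only)
```

In this toy data set the tri‐allelic site is heterozygous in every individual, so filtering it out lowers each individual's estimate; the size and consistency of any such effect in real data depend on the data set.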