Low-coverage next-generation sequencing methodologies are routinely employed to genotype large populations. Missing data in these populations manifest both as missing markers and as markers with incomplete allele recovery. False homozygous calls at heterozygous sites resulting from incomplete allele recovery confound many existing imputation algorithms. These types of systematic errors can be minimized by incorporating depth-of-sequencing read coverage into the imputation algorithm. Accordingly, we developed Low-Coverage Biallelic Impute (LB-Impute) to resolve missing data issues. LB-Impute uses a hidden Markov model that incorporates marker read coverage to determine variable emission probabilities. Robust, highly accurate imputation results were reliably obtained with LB-Impute, even at extremely low (<1×) average per-marker coverage. This finding will have implications for the design of genotype imputation algorithms in the future. LB-Impute is publicly available on GitHub at https://github.com/dellaportalaboratory/LB-Impute.

KEYWORDS hidden Markov models; imputation; next-generation sequencing; population genetics; plant genomics

The imputation of missing genotype data has been a key research topic in statistical genetics since well before the advent of next-generation sequencing (NGS) technologies. The goal of many of these algorithms was to reconstruct haplotypes from Sanger- or microarray-based genotyping, usually in human populations. Strategies employing the expectation-maximization algorithm (Hawley and Kidd 1995; Long et al. 1995; Qin et al. 2002; Scheet and Stephens 2006), Bayesian inference (Stephens and Donnelly 2003), or Markovian methodology (Stephens et al. 2001; Broman et al. 2003; Broman and Sen 2009) to infer local ancestry and gametic phase could be used to resolve missing markers within a population (Browning and Browning 2011). In these cases, missing genotypes were assigned based on the most likely proximal haplotypes. These computational methods greatly increased the information content of genotyping data, especially for population studies (Spencer et al. 2009; Cleveland et al. 2011). While these programs were powerful and accurate, they could also be computationally expensive. Further, they assumed that available genotypes were largely correct, which could cause issues with sequencing data sets.

The development of programs that focused primarily on the imputation of missing data and haplotype phasing was likely motivated by several factors. Genome-wide association studies could be enhanced by the inference of additional markers using large multipopulation data sets such as the International HapMap Project (International HapMap Consortium et al. 2010). The emergence of meta-analyses led to a need for algorithms that could merge disparate data sets (Howie et al. 2009; Li et al. 2010; Liu et al. 2013; Fuchsberger et al. 2015). These algorithms often employed large haplotype reference panels to improve imputation (Marchini et al. 2007; Browning and Browning 2009; Howie et al. 2009). In bialleli...