Hippocampal atrophy rate, measured using automated techniques applied to structural MRI scans, is considered a sensitive marker of disease progression in Alzheimer’s disease and is frequently used as an outcome measure in clinical trials. Using publicly accessible data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI), we examined one-year hippocampal atrophy rates generated by each of five automated or semi-automated hippocampal segmentation algorithms in patients with Alzheimer’s disease, subjects with mild cognitive impairment, and elderly controls. We examined MRI data from subjects scanned at both baseline and one year: 398 at an MRI field strength of 1.5T and 62 at 3T. We observed a high rate of hippocampal segmentation failures across all algorithms and diagnostic categories, with only 50.8% of subjects at 1.5T and 58.1% of subjects at 3T passing stringent segmentation quality control. We also found that every algorithm identified a proportion of subjects (between 2.94% and 48.68%) across all diagnostic categories showing increases in hippocampal volume over one year. For any given algorithm, hippocampal “growth” could not be entirely explained by flawed hippocampal segmentations, scan-rescan variability, or MRI field strength. Furthermore, different algorithms did not consistently identify the same subjects as hippocampal “growers”, and they showed very poor concordance in their estimates of the magnitude of hippocampal volume change over time (intraclass correlation coefficient 0.319 at 1.5T and 0.149 at 3T). This precluded a meaningful analysis of whether hippocampal “growth” represents a true biological phenomenon. Taken together, our findings suggest that longitudinal hippocampal volume change should be interpreted with considerable caution as a biomarker.
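To make the quantities above concrete, the following is a minimal sketch of how a one-year percent volume change and the proportion of apparent “growers” might be computed from paired volumes. It assumes a simple percent-change definition relative to baseline, which may differ from what any individual segmentation algorithm reports. The function names and example volumes are hypothetical and are not drawn from ADNI.

```python
def percent_change(baseline_mm3, followup_mm3):
    """Percent volume change relative to baseline (negative = atrophy).

    Assumed definition for illustration only.
    """
    if baseline_mm3 <= 0:
        raise ValueError("baseline volume must be positive")
    return 100.0 * (followup_mm3 - baseline_mm3) / baseline_mm3


def grower_fraction(volume_pairs):
    """Percentage of subjects whose follow-up volume exceeds baseline."""
    if not volume_pairs:
        raise ValueError("at least one (baseline, follow-up) pair is required")
    changes = [percent_change(b, f) for b, f in volume_pairs]
    growers = sum(1 for c in changes if c > 0)
    return 100.0 * growers / len(changes)


# Hypothetical (baseline, one-year follow-up) volumes in mm^3; not real data.
example_pairs = [
    (3000.0, 2900.0),  # atrophy
    (2800.0, 2850.0),  # apparent "growth"
    (3200.0, 3100.0),  # atrophy
    (2500.0, 2500.0),  # no change, not counted as a grower
]

if __name__ == "__main__":
    for baseline, followup in example_pairs:
        print(f"{baseline:.0f} -> {followup:.0f}: "
              f"{percent_change(baseline, followup):+.2f}%")
    print(f"growers: {grower_fraction(example_pairs):.1f}% of subjects")
```

In this sketch any positive change counts as “growth”; a stricter analysis could require the change to exceed an estimate of scan-rescan variability before labeling a subject a grower.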