Purpose: To analyze the most recent results of the Imaging and Radiation Oncology Core Houston Quality Assurance Center's (IROC-H) anthropomorphic head and neck (H&N) phantom to determine the nature of failing irradiations and the feasibility of altering credentialing criteria. Methods: IROC-H's H&N phantom, used for intensity-modulated radiation therapy credentialing for National Cancer Institute-sponsored clinical trials, requires that an institution's treatment plan agree with measured thermoluminescent dosimeter (TLD) doses to within ±7%; it also requires that ≥85% of film pixels pass a 7%/4 mm gamma analysis (7% dose difference, 4 mm distance to agreement). The authors re-evaluated 156 phantom irradiations (November 1, 2014 to October 31, 2015) according to the following tighter criteria: (1) 5% TLD and 5%/4 mm, (2) 5% TLD and 5%/3 mm, (3) 4% TLD and 4%/4 mm, and (4) 3% TLD and 3%/3 mm. Failure rates were evaluated with respect to individual film and TLD performance by location in the phantom. Overall poor phantom results were characterized qualitatively as systematic errors (correct shape and position but wrong magnitude of dose), setup errors/positional shifts, global but nonsystematic errors, and errors affecting only a local region. Results: The pass rate for these phantoms using current criteria was 90%. Substituting criteria 1-4 reduced the overall pass rate to 77%, 70%, 63%, and 37%, respectively. Statistical analyses indicated that the probability of noise-induced TLD failure, even at the 5% criterion, was <0.5%. Phantom failures were generally identified by TLD (≥66% failed TLD, whereas ≥55% failed film), with most failures occurring in the primary planning target volume (≥77% of cases). Results failing current criteria or criteria 1 were most often diagnosed as systematic errors (>58% of cases; 11/16 and 21/36 cases, respectively), more often due to underdosing. Setup/positioning errors were seen in 11%-13% of all failing cases (2/16 and 4/36 cases, respectively). 
Local errors (8/36 cases) could be demonstrated only under criteria 1. Only three cases of global errors were identified in these analyses. For current criteria and criteria 1, irradiations that failed on film alone were overwhelmingly associated with phantom shifts/setup errors (≥80% of cases). Conclusions: This study highlighted that the majority of phantom failures are the result of systematic dosimetric discrepancies between the treatment planning system and the delivered dose. Further work is necessary to diagnose and resolve such dosimetric inaccuracy. In addition, the authors found that 5% TLD and 5%/4 mm gamma criteria may be both practically and theoretically achievable as an alternative to current criteria.
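
The pass/fail logic described in the Methods can be illustrated with a minimal Python sketch. It is not IROC-H's software: the names `Criteria`, `tld_passes`, `film_passes`, and `irradiation_passes` are hypothetical, the TLD result is assumed to be expressed as a measured-to-planned dose ratio, the gamma pixel pass fractions are assumed to have been computed beforehand for the criterion being tested, and the ≥85% pixel threshold is assumed to carry over unchanged to the tighter criteria.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Criteria:
    """One set of credentialing criteria (hypothetical container)."""
    name: str
    tld_tolerance: float          # fractional TLD tolerance, e.g. 0.07 for ±7%
    gamma_dose: float             # fractional dose-difference part of gamma
    gamma_dta_mm: float           # distance-to-agreement part of gamma, in mm
    min_pixel_pass: float = 0.85  # assumption: 85% threshold kept for all sets


CURRENT = Criteria("current", 0.07, 0.07, 4.0)
TIGHTER = [
    Criteria("1", 0.05, 0.05, 4.0),
    Criteria("2", 0.05, 0.05, 3.0),
    Criteria("3", 0.04, 0.04, 4.0),
    Criteria("4", 0.03, 0.03, 3.0),
]


def tld_passes(dose_ratios: Sequence[float], criteria: Criteria) -> bool:
    """True if every TLD measured/planned dose ratio is within tolerance."""
    return all(abs(r - 1.0) <= criteria.tld_tolerance for r in dose_ratios)


def film_passes(pixel_pass_fractions: Sequence[float], criteria: Criteria) -> bool:
    """True if every film meets the minimum gamma pixel pass fraction.

    The fractions must already have been computed with the gamma
    dose/DTA values of `criteria`; this function does not compute gamma.
    """
    return all(f >= criteria.min_pixel_pass for f in pixel_pass_fractions)


def irradiation_passes(dose_ratios: Sequence[float],
                       pixel_pass_fractions: Sequence[float],
                       criteria: Criteria) -> bool:
    """An irradiation passes only if both the TLD and film checks pass."""
    return tld_passes(dose_ratios, criteria) and film_passes(
        pixel_pass_fractions, criteria
    )
```

For example, an irradiation with one TLD reading 6% low passes the TLD check under the current ±7% tolerance but fails it under the 5% tolerance of criteria 1, regardless of its film result.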