In the presence of test speededness, the parameter estimates of item response theory models can be poorly estimated due to conditional dependencies among items, particularly for end‐of‐test items (i.e., speeded items). This article conducted a systematic comparison of five‐item calibration procedures—a two‐parameter logistic (2PL) model, a one‐dimensional mixture model, a two‐step strategy (a combination of the one‐dimensional mixture and the 2PL), a two‐dimensional mixture model, and a hybrid model‐–by examining how sample size, percentage of speeded examinees, percentage of missing responses, and way of scoring missing responses (incorrect vs. omitted) affect the item parameter estimation in speeded tests. For nonspeeded items, all five procedures showed similar results in recovering item parameters. For speeded items, the one‐dimensional mixture model, the two‐step strategy, and the two‐dimensional mixture model provided largely similar results and performed better than the 2PL model and the hybrid model in calibrating slope parameters. However, those three procedures performed similarly to the hybrid model in estimating intercept parameters. As expected, the 2PL model did not appear to be as accurate as the other models in recovering item parameters, especially when there were large numbers of examinees showing speededness and a high percentage of missing responses with incorrect scoring. Real data analysis further described the similarities and differences between the five procedures.