Van der Linden's (2007, Psychometrika, 72, 287) hierarchical model for responses and response times in tests has numerous applications in psychological assessment. These applications succeed only if the parameters of the model have been estimated without bias. The data used for model fitting, however, are often contaminated, for example by rapid guesses or lapses of attention, and such contamination distorts the parameter estimates. In the present paper, a novel estimation approach is proposed that is robust against contamination. The approach consists of two steps. In the first step, the response time model is fitted on the basis of a robust estimate of the covariance matrix. In the second step, the item response model is extended to a mixture model that allows for a proportion of irregular responses in the data. The parameters of the mixture model are then estimated with a modified marginal maximum likelihood estimator that downweights the responses of test-takers with unusual response time patterns. As a result, the estimator is resistant to several forms of data contamination. The robustness of the approach is investigated in a simulation study, and an application of the estimator is demonstrated with real data.
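
The downweighting idea can be illustrated with a minimal sketch. It is not the estimator proposed in the paper: instead of a robust covariance matrix, it standardizes log response times item by item with the median and the median absolute deviation (MAD), which ignores correlations between items, and it then applies a Huber-type weight to each test-taker's overall distance. The function name `robust_weights`, the `cutoff` value, and the toy data in the usage note are hypothetical choices made for illustration only.

```python
import math
import statistics


def robust_weights(log_times, cutoff=3.0):
    """Return one weight in (0, 1] per test-taker.

    log_times: list of rows, one row per test-taker, holding the log
    response time on each item (all rows have the same length).
    Assumption: a coordinate-wise median/MAD standardization stands in
    for the robust covariance estimate described in the text.
    """
    n_items = len(log_times[0])

    # Robust location and scale for each item.
    centers, scales = [], []
    for i in range(n_items):
        column = [row[i] for row in log_times]
        med = statistics.median(column)
        # 1.4826 makes the MAD consistent with the normal standard deviation.
        mad = 1.4826 * statistics.median([abs(x - med) for x in column])
        centers.append(med)
        scales.append(mad if mad > 0 else 1.0)  # guard against zero scale

    weights = []
    for row in log_times:
        # Root mean square of the robust z-scores across items.
        squared = [((x - c) / s) ** 2 for x, c, s in zip(row, centers, scales)]
        distance = math.sqrt(sum(squared) / n_items)
        # Huber-type weight: full weight inside the cutoff, shrinking outside.
        weights.append(1.0 if distance <= cutoff else (cutoff / distance) ** 2)
    return weights
```

As a usage example, consider five test-takers with log response times near 4 on three items and a sixth with log response times near 1, as a rapid guesser might produce. Calling `robust_weights` on these rows leaves the first five at weight 1.0 and shrinks the sixth towards zero. In a weighted marginal likelihood, such weights would multiply each test-taker's log-likelihood contribution; how the paper constructs its weights may differ.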