The receiver operating characteristic (ROC) surface, a generalization of the ROC curve, is widely used to assess the accuracy of a diagnostic test when there are three diagnostic categories. A common problem is verification bias, which arises when not all subjects have their true classes verified. In this paper, we consider the problem of estimating the ROC surface under verification bias. We adopt a Bayesian nonparametric approach that directly models the underlying distributions of the three categories with Dirichlet process mixture priors. We propose a robust computing algorithm that imposes only a missing at random assumption on the verification process and no assumption on the distributions. The method can also accommodate covariate information in estimating the ROC surface, which can lead to a more comprehensive understanding of the diagnostic accuracy. When there is no verification bias, the method simplifies considerably, and very fast computation is possible through the Bayesian bootstrap. We compare the proposed method with other commonly used methods in extensive simulations and find that it generally outperforms them. Applying the method to two real datasets, we obtain the following key findings: (1) human epididymis protein 4 has slightly better diagnostic ability than CA125 in discriminating among healthy subjects and early stage and late stage patients with epithelial ovarian cancer; (2) serum albumin has prognostic ability in distinguishing different stages of hepatocellular carcinoma.
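To make the case without verification bias concrete, the following is a minimal sketch of a Bayesian bootstrap for a three-class accuracy summary. It is not the algorithm proposed in this paper. It assumes fully verified data, a continuous test result with no ties, and classes ordered so that higher values indicate a more advanced class. As the summary it uses the volume under the ROC surface (VUS), taken here as P(X1 < X2 < X3), which is our choice for illustration. The function names `weighted_vus` and `bayesian_bootstrap_vus` are hypothetical.

```python
import numpy as np


def weighted_vus(x1, x2, x3, w1, w2, w3):
    """Weighted estimate of P(X1 < X2 < X3); ties are counted as misordered."""
    # For each class-2 value, the weighted mass of class-1 values below it.
    below = (x1[None, :] < x2[:, None]).astype(float) @ w1
    # For each class-2 value, the weighted mass of class-3 values above it.
    above = (x3[None, :] > x2[:, None]).astype(float) @ w3
    return float(np.sum(w2 * below * above))


def bayesian_bootstrap_vus(x1, x2, x3, n_draws=2000, seed=0):
    """Draws of the VUS obtained by reweighting each class with Dirichlet(1, ..., 1) weights."""
    rng = np.random.default_rng(seed)
    x1, x2, x3 = (np.asarray(a, dtype=float) for a in (x1, x2, x3))
    draws = np.empty(n_draws)
    for b in range(n_draws):
        # Independent flat Dirichlet weights for each class (assumed scheme).
        w1 = rng.dirichlet(np.ones(len(x1)))
        w2 = rng.dirichlet(np.ones(len(x2)))
        w3 = rng.dirichlet(np.ones(len(x3)))
        draws[b] = weighted_vus(x1, x2, x3, w1, w2, w3)
    return draws
```

Given the returned array `draws`, a point estimate and a 95% credible interval could be taken as `draws.mean()` and `np.quantile(draws, [0.025, 0.975])`. Each draw requires only weighted counts, with no Markov chain Monte Carlo sampling, which is consistent with the fast computation noted above.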