human epithelial type 2 interphase cells classification methods on a very large dataset, Artificial Intelligence In Medicine (2015), http://dx.doi.org/10.1016/j.artmed.2015.08.001
Abstract
Objective. This paper presents benchmarking results of human epithelial type 2 (HEp-2) interphase cell image classification methods on a very large dataset. The indirect immunofluorescence method applied to HEp-2 cells has been the gold standard for identifying connective tissue diseases such as systemic lupus erythematosus and Sjögren's syndrome. However, the method suffers from numerous issues: it is subjective, time-consuming and labour-intensive. This has been the main motivation for the development of various computer-aided diagnosis systems whose main task is to automatically classify a given cell image into one of the predefined classes.
Methods and material. The benchmarking was performed in the form of an international competition held in conjunction with the International Conference on Image Processing in 2013. Fourteen teams, composed of practitioners and researchers in this area, took part in the initiative. The system developed by each team was trained and tested on a very large HEp-2 cell dataset comprising over 68,000 images of HEp-2 cells. The dataset contains cells with six different staining patterns and two levels of fluorescence intensity. For each method we provide a brief description highlighting the design choices.
Results. The staining pattern recognition accuracy attained by the methods varies between 47.91% and slightly above 83.65%. However, the difference between the top performing method and the seventh ranked method is only 5%. In the paper, we also study the performance achieved by fusing the best methods, finding that a recognition rate of 85.60% is reached when the top seven methods are employed.
Conclusions. We found that the highest performance is obtained when using a strong classifier (typically a kernelised support vector machine) in conjunction with features extracted from local statistics. Furthermore, the misclassification profiles of the different methods highlight that some staining patterns are intrinsically more difficult to recognize. We also noted that performance is strongly affected by the fluorescence intensity level. Thus, low accuracy is to be expected when analyzing low-contrast images.
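As an illustration of the fusion experiment mentioned in the Results, the sketch below combines the per-image labels of several classifiers by plurality vote. The abstract does not state which fusion rule was used, so the voting rule, the tie-breaking policy, the function names (`majority_vote`, `fuse`, `accuracy`) and the toy labels are all assumptions made for illustration only; they are not the benchmark's method or data.

```python
from collections import Counter


def majority_vote(predictions):
    """Fuse the labels that several methods assign to one cell image.

    `predictions` is ordered from best-ranked to worst-ranked method.
    Ties are broken in favour of the earliest (best-ranked) method
    whose label is among the most frequent ones (assumed policy).
    """
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label


def fuse(per_method_labels):
    """Fuse label lists (one list per method, aligned by image)."""
    return [majority_vote(list(votes)) for votes in zip(*per_method_labels)]


def accuracy(predicted, truth):
    """Fraction of images whose predicted label matches the ground truth."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# Hypothetical toy labels for four images and three methods.
truth = ["a", "b", "c", "a"]
method_1 = ["a", "b", "a", "a"]
method_2 = ["a", "c", "c", "b"]
method_3 = ["b", "b", "c", "a"]

fused = fuse([method_1, method_2, method_3])
print("single best:", accuracy(method_1, truth))
print("fused:", accuracy(fused, truth))
```

In this toy case each method makes errors on different images, so the vote corrects them; this is the general reason an ensemble can exceed its best member, as the reported rise from slightly above 83.65% to 85.60% suggests.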