Objective: To evaluate the performance of a deep learning model for hippocampal sclerosis classification on a clinical dataset and to provide plausible visual interpretations of the model's predictions. Methods: We used T2-weighted oblique coronal images from brain MRI scans acquired with an epilepsy protocol. The training set included 320 participants (160 without hippocampal sclerosis, 100 with left and 60 with right hippocampal sclerosis), and cross-validation was implemented. The test set consisted of 302 participants (252 without hippocampal sclerosis, 25 with left and 25 with right hippocampal sclerosis). Because the test set was imbalanced, we averaged the accuracy achieved within each group to obtain a balanced accuracy for the multiclass and binary classifications. The control group included not only healthy participants but also participants with abnormalities other than hippocampal sclerosis. We visualized the reasons for the model's predictions using the layer-wise relevance propagation method. Results: When evaluated by cross-validation on the training set, the voting ensemble of six models achieved multiclass and binary classification accuracies of 87.5% and 88.8%, respectively. When evaluated on the test set, we achieved multiclass and binary classification accuracies of 91.5% and 89.76%, respectively. Distinctly sparse visual interpretations were provided for each individual participant and for each group, indicating the contribution of each input voxel of the MRI to the prediction. Significance: The current interpretable deep learning-based model is promising for clinical use: it relies on commonly acquired data such as MRI, it was evaluated against the realistic abnormalities that neurologists encounter, and it supports the diagnosis of hippocampal sclerosis with plausible visual interpretations.
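The balanced accuracy and the voting ensemble described in the Methods and Results can be illustrated with a minimal Python sketch. This is not the study's code: the function names `balanced_accuracy` and `majority_vote`, the label strings, and the toy labels are all hypothetical and chosen only for illustration, and the tie-breaking rule for votes (first label encountered wins) is an assumption, since the abstract does not specify one.

```python
from collections import Counter, defaultdict


def balanced_accuracy(y_true, y_pred):
    """Average the accuracy obtained within each true group,
    so that every group counts equally regardless of its size."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    per_group = [correct[g] / total[g] for g in total]
    return sum(per_group) / len(per_group)


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per participant.
    Assumption: ties go to the label seen first among the models."""
    return [Counter(labels).most_common(1)[0][0]
            for labels in zip(*predictions_per_model)]


# Toy, made-up labels (not data from the study): an imbalanced set
# with 8 "no", 2 "left" and 2 "right" participants.
y_true = ["no"] * 8 + ["left"] * 2 + ["right"] * 2
y_pred = ["no"] * 8 + ["left", "no"] + ["right"] * 2

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)
```

In this toy case the plain accuracy is 11/12, while the balanced accuracy is (1.0 + 0.5 + 1.0) / 3, because the single error falls in a small group and is therefore weighted more heavily. This is the reason a per-group average is the more informative summary for a test set dominated by participants without hippocampal sclerosis.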