In health care institutions, medical specialty information may be lacking or inaccurate, in part because there is no official code to express some specialties. Diagnosis histories offer information on which medical specialties may exist in practice, regardless of whether they have official codes. We refer to specialties that diagnosis histories predict with high certainty as de facto diagnosis specialties. The objective of our research is to discover de facto diagnosis specialties under a general discovery-evaluation framework. Specifically, we employ a semi-supervised learning model (based on heterogeneous information network analysis) and an unsupervised learning method (based on topic modeling) for discovery. We further employ four supervised learning models for evaluation. We use one year of diagnosis histories from a major medical center, organized into two data sets. One is fine-grained and has diagnoses assigned to 41,603 patients that are accessed by 2,504 medical service providers. The other is general and has diagnoses assigned to 291,562 patients that are accessed by 3,269 medical service providers. The semi-supervised learning model discovers a specialty for Breast Cancer on the fine-grained data set, while the unsupervised learning method confirms this discovery and suggests another specialty for Obesity on the larger general data set. The evaluation results confirm that supervised learning models can recognize these two specialties accurately, in comparison with 12 common diagnosis specialties defined by the Health Care Provider Taxonomy Code Set.
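The evaluation half of the framework can be illustrated with a minimal sketch: train a supervised classifier on providers' diagnosis histories and check whether a candidate specialty is recognized about as well as specialties that already have codes. Everything in the sketch is an assumption made for illustration. The multinomial Naive Bayes classifier is a stand-in and is not claimed to be one of the four models used in the study. The diagnosis labels, specialty names, and toy histories are invented, and the function names `train_nb`, `predict_nb`, and `f1_per_class` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(histories, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model on bag-of-diagnosis histories.

    Stand-in classifier chosen for brevity (an assumption, not the study's model).
    """
    vocab = {code for history in histories for code in history}
    class_counts = Counter(labels)
    code_counts = defaultdict(Counter)
    for history, label in zip(histories, labels):
        code_counts[label].update(history)

    model = {}
    for label, n in class_counts.items():
        # Laplace smoothing keeps unseen diagnosis codes from zeroing a class.
        total = sum(code_counts[label].values()) + alpha * len(vocab)
        log_prior = math.log(n / len(labels))
        log_lik = {c: math.log((code_counts[label][c] + alpha) / total) for c in vocab}
        model[label] = (log_prior, log_lik)
    return model


def predict_nb(model, history):
    """Return the specialty with the highest posterior log-score."""
    def score(label):
        log_prior, log_lik = model[label]
        # Codes never seen during training are ignored.
        return log_prior + sum(log_lik[c] for c in history if c in log_lik)
    return max(sorted(model), key=score)


def f1_per_class(true, pred):
    """Compute the F1 score of each specialty from true and predicted labels."""
    scores = {}
    for label in set(true):
        tp = sum(t == label and p == label for t, p in zip(true, pred))
        fp = sum(t != label and p == label for t, p in zip(true, pred))
        fn = sum(t == label and p != label for t, p in zip(true, pred))
        scores[label] = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return scores


# Invented toy data: one diagnosis history per provider (not from the study).
train_histories = [
    ["breast_neoplasm", "mastectomy_followup", "breast_neoplasm"],
    ["breast_neoplasm", "chemo_encounter"],
    ["chest_pain", "hypertension", "arrhythmia"],
    ["hypertension", "heart_failure"],
    ["dermatitis", "acne"],
    ["psoriasis", "dermatitis"],
]
train_labels = [
    "breast_cancer", "breast_cancer",  # candidate de facto specialty
    "cardiology", "cardiology",        # specialties with existing codes
    "dermatology", "dermatology",
]
test_histories = [
    ["breast_neoplasm", "hypertension", "chemo_encounter"],
    ["arrhythmia", "heart_failure", "chest_pain"],
    ["acne", "psoriasis"],
]
test_labels = ["breast_cancer", "cardiology", "dermatology"]

model = train_nb(train_histories, train_labels)
predictions = [predict_nb(model, h) for h in test_histories]
scores = f1_per_class(test_labels, predictions)

# Compare the candidate's F1 with the F1 of the coded specialties.
for specialty in sorted(scores):
    print(f"{specialty}: F1 = {scores[specialty]:.2f}")
```

In this reading, a candidate specialty supports the de facto label when its score is comparable to the scores of the coded specialties. On real data the comparison would need held-out providers or cross-validation rather than a three-example test set.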