Background
Accurately determining the epidemiology of dermatological diseases such as hidradenitis suppurativa (HS), psoriasis (PsO), chronic urticaria (CU) and atopic dermatitis (AD) is challenging because reported prevalence and disease severity vary across the literature.

Objectives
The DERMACLEAR study aims to use natural language processing (NLP) to assess the proportions of patients with HS, PsO, CU and/or AD, and to obtain information on patient profiles, patient journeys, and disease and healthcare burden in Spain. Here, the study design and objectives of the DERMACLEAR study are described and the precision of the NLP system used is assessed.

Methods
This study will retrospectively collect patient information from electronic health records (EHRs) at the dermatology departments of seven tertiary hospitals in Spain. The NLP system was developed by IOMED Medical Solutions and was verified internally (by the IOMED scientific team) and externally (by the principal investigators of each hospital) to determine its precision in identifying patients with HS, PsO, CU and/or AD. Internal verification was also performed on other medical variables relevant to the study.

Results
To date, the DERMACLEAR study has retrospectively collected data from 54,458 patients with HS, PsO, CU and/or AD (HS: 5045; PsO: 32,559; CU: 8397; AD: 12,492). The average precision of the NLP system in identifying patients diagnosed with HS, PsO, CU and/or AD across all hospitals exceeded 95% in both external and internal verification.

Conclusions
Results from the DERMACLEAR study will add to the real-world evidence on clinical practice by providing a large amount of information on patients with the studied diseases. The NLP system is precise in identifying patients diagnosed with HS, PsO, CU and/or AD, as well as other medical variables, from EHRs, which supports its use in the DERMACLEAR study.
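
As a rough illustration of the precision metric reported above, the sketch below computes precision as confirmed records divided by all NLP-flagged records that were reviewed, then averages it across hospitals. The abstract does not specify how verification samples were drawn or how the average was taken, so the `VerificationSample` structure, the unweighted mean, and the hospital names are assumptions made for illustration, and the counts are placeholders rather than study data.

```python
from dataclasses import dataclass


@dataclass
class VerificationSample:
    """Reviewer verdicts for NLP-flagged records at one hospital (hypothetical structure)."""
    hospital: str
    true_positives: int   # flagged records the reviewer confirmed
    false_positives: int  # flagged records the reviewer rejected


def precision(sample: VerificationSample) -> float:
    """Precision = TP / (TP + FP); raises if no records were reviewed."""
    reviewed = sample.true_positives + sample.false_positives
    if reviewed == 0:
        raise ValueError(f"no reviewed records for {sample.hospital}")
    return sample.true_positives / reviewed


def average_precision_across_hospitals(samples: list[VerificationSample]) -> float:
    """Unweighted mean of per-hospital precision (an assumed reading of 'average')."""
    if not samples:
        raise ValueError("no samples provided")
    return sum(precision(s) for s in samples) / len(samples)


# Placeholder counts for illustration only; these are NOT DERMACLEAR results.
samples = [
    VerificationSample("hospital_a", true_positives=48, false_positives=2),
    VerificationSample("hospital_b", true_positives=19, false_positives=1),
]
print(round(average_precision_across_hospitals(samples), 3))
```

A mean weighted by the number of reviewed records per hospital would be an equally reasonable choice. Precision alone says nothing about patients the system failed to flag, which would require a recall estimate.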