Background. Assessment methods for atopic dermatitis (AD) are not standardized, which makes therapeutic studies difficult to interpret. Aims. To obtain a consensus on assessment methods in AD and to use a statistical method to develop a composite severity index. Methods. Consensus definitions were given for the items used in the scoring system (extent, intensity, subjective symptoms) and illustrated for the intensity items. A group of 10 trained clinicians reviewed slides to address within- and between-observer variability, and the data were evaluated with a two-way analysis of variance. Two variants of an assessment system were compared in 88 patients at 5 different institutions, and the data were analyzed using principal-component analysis. Results. For the 5 intensity items studied (erythema, edema/papulation, oozing/crusts, excoriations, lichenification), within- and between-observer agreement was good overall, except for edema/papulation, which was difficult to assess on slides. In the series of 88 patients, principal-component analysis extracted two unrelated components: the first, accounting for 33% of total variance, was interpreted as a ‘severity’ component; the second, accounting for 18% of variance, was interpreted as a ‘profile’ component distinguishing patients with mostly erythema and subjective symptoms from those with mostly lichenification and dryness and lower subjective symptoms. Of the two evaluation systems used, the one using the rule of nine to assess extent was found more workable than the one using a distribution × intensity product. A scoring index (SCORAD) combining extent, severity and subjective symptoms was mathematically derived from the first system and showed a normal distribution in the population studied. Conclusion. The final choice of evaluation system was based mostly on simplicity and ease of routine use in outpatient clinics.
Based on a mathematical assessment of the weights of the items used in the assessment of AD, extent and subjective symptoms each account for around 20% of the total score, and intensity items for 60%. The resulting composite index, SCORAD, needs to be further tested in clinical trials.
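
The weighting described above can be illustrated with a short sketch. The abstract itself does not state the formula; the code assumes the commonly cited form SCORAD = A/5 + 7B/2 + C, where A is the extent in percent of body surface (0 to 100), B is the sum of six intensity items each graded 0 to 3 (0 to 18), and C is the sum of two subjective symptoms each rated 0 to 10 (0 to 20). The function and parameter names are hypothetical and chosen only for illustration.

```python
# Sketch only: assumes the commonly cited form SCORAD = A/5 + 7B/2 + C.
# The ranges below (A: 0-100, B: 0-18, C: 0-20) are assumptions, not stated in the abstract.

MAX_EXTENT = 100      # A: percent of body surface involved
MAX_INTENSITY = 18    # B: six intensity items, each graded 0-3
MAX_SUBJECTIVE = 20   # C: two subjective symptoms, each rated 0-10


def scorad(extent_pct, intensity_sum, subjective_sum):
    """Return the composite index for one patient under the assumed formula."""
    if not 0 <= extent_pct <= MAX_EXTENT:
        raise ValueError("extent_pct must be between 0 and 100")
    if not 0 <= intensity_sum <= MAX_INTENSITY:
        raise ValueError("intensity_sum must be between 0 and 18")
    if not 0 <= subjective_sum <= MAX_SUBJECTIVE:
        raise ValueError("subjective_sum must be between 0 and 20")
    return extent_pct / 5 + 7 * intensity_sum / 2 + subjective_sum


def component_weights():
    """Return each component's share of the maximum attainable score."""
    max_parts = {
        "extent": MAX_EXTENT / 5,            # 20 points
        "intensity": 7 * MAX_INTENSITY / 2,  # 63 points
        "subjective": float(MAX_SUBJECTIVE), # 20 points
    }
    total = sum(max_parts.values())          # 103 points
    return {name: value / total for name, value in max_parts.items()}


if __name__ == "__main__":
    print(scorad(30, 9, 10))  # 6 + 31.5 + 10 = 47.5
    for name, share in component_weights().items():
        print(f"{name}: {share:.1%}")
```

Under these assumptions the maximum score is 103, with extent and subjective symptoms each contributing 20 points (about 19%) and intensity contributing 63 points (about 61%), which is consistent with the approximate 20/60/20 split reported above.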