Simple Summary
Animal welfare is a very emotional issue, so it needs to be measured objectively. Because welfare includes different components, such as health status, behaviour and emotional state, different indicators are needed to assess it. The Animal Welfare Indicators (AWIN) assessment protocol for horses proposes a two-level approach: the first level provides a fast overview and the second provides more detail. The aim of this study was to indicate whether this two-level approach produces reliable results, i.e., whether the first level assessment does indeed provide a good overview or whether too many welfare issues remain undetected. To this end, a trained observer performed 112 first and second level assessments directly following each other, and the results were compared based on the agreement between the two levels. In this study, which was based on one observer, the first level did overall provide a good overview of the welfare status. Adaptation of some of the first level indicators might be necessary. Nevertheless, the two-level approach enhances feasibility, and there are indications that it is reliable. The approach might therefore also be of interest for implementation in other welfare assessment schemes.

Abstract
To enhance feasibility, the Animal Welfare Indicators (AWIN) assessment protocol for horses consists of two levels: the first is a visual inspection of a sample of horses performed from a distance, the second a close-up inspection of all horses. The aim was to analyse whether information would be lost if only the first level were performed. In this study, 112 first and 112 second level assessments carried out on a subsequent day by one observer were compared by calculating the Spearman's Rank Correlation Coefficient (RS), Intraclass Correlation Coefficients (ICC), Smallest Detectable Changes (SDC) and Limits of Agreement (LoA).
Most indicators demonstrated sufficient reliability between the two levels. Exceptions were the Horse Grimace Scale, the Avoidance Distance Test and the Voluntary Human Approach Test (e.g., Voluntary Human Approach Test: RS: 0.38, ICC: 0.38, SDC: 0.21, LoA: −0.25 to 0.17), which could, however, also be interpreted as a lack of test-retest reliability. Further disagreement was found for the indicator consistency of manure (RS: 0.31, ICC: 0.38, SDC: 0.36, LoA: −0.38 to 0.36). For these indicators, an adaptation of the first level would be beneficial. Overall, in this study, the division into two levels was reliable and might therefore have the potential to enhance feasibility in other welfare assessment schemes.
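
For readers who want to see how the four agreement statistics relate to one another, the following is a minimal Python sketch for paired first and second level scores of a single indicator. It is not the authors' analysis code. Several choices are assumptions, because the text above does not specify them: the ICC is computed as a two-way, absolute-agreement, single-measurement ICC; the SDC is taken as 1.96 × √2 × SEM with SEM = SD × √(1 − ICC); and the LoA are the Bland-Altman limits (mean difference ± 1.96 × SD of the differences). All function names are hypothetical.

```python
import math
from statistics import mean, stdev


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            result[order[pos]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def icc_agreement(x, y):
    """Assumed ICC form: two-way, absolute agreement, single measurement."""
    n, k = len(x), 2
    grand = mean(x + y)
    row_means = [(a + b) / 2 for a, b in zip(x, y)]
    ss_rows = k * sum((r - grand) ** 2 for r in row_means)
    ss_cols = n * sum((c - grand) ** 2 for c in (mean(x), mean(y)))
    ss_total = sum((v - grand) ** 2 for v in x + y)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)                  # between-subject mean square
    msc = ss_cols / (k - 1)                  # between-level mean square
    mse = ss_err / ((n - 1) * (k - 1))       # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def smallest_detectable_change(x, y, icc):
    """Assumed SDC form: 1.96 * sqrt(2) * SEM, SEM = SD * sqrt(1 - ICC)."""
    sem = stdev(x + y) * math.sqrt(max(0.0, 1.0 - icc))
    return 1.96 * math.sqrt(2) * sem


def limits_of_agreement(x, y):
    """Bland-Altman limits: mean difference +/- 1.96 * SD of differences."""
    d = [a - b for a, b in zip(x, y)]
    return mean(d) - 1.96 * stdev(d), mean(d) + 1.96 * stdev(d)


if __name__ == "__main__":
    # Made-up illustrative scores (NOT study data): one value per assessment.
    first_level = [0.10, 0.40, 0.35, 0.80, 0.55]
    second_level = [0.15, 0.30, 0.45, 0.75, 0.60]
    icc = icc_agreement(first_level, second_level)
    print("RS :", spearman(first_level, second_level))
    print("ICC:", icc)
    print("SDC:", smallest_detectable_change(first_level, second_level, icc))
    print("LoA:", limits_of_agreement(first_level, second_level))
```

Both inputs are plain Python lists of equal length, one entry per paired assessment. Because the assumed ICC form measures absolute agreement, a systematic shift between the two levels lowers the ICC even when the rank order (and hence RS) is unchanged, which is one reason to report the two statistics side by side.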