Background To compare the performance of eight frailty instruments in identifying relevant adverse outcomes in older people across different settings over a 12-month follow-up. Methods Observational, longitudinal, prospective study of people aged 75+ years enrolled in different settings (acute geriatric wards, geriatric clinic, primary care clinics, and nursing homes) across five European cities. Frailty was assessed using the following instruments: Frailty Phenotype, SHARE-FI, 5-item Frailty Trait Scale (FTS-5), 3-item FTS (FTS-3), FRAIL scale, 35-item Frailty Index (FI-35), Gérontopôle Frailty Screening Tool, and Clinical Frailty Scale. The adverse outcomes ascertained at follow-up were falls, hospitalization, increased limitation in basic (BADL) and instrumental (IADL) activities of daily living, and mortality. For each instrument, we calculated sensitivity, specificity, and the capacity to predict adverse outcomes in logistic regressions over and above age, gender, and multimorbidity. Results A total of 996 individuals were followed (mean age 82.2 years, SD 5.5; 61.3% female). In geriatric wards, the FI-35 (69.1%) and the FTS-5 (67.9%) showed good sensitivity to predict death and good specificity to predict BADL worsening (70.3% and 69.8%, respectively). The FI-35 also showed good sensitivity to predict BADL worsening (74.6%). In nursing homes, the FI-35 and the FTSs predicted mortality and BADL worsening with a sensitivity > 73.9%. In the geriatric clinic, the FI-35, the FTS-5, and the FRAIL scale obtained specificities > 85% to predict BADL worsening. No instrument achieved sufficiently high sensitivity or specificity in primary care. All the instruments predicted the risk of all the outcomes in the whole sample after adjusting for age, gender, and multimorbidity. The associations of these instruments that remained significant by setting were for BADL worsening in geriatric wards [