Background
The examination of psychometric properties in instruments measuring abuse of older people (AOP) is a crucial area of study that has received relatively little attention. Poor psychometric properties in AOP measurement instruments can contribute substantially to inconsistencies in prevalence estimates, creating uncertainty about the magnitude of the problem at national, regional, and global levels.

Objectives
This review applied the Consensus‐based Standards for the Selection of Health Measurement Instruments (COSMIN) guideline on the quality of outcome measures. It was designed to identify and review the instruments used to measure AOP, assess the instruments' measurement properties, and identify the definitions of AOP and abuse subtypes measured by these instruments, ensuring the reliability and validity of the findings.

Search Methods
A comprehensive search was conducted up to May 2023 across various online databases, including AgeLine via EBSCOhost, ASSIA via ProQuest, CINAHL via EBSCOhost, EMBASE, LILACS, ProQuest Dissertations & Theses Global, PsycINFO via EBSCOhost, PubMed, SciELO, Scopus, Sociological Abstracts via ProQuest, Chinese National Knowledge Infrastructure (CNKI), Google Scholar and WHO Global Index Medicus. Additionally, relevant studies were identified by searching the grey literature from resources such as Campbell Collaboration, OpenAIRE, and GRAFT.

Selection Criteria
All quantitative, qualitative (addressing face and content validity), and mixed‐method empirical studies published in peer‐reviewed journals or grey literature were included in this review. The included studies were primary studies that (1) evaluated one or more psychometric properties, (2) contained information on instrument development, or (3) examined the content validity of the instruments designed to measure AOP in community or institutional settings.
The selected studies describe at least one psychometric property: reliability, validity, or responsiveness. Study participants represent the population of interest, including males and females aged 60 or older in community or institutional settings.

Data Collection and Analysis
Two reviewers screened the titles, abstracts, and full texts of the selected studies against the preset selection criteria. Two reviewers assessed the quality of each study using the COSMIN Risk of Bias checklist and rated the overall quality of evidence for each psychometric property of the instrument against the updated COSMIN criteria for good measurement properties. Disagreements were resolved through consensus discussion or with assistance from a third reviewer. The overall quality of the measurement instrument was graded using a modified GRADE approach. Data extraction was performed using data extraction forms adapted from the COSMIN Guideline for Systematic Reviews of Outcome Measurement Instruments. The extracted data included the characteristics of included instruments (name, adaptation, language used, translation and country of origin), the characteristics of the tested population, instrument development, and the psychometric properties listed in the COSMIN criteria: content validity, structural validity, internal consistency, cross‐cultural validity/measurement invariance, reliability, measurement error, criterion validity, hypotheses testing for construct validity, responsiveness, and interpretability. All data were synthesised and summarised qualitatively, and no meta‐analysis was performed.

Main Results
We found 15,200 potentially relevant records, of which 382 were screened in full text. A total of 114 studies that met the inclusion criteria were included. Four studies reported on more than one instrument.
The primary reasons for excluding studies were a focus on instruments used solely for screening and diagnostic purposes, conduct in hospital settings, or no evaluation of psychometric properties. Eighty‐seven studies reported on 46 original instruments and 29 studies on 22 modified versions of an original instrument. The majority of the studies were conducted in community settings (97 studies) from the perspective of older adults (90 studies) and were conducted in high‐income countries (69 studies). Ninety‐five studies assessed multiple forms of abuse, ranging from 2 to 13 different subscales; four studies measured overall abuse and neglect among older adults, and 14 studies measured one specific type of abuse. Approximately one‐quarter of the included studies reported on the psychometric properties of the most frequently used measurement instruments: HS‐EAST (assessed in 11 studies), VASS‐12 items (in 9 studies), and CASE (in 9 studies). The instruments with the most evidence available in studies reporting on instrument development and content validity in all domains (relevance, comprehensiveness and comprehensibility) were the DEAQ, OAPAM, *RAAL‐31 items, *ICNH (Norwegian) and OAFEM. For other psychometric properties, instruments with the most evidence available in terms of the number of studies were the HS‐EAST (11 studies across 5 of 9 psychometric properties), CASE (9 studies across 6 of 9 psychometric properties), VASS‐12 items (9 studies across 5 of 9 psychometric properties) and GMS (5 studies across 4 of 9 psychometric properties). Based on the overall rating and quality of evidence, the psychometric properties of the AOP measurement instruments used for prevalence measurement in community and institutional settings were insufficient and of low quality.

Authors' Conclusions
This review aimed to assess the overall rating and quality of evidence for instruments measuring AOP in community and institutional settings.
Our findings revealed various measurement instruments, with ratings and evidence quality predominantly indicating insufficiency and low quality. In summary, the psychometric properties of AOP measurement instruments have not been comprehensively investigated, and existing instruments lack sufficient evidence to support their validity and reliability.