Accurate assessment of memory ability for persons on the continuum of Alzheimer's disease (AD) is vital for early diagnosis, monitoring of disease progression and evaluation of new therapies. However, currently available neuropsychological tests suffer from a lack of standardization and metrological quality assurance. Improved metrics of memory can be created by carefully combining selected items from legacy short-term memory tests, while retaining validity and reducing patient burden. In psychometrics, this empirical linking of items is known as "crosswalking". The aim of this paper is to link items from different types of memory tests. Memory test data were collected in the European EMPIR NeuroMET and SmartAge studies from participants recruited at Charité Hospital (healthy controls n = 92; subjective cognitive decline n = 160; mild cognitive impairment n = 50; AD n = 58; age range 55–87). A bank of items (n = 57) was developed from legacy short-term memory tests (i.e., the Corsi Block Test, the Digit Span Test, Rey's Auditory Verbal Learning Test, the Word Learning Lists from the CERAD test battery and the Mini-Mental State Examination; MMSE). The resulting NeuroMET Memory Metric (NMM) is a composite metric comprising these 57 dichotomous (right/wrong) items. We previously reported on a preliminary item bank to assess memory based on immediate recall, and have now demonstrated direct comparability of measurements generated from the different legacy tests. We created crosswalks between the NMM and each legacy test, and between the NMM and the full MMSE, using Rasch analysis (RUMM2030), and produced two conversion tables. Measurement uncertainties for estimates of person memory ability with the NMM were smaller across the full span than those of every individual legacy test, which demonstrates the added value of the NMM. However, comparison with one of the legacy tests (the MMSE) showed higher measurement uncertainties of the NMM for people with very low memory ability (raw score ≤ 19). The conversion tables developed through the crosswalks in this paper provide clinicians and researchers with a practical tool to: (i) compensate for the ordinality of raw scores, (ii) ensure traceability to make reliable and valid comparisons when measuring person ability, and (iii) enable comparability between test results from different legacy tests.
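As background for the crosswalks, the measurement model underlying the linking is, given the 57 right/wrong items described, assumed here to be the standard dichotomous Rasch model (the specific RUMM2030 parameterization is not stated in the abstract). In this model the probability that person $n$ with memory ability $\theta_n$ responds correctly to item $i$ with difficulty $\delta_i$ is

\[
P(X_{ni} = 1 \mid \theta_n, \delta_i) = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)}.
\]

Because all items are calibrated on the same logit scale for $\theta$, raw scores from any calibrated subset of items (e.g., a single legacy test) can be mapped onto that common scale, which is what the conversion tables operationalize.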