Background
The Swallowing Quality-of-Life Questionnaire (SWAL-QoL) is considered the gold standard for assessing health-related quality of life (HRQoL) in oropharyngeal dysphagia. The Dutch translation (DSWAL-QoL) and its adjusted version (aDSWAL-QoL) have been validated using classical test theory (CTT). However, these scales have not been tested against the Rasch measurement model, which is required to establish the structural validity and objectivity of the total scale and subscale scores. Thus, the purpose of this study was to examine the psychometric properties of these scales using item analysis according to the Rasch model.

Methods
Item analysis with the Rasch model was performed using RUMM2030 software with previously collected data from a validation study of 108 patients. The assessment included evaluations of overall model fit, reliability, unidimensionality, threshold ordering, individual item and person fit, differential item functioning (DIF), local item dependency (LID) and targeting.

Results
The analysis could not establish the psychometric properties of either scale or of their subscales because they did not fit the Rasch model, and multidimensionality, disordered thresholds, DIF, and/or LID were found. The reliability and power of fit were high for the total scales (PSI = 0.93) but low for most of the subscales (PSI < 0.70). The targeting of persons and items was suboptimal. The main source of misfit was disordered thresholds for both the total scales and subscales. Based on the results of the analysis, adjustments to improve the scales were implemented as follows: disordered thresholds were rescaled, misfitting items were removed and items were split for DIF. However, the multidimensionality and LID could not be resolved. The reliability and power of fit remained low for most of the subscales.

Conclusions
This study represents the first analysis of the DSWAL-QoL and aDSWAL-QoL with the Rasch model.
Conclusions about dysphagia-related HRQoL that rely on the DSWAL-QoL and aDSWAL-QoL total and subscale scores should be drawn with caution until the structural validity and objectivity of both scales have been established. A larger and well-targeted sample is recommended to derive definitive conclusions about the items and scales. Solutions for the psychometric weaknesses suggested by the model and practical implications are discussed.

Electronic supplementary material
The online version of this article (doi:10.1186/s12955-017-0639-3) contains supplementary material, which is available to authorized users.
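
To make the threshold-ordering and reliability checks named in the Methods more concrete, the following is a minimal Python sketch, not the RUMM2030 procedure used in the study. It assumes the standard polytomous Rasch (partial credit) formulation for category probabilities and the usual definition of the person separation index (PSI) as the share of variance in person estimates not attributable to measurement error. The function names and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def category_probabilities(theta, thresholds):
    """Category probabilities for one polytomous Rasch item.

    theta: person location (logits); thresholds: item thresholds tau_1..tau_m.
    Returns probabilities for response categories 0..m.
    """
    # Exponent for category x is the sum of (theta - tau_k) for k = 1..x;
    # category 0 has an empty sum, i.e. 0.
    exponents = [0.0]
    for tau in thresholds:
        exponents.append(exponents[-1] + (theta - tau))
    # Subtract the maximum before exponentiating for numerical stability.
    largest = max(exponents)
    weights = [math.exp(e - largest) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]


def thresholds_ordered(thresholds):
    """True if thresholds increase strictly, i.e. are not disordered."""
    return all(a < b for a, b in zip(thresholds, thresholds[1:]))


def person_separation_index(estimates, standard_errors):
    """PSI = (variance of person estimates - mean squared SE) / variance."""
    n = len(estimates)
    mean = sum(estimates) / n
    variance = sum((x - mean) ** 2 for x in estimates) / (n - 1)
    mean_squared_error = sum(se ** 2 for se in standard_errors) / n
    return (variance - mean_squared_error) / variance


# Hypothetical thresholds for two five-category items (illustration only).
ordered_item = [-1.5, -0.5, 0.5, 1.5]
disordered_item = [-1.5, 0.5, -0.5, 1.5]

print(thresholds_ordered(ordered_item))
print(thresholds_ordered(disordered_item))
print(category_probabilities(0.0, ordered_item))
```

In this sketch an item is flagged as disordered when any threshold is not larger than the one before it. Collapsing adjacent response categories, as in the rescaling step described in the Results, is one common way of addressing that pattern.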