Objective: The Trans Woman Voice Questionnaire (TWVQ) is commonly used to quantify self-perceptions of voice among trans women seeking gender-affirming voice care, but TWVQ scores remain difficult to interpret. The objective of this study was to use item-response theory (IRT) to evaluate the relationship between TWVQ items and persons on a common scale and to identify improvements that increase the meaningfulness of TWVQ scores.

Methods: A retrospective review of TWVQ scores from trans women patients seen between 2018 and 2020 was performed. Rasch-family models were used to generate item-person maps positioning respondent locations and item difficulty estimates on a logit scale; the logit values were then converted into scaled scores using linear transformations.

Results: TWVQ responses from 86 patients were analyzed. Initial item-person maps demonstrated that the middle response categories ("sometimes" and "often") performed inconsistently across items (poor threshold banding); interpretability improved when these ratings were scored as one category. The models were rerun using the revised scoring, which retained high reliability (0.93) and supported a unidimensional construct. Updated item-person maps revealed four scaled score zones (≤54, >54 to ≤101, >101 to ≤140, and >140), each corresponding to an increasing pattern of item thresholds (probability of selecting one response category vs. others). These ranges can be interpreted as minimal, low, moderate, and high, respectively.

Conclusions: Empirical data from Rasch analysis support new interval scoring for the TWVQ that advances the clinical and research utility of the instrument and lays the foundation for future improvements in clinical care and outcomes assessment.
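
The four scaled score zones reported in the Results can be expressed as a simple lookup. The following is a minimal sketch, not the authors' implementation: the function name `twvq_zone` is hypothetical, and the input is assumed to be a Rasch-derived scaled score under the revised scoring (not a raw TWVQ total). The logit-to-scaled-score transformation is not reproduced here because its parameters are not given in the abstract.

```python
from bisect import bisect_left

# Inclusive upper bounds of the first three zones, as reported in the Results.
ZONE_UPPER_BOUNDS = [54, 101, 140]
ZONE_LABELS = ["minimal", "low", "moderate", "high"]


def twvq_zone(scaled_score: float) -> str:
    """Return the zone label for a TWVQ scaled score (hypothetical helper)."""
    # bisect_left returns the index of the first bound >= scaled_score,
    # so a score exactly on a bound (e.g. 54) stays in the lower zone.
    return ZONE_LABELS[bisect_left(ZONE_UPPER_BOUNDS, scaled_score)]


if __name__ == "__main__":
    for score in (40, 54, 55, 101, 120, 140, 150):
        print(score, twvq_zone(score))
```

Using inclusive upper bounds mirrors the "≤" boundaries in the reported ranges, so boundary scores are assigned to the lower of the two adjacent zones.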