Catastrophe models estimate risk at the intersection of hazard, exposure, and vulnerability. Each of these areas requires diverse sources of data, which are often incomplete, inconsistent, or missing altogether. Poor data quality is a source of epistemic uncertainty that affects the vulnerability models and propagates into the output of the catastrophe models. This article identifies the sources of epistemic uncertainty in the data and elaborates on strategies to reduce this uncertainty, in particular through identification, augmentation, and integration of the different types of data. The challenges are illustrated through the Florida Public Hurricane Loss Model (FPHLM), which estimates insured losses to residential buildings caused by hurricane events in Florida. To define the input exposure, and for model development, calibration, and validation purposes, the FPHLM teams accessed three main sources of data: county tax appraiser databases, National Flood Insurance Program (NFIP) portfolios, and wind insurance portfolios. The data from these sources were reformatted and processed, and each insurance database was cross-referenced at the county level with the tax appraiser databases. The FPHLM hazard teams assigned an estimate of the natural hazard intensity measure to each insurance claim. These efforts produced an integrated and more complete set of building descriptors for each policy in the NFIP and wind portfolios. The article describes the impact of these uncertainty reductions on the development and validation of the vulnerability models, and suggests avenues for data improvement. Lessons learned should be of interest to professionals involved in disaster risk assessment and management.
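
To make the county-level cross-referencing step concrete, the sketch below shows one plausible way such an integration could be performed in pandas: insurance policies are matched to tax appraiser parcels by a normalized address key, and each matched policy is augmented with building descriptors from the appraiser record. The file names, column names, and matching rule are hypothetical illustrations, not the FPHLM's actual schemas or pipeline.

```python
# Minimal sketch of policy/appraiser cross-referencing, assuming
# hypothetical inputs: a wind insurance portfolio and a county tax
# appraiser extract, both (imperfectly) keyed by street address.
import pandas as pd


def normalize_address(addr: str) -> str:
    """Crude address normalization so the two sources can be compared."""
    return " ".join(str(addr).upper().replace(".", "").split())


# Hypothetical files and columns, for illustration only.
policies = pd.read_csv("wind_portfolio_county.csv")   # policy_id, address, ...
appraiser = pd.read_csv("tax_appraiser_county.csv")   # parcel_id, address,
                                                      # year_built, roof_shape,
                                                      # stories, ...

policies["addr_key"] = policies["address"].map(normalize_address)
appraiser["addr_key"] = appraiser["address"].map(normalize_address)

# Keep one appraiser record per address key so the join is well defined.
appraiser = appraiser.drop_duplicates(subset="addr_key")

# County-level join: augment each policy with building descriptors
# (year built, roof shape, number of stories) from the appraiser file.
merged = policies.merge(
    appraiser[["addr_key", "year_built", "roof_shape", "stories"]],
    on="addr_key",
    how="left",
    validate="many_to_one",  # several policies may map to one parcel
)

# Unmatched policies retain missing descriptors, i.e., residual
# epistemic uncertainty that downstream models must account for.
match_rate = merged["year_built"].notna().mean()
print(f"Policies enriched with appraiser descriptors: {match_rate:.1%}")
```

In practice, exact address matching of this kind is fragile; a production pipeline would likely need geocoding, parcel identifiers, or fuzzy matching, and any residual mismatches would remain a source of the epistemic uncertainty the article discusses.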