Background
Few datasets have been established that capture the full breadth of COVID-19 patient interactions with a health system. Our first objective was to create a COVID-19 dataset that linked primary care data to COVID-19 testing, hospitalisation, and mortality data at a patient level. Our second objective was to provide a descriptive analysis of COVID-19 outcomes among the general population and describe the characteristics of the affected individuals.

Methods
We mapped patient-level data from Catalonia, Spain, to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). More than 3,000 data quality checks were performed to assess the readiness of the database for research. Subsequently, to summarise the COVID-19 population captured, we established a general population cohort as of 1st March 2020 and identified outpatient COVID-19 diagnoses or positive test results for SARS-CoV-2, hospitalisations with COVID-19, and COVID-19 deaths during follow-up, which ran until 30th June 2021.

Findings
The data were mapped to the OMOP CDM and high data quality was observed. The mapped database was used to identify a total of 5,870,274 individuals, who were included in the general population cohort as of 1st March 2020. Over follow-up, 604,472 had either an outpatient COVID-19 diagnosis or positive test result, 58,991 had a hospitalisation with COVID-19, 5,642 had an ICU admission with COVID-19, and 11,233 had a COVID-19 death. People who were hospitalised or died were more commonly older, male, and with more comorbidities. Those admitted to ICU with COVID-19 were generally younger and more often male than those hospitalised in general and those who died.

Interpretation
We have established a comprehensive dataset that captures COVID-19 diagnoses, test results, hospitalisations, and deaths in Catalonia, Spain. Extensive data checks have shown the data to be fit for use.
From this dataset, a general population cohort of 5.9 million individuals was identified and their COVID-19 outcomes over time were described.

Funding
Generalitat de Catalunya and European Health Data and Evidence Network (EHDEN).