Dementia is a neurodegenerative disease that causes individuals to experience difficulties in their daily lives. These difficulties often cause considerable stress, frustration and upset. However, caregivers can find it hard to identify when a difficulty is beginning or occurring until it has led to problematic behavior or undeniable difficulty for the person with dementia. A system for identifying the onset of dementia-related difficulties would therefore be helpful in the management of dementia. Previous work highlighted wearable computing-based systems for analyzing physiological data as particularly promising. In this paper, we outline the methodology used to perform a systematic search for a relevant dataset; no such dataset was found. We therefore propose a methodology for collecting such a dataset and making it publicly available, and for using it to train classification models that can predict difficulties from the physiological data. Several solutions to overcome the lack of available data are identified and discussed: data collection experiments to collect novel datasets; anonymization and pseudonymization to remove all identifiable data from the dataset; and synthetic data generation to produce a larger, anonymous training dataset. We conclude that future solutions should ideally employ a combination of all the identified methods. Future work should focus on conducting the proposed experiment and sharing the collected data in the manner proposed, with data ideally collected from as many people as possible, with as many different types of dementia as possible.