Neurodegenerative diseases pose a formidable challenge to medical research, demanding a nuanced understanding of their progressive nature. In this regard, latent generative models can effectively be used in a data-driven modeling of different dimensions of neurodegeneration, framed within the context of the manifold hypothesis. This paper proposes a joint framework for a multi-modal, common latent generative model to address the need for a more comprehensive understanding of the neurodegenerative landscape in the context of Parkinson’s disease (PD). The proposed architecture uses coupled variational autoencoders (VAEs) to joint model a common latent space to both neuroimaging and clinical data from the Parkinson’s Progression Markers Initiative (PPMI). Alternative loss functions, different normalization procedures, and the interpretability and explainability of latent generative models are addressed, leading to a model that was able to predict clinical symptomatology in the test set, as measured by the unified Parkinson’s disease rating scale (UPDRS), with R2 up to 0.86 for same-modality and 0.441 cross-modality (using solely neuroimaging). The findings provide a foundation for further advancements in the field of clinical research and practice, with potential applications in decision-making processes for PD. The study also highlights the limitations and capabilities of the proposed model, emphasizing its direct interpretability and potential impact on understanding and interpreting neuroimaging patterns associated with PD symptomatology.