Currently, elective clinical target volume (CTV-N) definition for head and neck squamous cell carcinoma (HNSCC) is mostly based on the prevalence of nodal involvement for a given tumor location. In this work, we propose a probabilistic model for lymphatic metastatic spread that can quantify the risk of microscopic involvement in lymph node levels (LNL) given the location of macroscopic metastases and T-category. This may allow for further personalized CTV-N definition based on an individual patient’s state of disease. We model the patient's state of metastatic lymphatic progression as a collection of hidden binary random variables that indicate the involvement of LNLs. In addition, each LNL is associated with observed binary random variables that indicate whether macroscopic metastases are detected. A hidden Markov model (HMM) is used to compute the probabilities of transitions between states over time. The underlying graph of the HMM represents the anatomy of the lymphatic drainage system. Learning of the transition probabilities is done via Markov chain Monte Carlo sampling and is based on a dataset of HNSCC patients in whom involvement of individual LNLs was reported. The model is demonstrated for ipsilateral metastatic spread in oropharyngeal HNSCC patients. We demonstrate the model's capability to quantify the risk of microscopic involvement in levels III and IV, depending on whether macroscopic metastases are observed in the upstream levels II and III, and depending on T-category. In conclusion, the statistical model of lymphatic progression may inform future, more personalized, guidelines on which LNL to include in the elective CTV. However, larger multi-institutional datasets for model parameter learning are required for that.