We introduce a mesoscopic model of pedestrian group behaviour in which the internal group dynamics is modelled by a microscopic potential, while the effect of the environment is modelled by a harmonic term whose intensity depends on a macroscopic quantity, the crowd density. We show that, to properly describe the behaviour of 2-person groups, the harmonic term must be directed orthogonally to the walking direction and its intensity must grow linearly with density. We also show that, once calibrated on 2-person groups, the model correctly predicts the velocity of 3-person groups and their spatial extension in the walking direction, whereas describing their abreast extension as well requires a modification of the microscopic group dynamics. The model also correctly predicts a bifurcation phenomenon: a stable 3-person Λ configuration emerges at high densities, while only the V formation is stable at low densities.
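
As an illustrative sketch only, the structure described above could be written as follows for a 2-person group. The symbols are assumptions introduced here, not notation taken from the model itself: $\mathbf{r}$ is the relative position of the two pedestrians, $r_\perp$ its component orthogonal to the walking direction, $\mathbf{e}_\perp$ the corresponding unit vector, $\rho$ the crowd density, $U_{\mathrm{group}}$ the microscopic group potential, and $k_0$, $k_1$ hypothetical calibration constants (with $k_0$ possibly zero).

```latex
% Assumed form: microscopic group potential plus a density-dependent harmonic term
U(\mathbf{r};\rho) = U_{\mathrm{group}}(\mathbf{r}) + \tfrac{1}{2}\,k(\rho)\,r_\perp^{2},
\qquad
k(\rho) = k_0 + k_1\,\rho, \quad k_1 > 0 .

% The environmental force derived from the harmonic term acts only
% orthogonally to the walking direction
\mathbf{F}_{\mathrm{env}} = -\,k(\rho)\,r_\perp\,\mathbf{e}_\perp .
```

Under these assumptions, increasing $\rho$ stiffens the harmonic term and compresses the group in the abreast direction, while the extension along the walking direction is left to $U_{\mathrm{group}}$ alone.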