In this paper, a recently proposed approach to multizone sound field synthesis, referred to as joint pressure and velocity matching (JPVM), is investigated analytically using a spherical harmonics representation of the sound field. The approach is motivated by the Kirchhoff-Helmholtz integral equation and aims to control the sound field inside the local listening zones by evoking the sound pressure and particle velocity on the contours surrounding the zones. Based on the findings of the modal analysis, an improved version of JPVM is proposed, which provides both better performance and lower complexity. In particular, it is shown analytically that optimizing the tangential component of the particle velocity vector, as is done in the original JPVM approach, is highly susceptible to errors; this component is therefore no longer optimized in the improved version. Furthermore, the analysis provides fundamental insights into how the spherical harmonics used to describe three-dimensional sound fields translate into two-dimensional basis functions as observed on the contours surrounding the zones. Simulations verify that discarding the tangential component of the particle velocity vector leads to improved performance. Finally, the impact of sensor noise on the reproduction performance is assessed.