Introduction. Non-Rigid Shape-from-Motion (NRSfM) addresses the general problem of reconstructing deforming objects in 3D from multiple monocular images. Most previous NRSfM methods learn a low-dimensional shape basis from a set of contiguous images. NRSfM is closely related to the Shape-from-Template (SfT) problem, where the shape is computed from a known 3D template and a single input image taken after deformation. Most SfT methods assume isometric deformations [1,2], so applying NRSfM to isometrically constrained deformations is a natural way forward. However, the theory behind isometric NRSfM remains a gap in the literature, and many existing isometric NRSfM solutions also have practical problems.

In addition, most recent NRSfM works are based on orthographic camera models. [3] uses the orthographic camera to recover the shape's normal locally; it suffers from local two-fold ambiguities and degrades significantly for shorter focal lengths. [5] recently solved the same problem for both the orthographic and the perspective camera. [4] specifically addresses the case of piecewise planar surfaces; it uses the perspective camera but still has unresolved patch-wise two-fold ambiguities induced by the processing of image pairs.

In this paper, we present a general framework to solve NRSfM with the perspective camera for isometric deformations. Isometry allows solving for complex shape deformations from a sparse set of images. First, we formulate isometric NRSfM as a system of first-order Partial Differential Equations (PDE) involving the shape's depth and normal field and an unknown template. Second, we show that the system cannot be locally resolved as such. Third, we introduce the concept of infinitesimal planarity and show that it makes the system locally solvable for three or more views. Finally, we derive an analytical solution which involves only convex, linear least-squares optimization and which outperforms existing work on challenging datasets.
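
As a rough illustration of the isometry assumption used above, the following sketch bends a flat sheet onto a cylinder and checks numerically that the first fundamental form (the surface metric) is unchanged, while the perspective image of a surface point moves. It is not the method of this paper. The functions `flat_sheet`, `bent_sheet`, `metric_tensor` and `project`, as well as the sheet depth, the cylinder radius and the unit focal length, are all hypothetical choices made for illustration.

```python
import math


def flat_sheet(u, v):
    """Hypothetical template: a flat sheet at depth z = 2 in front of the camera."""
    return (u, v, 2.0)


def bent_sheet(u, v, radius=1.5):
    """The same sheet rolled onto a cylinder about the y-axis.

    Bending without stretching is an isometric deformation.
    """
    return (
        radius * math.sin(u / radius),
        v,
        2.0 + radius * (1.0 - math.cos(u / radius)),
    )


def metric_tensor(surface, u, v, eps=1e-6):
    """First fundamental form (E, F, G) estimated by central finite differences."""
    du = [(a - b) / (2 * eps) for a, b in zip(surface(u + eps, v), surface(u - eps, v))]
    dv = [(a - b) / (2 * eps) for a, b in zip(surface(u, v + eps), surface(u, v - eps))]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    return dot(du, du), dot(du, dv), dot(dv, dv)


def project(point):
    """Perspective projection with unit focal length (normalized image coordinates)."""
    x, y, z = point
    return (x / z, y / z)


if __name__ == "__main__":
    u, v = 0.3, -0.2
    print("metric, flat sheet:", metric_tensor(flat_sheet, u, v))
    print("metric, bent sheet:", metric_tensor(bent_sheet, u, v))
    print("image,  flat sheet:", project(flat_sheet(u, v)))
    print("image,  bent sheet:", project(bent_sheet(u, v)))
```

Both metrics come out as the identity up to finite-difference error, whereas the two projections differ. The images therefore change under the deformation although the intrinsic geometry of the surface does not, which is the kind of invariant an isometric formulation can exploit.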