A number of methods in tracking and recognition have successfully exploited low-dimensional representations of object appearance learned from a set of examples. In all of these approaches, the underlying low-dimensional manifold is constructed by collecting many different instances of the object's appearance and then approximating the appearance space with statistical data analysis tools. This requires a very large number of examples, and the accuracy of the method depends on which examples were chosen. In this chapter, we show that low-dimensional manifolds describing object appearance can be estimated using a combination of analytically derived geometrical models and statistical data analysis. Specifically, we derive a quadrilinear space of object appearance that is able to represent the effects of illumination, motion, identity, and shape. We then show how efficient tracking algorithms, such as inverse compositional estimation, can be adapted to the geometry of this manifold. The proposed method significantly reduces the amount of data that must be collected to learn the manifolds and makes the learned manifold less dependent on the particular examples used. Building on this novel manifold, we present a framework for face recognition from video sequences that is robust to large changes in facial pose and lighting conditions, and that can handle situations where the pose and lighting conditions in the training and testing data are completely disjoint. We present detailed performance analysis results and recognition scores on a large video dataset.
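To make the idea of a quadrilinear appearance space concrete, the sketch below shows one common way such a model can be realized: a core tensor with one mode per factor (illumination, motion, identity, shape) plus a pixel mode, contracted against one coefficient vector per factor to synthesize an appearance vector. The tensor sizes, variable names, and the use of a dense core are illustrative assumptions, not the chapter's actual construction.

```python
import numpy as np

# Hypothetical quadrilinear appearance model (illustrative only).
# A core tensor of shape (n_illum, n_motion, n_id, n_shape, n_pix) is
# contracted with one coefficient vector per factor; the result is
# linear in each factor separately, i.e., multilinear overall.
rng = np.random.default_rng(0)
n_illum, n_motion, n_id, n_shape, n_pix = 3, 4, 5, 2, 16
core = rng.standard_normal((n_illum, n_motion, n_id, n_shape, n_pix))

def synthesize(light, motion, identity, shape):
    """Mode-wise contraction of the core tensor with the four
    factor coefficient vectors; returns a pixel-space appearance."""
    return np.einsum("abcde,a,b,c,d->e", core, light, motion, identity, shape)

# Example coefficients for each factor.
light = rng.standard_normal(n_illum)
motion = rng.standard_normal(n_motion)
identity = rng.standard_normal(n_id)
shape = rng.standard_normal(n_shape)

appearance = synthesize(light, motion, identity, shape)
# Multilinearity: scaling one factor scales the output by the same amount.
assert np.allclose(synthesize(2 * light, motion, identity, shape), 2 * appearance)
```

Because the model is linear in each factor when the others are held fixed, tracking one factor (e.g., motion) while freezing the rest reduces to a linear estimation problem, which is what makes efficient schemes along the lines of inverse compositional estimation attractive on this kind of manifold.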