Endoscopic autofluorescence lifetime imaging is a promising technique for quantitative, non-invasive diagnosis of abnormal tissue. However, motion artifacts caused by vibration inside the body, in the direction perpendicular to the tissue surface, make clinical diagnosis difficult. This paper therefore proposes a robust autofluorescence lifetime sensing technique that uses a lens tracking system based on laser beam spot analysis. The optical setup can be easily mounted on the head of an endoscope. Variation in the distance between the optical system and the target surface is tracked through the change in the size of the laser beam spot captured by a camera, and the lens actuator is feedback-controlled to suppress motion artifacts. Experimental results show that the standard deviation of the measured fluorescence lifetime is markedly reduced when the lens tracking system is used. Furthermore, the validity of the proposed method is confirmed experimentally using a bio-mimicking phantom that replicates the shape, optical parameters, and chemical component distribution of cancerous tissue.
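The tracking idea described above (measure the laser spot size in a camera frame, compare it with a reference, and drive the lens actuator to cancel the difference) can be illustrated with a minimal sketch. This is not the implementation used in this work: the function names `spot_diameter` and `lens_correction`, the half-maximum threshold, the proportional control law, and the gain and sign convention are all assumptions chosen for illustration.

```python
import numpy as np


def spot_diameter(image, threshold_ratio=0.5):
    """Estimate the laser spot diameter in pixels.

    Assumption: the spot is the set of pixels at or above
    threshold_ratio * peak intensity, and its diameter is that of a
    circle with the same area.
    """
    img = np.asarray(image, dtype=float)
    peak = img.max()
    if peak <= 0:
        raise ValueError("image contains no spot")
    area = np.count_nonzero(img >= threshold_ratio * peak)
    return 2.0 * np.sqrt(area / np.pi)


def lens_correction(measured_diameter, reference_diameter, gain=0.1):
    """Return a hypothetical actuator displacement command.

    Assumption: a simple proportional law in which a spot larger than
    the reference calls for a negative displacement. The real sign and
    gain depend on the optics and on which side of focus the system
    operates.
    """
    return -gain * (measured_diameter - reference_diameter)


if __name__ == "__main__":
    # Synthetic Gaussian spot standing in for a camera frame.
    yy, xx = np.mgrid[0:64, 0:64]
    frame = np.exp(-((xx - 32) ** 2 + (yy - 32) ** 2) / (2 * 6.0 ** 2))
    d = spot_diameter(frame)
    print("spot diameter [px]:", d)
    print("actuator command:", lens_correction(d, reference_diameter=11.8))
```

In a real loop, the two functions would be called once per camera frame, with the reference diameter taken at the desired working distance. A practical controller would likely also need filtering and limits on the actuator command, which this sketch omits.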