Hand tracking algorithms that rely on a single camera as the sensing device can provide only relative depth information, which limits their practicality. This limitation underscores the need for accurate estimation of the absolute distances between hand joints and the camera in the real world. We address this need by introducing a methodology that exploits the autofocus functionality of a camera for hand tracking. The methodology takes advantage of this otherwise unused capability of the camera and removes the need for additional power-demanding and costly depth sensors when estimating the absolute distances of hand joints. In our experimental validation, the methodology consistently outperforms traditional methods across different lens positions.