In computer vision, most monocular cameras used to estimate an object’s position recover only its 2D information, without depth. Estimating depth, that is, the distance between the target object and the camera, is challenging. In this paper, a less computationally intensive method is used to estimate the object’s distance, completing the 3D information needed to determine the object’s location in Cartesian space. The object was placed in front of the camera at a sequence of distances, each of which was measured directly. The measured distances, together with a set of training data obtained from the images, were fitted to a curve using the least-squares framework to derive a non-linear function for estimating the object’s distance, also known as the z-coordinate. The experiment showed an average error of 1.33 mm between the actual and estimated distances of the object. Hence, this method can be applied in many robotic and autonomous systems applications.
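The curve-fitting step can be illustrated with a minimal sketch. The abstract does not state which image feature or which functional form was used, so everything here is an assumption: the feature is taken to be the object's apparent width in pixels, the model is the pinhole-inspired form z = a / w + b, and the function names `fit_distance_model` and `estimate_distance` are hypothetical. The calibration data are synthetic and noise-free, generated only to exercise the code; they are not the measurements from the experiment.

```python
import numpy as np


def fit_distance_model(pixel_widths, distances_mm):
    """Least-squares fit of the assumed model z = a / w + b."""
    w = np.asarray(pixel_widths, dtype=float)
    z = np.asarray(distances_mm, dtype=float)
    # The model is non-linear in w but linear in the parameters (a, b),
    # so an ordinary linear least-squares solve is sufficient.
    design = np.column_stack([1.0 / w, np.ones_like(w)])
    (a, b), *_ = np.linalg.lstsq(design, z, rcond=None)
    return float(a), float(b)


def estimate_distance(pixel_width, a, b):
    """Estimate the z-coordinate (mm) from an apparent width in pixels."""
    return a / np.asarray(pixel_width, dtype=float) + b


# Synthetic calibration set (illustrative only, NOT data from the paper):
# the object is "placed" at 200 mm, 300 mm, ..., 1000 mm and its apparent
# width is generated from assumed ground-truth parameters.
true_a, true_b = 60000.0, 5.0
train_z = np.arange(200.0, 1001.0, 100.0)
train_w = true_a / (train_z - true_b)

a, b = fit_distance_model(train_w, train_z)
z_hat = estimate_distance(train_w, a, b)
mean_abs_error = float(np.mean(np.abs(z_hat - train_z)))
print(f"a={a:.1f}, b={b:.2f}, mean abs error={mean_abs_error:.6f} mm")
```

With real measurements the residual would not vanish as it does for this noise-free data. A different functional form, such as a power law or a polynomial, might fit better and could require an iterative non-linear solver instead of a single linear solve.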