Abstract-Inter-object depth estimation is a major concern for micromanipulation using a scanning electron microscope (SEM). So far, various methods have been proposed for estimating this depth based on stereoscopic imaging. Most of them require an external hardware unit or manual interaction during the process. In this paper, image focus information is used to estimate both the inter-object depth for micromanipulation and the local per-pixel depth for 3D shape reconstruction. In both cases, the normalized variance is used as the sharpness criterion. For inter-object depth estimation, a visual servoing-based autofocusing method is used to maximize the sharpness in object region windows. For shape reconstruction, a stack of images is acquired by varying the working distance. These images are processed to find the maximum sharpness at each pixel, from which the surface is reconstructed. The developments are validated in a robotic handling scenario where the scene contains a microgripper and silicon microstructures.
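As an illustration of the focus-based depth idea summarized above, the following is a minimal sketch, not the implementation used in this work. It assumes the commonly used form of the normalized variance (intensity variance divided by mean intensity) and a simple per-pixel argmax over the focus stack. The function names, the `window` parameter, and the reflect padding at the image border are hypothetical choices made for illustration only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def normalized_variance(image):
    """Normalized variance of a whole image or of an object region window.

    Assumed form: mean squared deviation from the mean, divided by the mean.
    """
    img = np.asarray(image, dtype=np.float64)
    mean = img.mean()
    if mean == 0.0:
        return 0.0  # an all-black window carries no focus information
    return float(((img - mean) ** 2).mean() / mean)


def local_normalized_variance(image, window=5):
    """Normalized variance in a window x window neighbourhood of each pixel.

    `window` is assumed to be odd so the output keeps the input shape.
    """
    img = np.asarray(image, dtype=np.float64)
    pad = window // 2
    padded = np.pad(img, pad, mode="reflect")
    patches = sliding_window_view(padded, (window, window))
    mean = patches.mean(axis=(-2, -1))
    var = patches.var(axis=(-2, -1))
    out = np.zeros_like(mean)
    # Leave the sharpness at zero wherever the local mean is zero.
    np.divide(var, mean, out=out, where=mean > 0)
    return out


def depth_from_focus(stack, working_distances, window=5):
    """Per-pixel depth: working distance of the image with maximum sharpness."""
    sharpness = np.stack(
        [local_normalized_variance(im, window) for im in stack]
    )
    best = sharpness.argmax(axis=0)  # index of the sharpest image per pixel
    return np.asarray(working_distances, dtype=np.float64)[best]
```

Here `stack` is a sequence of equally sized grayscale images and `working_distances` lists the working distance at which each image was acquired. The resulting depth map is quantized to the acquired working distances; interpolating around the sharpness peak would be one way to refine it, which this sketch does not attempt.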