Endobronchial intervention is increasingly used as a minimally invasive means of lung intervention. Vision-based localization approaches are often sensitive to image artifacts in bronchoscopic videos. In this paper, a robust navigation system based on a context-aware depth recovery approach for monocular video images is presented. To handle the artifacts, a conditional generative adversarial learning framework is proposed for reliable depth recovery. The accuracy of depth estimation and camera localization is validated on an in vivo dataset. Both quantitative and qualitative results demonstrate that the depth recovered with the proposed method better preserves the structural information of airway lumens in the presence of image artifacts, and the improved camera localization accuracy demonstrates its clinical potential for bronchoscopic navigation.