In situ resource utilization is a core goal of lunar exploration, and it depends on accurate pose estimation of lunar rocks. To address the challenges posed by scarce texture features and extreme lighting conditions, this study proposes the Simulation-YOLO-Hourglass-Transformer (SYHT) method. First, the YOLO-Hourglass-Transformer (YHT) network extracts keypoint information from each rock and generates the corresponding 3D pose. Then, a physics-based simulation model of lunar surface imaging generates simulated lunar rock data for testing the method. Experimental results show that SYHT achieves a mean per-joint position error (MPJPE) of 37.93 mm and a percentage of correct keypoints (PCK) of 99.94% on the simulated lunar rock data, significantly outperforming existing methods. Finally, transfer learning experiments on real-world datasets validate its generalization capability, indicating that it is effective for lunar rock pose estimation in both simulated and real lunar environments. The method thus improves accuracy and robustness in complex lunar environments, particularly under extreme lighting and scarce texture, provides insights for object pose estimation in lunar exploration tasks, and lays a foundation for lunar resource development.
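For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how MPJPE and PCK are commonly computed from predicted and ground-truth 3D keypoints. It is an illustration under stated assumptions, not the evaluation code used in this study: the function names, the array layout `(samples, keypoints, 3)`, and the distance threshold passed to `pck` are hypothetical, and the abstract does not state which PCK threshold was used.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error: the average Euclidean distance
    between predicted and ground-truth keypoints, in the input units."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def pck(pred, gt, threshold):
    """Percentage of correct keypoints: share of keypoints whose error
    is at most `threshold` (an assumed, user-chosen distance)."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    dists = np.linalg.norm(pred - gt, axis=-1)
    return float((dists <= threshold).mean() * 100.0)

# Toy data: one rock with four keypoints, coordinates in mm (made up).
gt = np.zeros((1, 4, 3))
pred = np.array([[[3.0, 4.0, 0.0],
                  [0.0, 0.0, 0.0],
                  [0.0, 6.0, 8.0],
                  [0.0, 0.0, 1.0]]])

print(mpjpe(pred, gt))               # per-keypoint errors are 5, 0, 10, 1 mm
print(pck(pred, gt, threshold=5.0))  # three of four keypoints are within 5 mm
```

A lower MPJPE and a higher PCK both indicate better keypoint localization. PCK values are only comparable between methods when the same threshold is used.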