Agricultural robotics is a complex and challenging research topic. Orchard environments in particular present harsh operating conditions for robots, such as irregular terrain, variable illumination, and inaccurate GPS signals. To overcome these challenges, reliable landmarks must be extracted from the environment. This study addresses accurate, low-cost, and efficient landmark identification in orchards to enable robot row-following. First, deep learning, integrated with depth information, is used for real-time trunk detection and localization. The in-house dataset used to train the models contains 2453 manually annotated trunks. The trunk detector achieves an overall mAP of 81.6%, an inference time of 60 ms, and a localization error of 9 mm at a distance of 2.8 m. Second, the environmental features obtained in the first step are fed into the dynamic window approach (DWA), which performs reactive obstacle avoidance while attempting to reach the row-end destination. The final solution accounts for the robot's kinematic and dynamic limits, enabling it to maintain the row path and avoid obstacles. Simulations and field tests demonstrated that, even when starting with an initial deviation, the robot could automatically adjust its position and drive through the rows in a real orchard.
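The second step, in which detected trunk positions are passed to the dynamic window approach (DWA), can be illustrated with a minimal sketch. It is not the implementation used in this study: the unicycle motion model, the `Limits` values, the cost weights, and the function names (`simulate`, `clearance`, `choose_velocity`) are all assumptions made for illustration. The sketch samples velocity pairs reachable within one control cycle, rolls each one forward, discards colliding trajectories, and keeps the cheapest remaining one.

```python
import math
from dataclasses import dataclass


@dataclass
class Limits:
    # All values are illustrative assumptions, not the robot's real limits.
    v_max: float = 1.0     # maximum forward speed (m/s)
    w_max: float = 1.0     # maximum turn rate (rad/s)
    acc: float = 0.5       # maximum linear acceleration (m/s^2)
    ang_acc: float = 2.0   # maximum angular acceleration (rad/s^2)
    radius: float = 0.3    # robot footprint radius (m)
    dt: float = 0.1        # control period (s)
    horizon: float = 2.0   # forward-simulation time (s)


def linspace(lo, hi, n):
    """Return n evenly spaced samples from lo to hi (n >= 2)."""
    return [lo + i * (hi - lo) / (n - 1) for i in range(n)]


def simulate(x, y, theta, v, w, lim):
    """Roll a unicycle model forward with constant (v, w); return (x, y) points."""
    points = []
    for _ in range(int(round(lim.horizon / lim.dt))):
        theta += w * lim.dt
        x += v * math.cos(theta) * lim.dt
        y += v * math.sin(theta) * lim.dt
        points.append((x, y))
    return points


def clearance(points, trunks, radius):
    """Smallest gap between the robot footprint and any trunk along the path."""
    if not trunks:
        return math.inf
    return min(math.hypot(px - tx, py - ty)
               for px, py in points for tx, ty in trunks) - radius


def choose_velocity(state, goal, trunks, lim, n_v=5, n_w=11):
    """Pick (v, w) from the dynamic window; fall back to (0, 0) if all collide."""
    x, y, theta, v0, w0 = state
    # Dynamic window: velocities reachable within one control period.
    v_lo = max(0.0, v0 - lim.acc * lim.dt)
    v_hi = min(lim.v_max, v0 + lim.acc * lim.dt)
    w_lo = max(-lim.w_max, w0 - lim.ang_acc * lim.dt)
    w_hi = min(lim.w_max, w0 + lim.ang_acc * lim.dt)

    best, best_cost = (0.0, 0.0), math.inf
    for v in linspace(v_lo, v_hi, n_v):
        for w in linspace(w_lo, w_hi, n_w):
            points = simulate(x, y, theta, v, w, lim)
            gap = clearance(points, trunks, lim.radius)
            if gap <= 0.0:
                continue  # trajectory touches a trunk: not admissible
            end_x, end_y = points[-1]
            cost = (1.0 * math.hypot(goal[0] - end_x, goal[1] - end_y)  # progress to row end
                    + 0.5 / gap                                         # stay clear of trunks
                    + 0.2 * (lim.v_max - v))                            # prefer higher speed
            if cost < best_cost:
                best, best_cost = (v, w), cost
    return best
```

In a row-following loop, `choose_velocity` would be called once per control cycle, with `trunks` refreshed from the latest detections and `goal` set to the row-end position. The three cost weights trade off progress, clearance, and speed, and would need tuning for a real platform.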