Automating building processes through robotic systems has the potential to address the need for safer, more efficient, and sustainable construction operations. While ongoing research efforts often target the use of prefabricated materials in controlled environments, here we focus on utilizing objects found on‐site, such as irregularly shaped rocks and rubble, as a way of enabling novel types of construction in remote and extreme environments, where standard building materials might not be easily accessible. In this article, we present a perception and grasp pose planning pipeline for autonomous manipulation of objects of interest with a robotic walking excavator. The system incrementally builds a LiDAR‐based map of the robot's surroundings and can register externally reconstructed point clouds of the scene, for example, from images captured by a drone‐borne camera, which helps increase map coverage. In addition, object‐like instances, such as stones, are segmented out of this map. Based on this information, collision‐free grasping poses for the robotic manipulator are planned to enable picking and placing of these objects, which are tracked throughout the manipulation. The approach is validated in a real setting on an architecturally relevant scale by segmenting and manipulating boulders of several hundred kilograms, which is a first step towards the full automation of dry‐stack wall building processes. Video – https://youtu.be/4bc5n2-zj3Q