Tensegrity robots, prototypical examples of hybrid soft–rigid robots, exhibit dynamical properties that provide ruggedness and adaptability. These same properties, however, pose major challenges for locomotion control. Owing to their high dimensionality and complex contact-state evolution, data-driven approaches are well suited to producing viable feedback policies for tensegrities. Guided policy search (GPS), a sample-efficient hybrid of trajectory optimization and reinforcement learning, has previously been applied to generate periodic, axis-constrained locomotion for an icosahedral tensegrity on flat ground. Varying environments and tasks, however, call for more adaptive and general locomotion control that actively exploits an expanded space of robot states, which in turn substantially increases the required sample data and setup effort. This work mitigates those requirements by proposing a new GPS-based reinforcement learning pipeline that exploits the vehicle's high degree of symmetry and learns contextual behaviors that are sustainable without periodicity. Newly achieved capabilities include axially unconstrained rolling, rough-terrain traversal, and rough-incline ascent. These tasks are evaluated in simulation over a small set of variations in key model parameters and tested on the NASA hardware prototype, SUPERball. The results confirm the utility of symmetry exploitation and the adaptability of the vehicle, and they shed light on several strengths and limitations of the GPS framework for policy design and transfer to real hybrid soft–rigid robots.
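To make the symmetry-exploitation idea concrete, the sketch below shows one common way such a scheme can be structured: map the current state and goal into a canonical frame using a symmetry of the structure (an actuator-index permutation paired with a rotation), query a single policy trained in that frame, and map the action back. This is a minimal illustrative sketch, not the paper's actual pipeline; the function names, the 24-cable layout, and the symmetry representation are all assumptions.

```python
import numpy as np

N_CABLES = 24  # actuated cables assumed for a 6-bar icosahedral tensegrity (illustrative)

def canonicalize(cable_lengths, goal_dir, symmetries):
    """Search a discrete symmetry group for the transform that best aligns
    the goal direction with a canonical +x rolling axis.

    symmetries: list of (perm, R) pairs, where perm permutes actuator
    indices and R is the matching 3x3 rotation on world-frame vectors.
    (Hypothetical representation; the real robot's symmetry tables differ.)
    """
    best = None
    for perm, R in symmetries:
        score = (R @ goal_dir)[0]  # alignment with the canonical +x axis
        if best is None or score > best[0]:
            best = (score, perm, R)
    _, perm, R = best
    return cable_lengths[perm], R @ goal_dir, perm

def act(policy, cable_lengths, goal_dir, symmetries):
    """Evaluate the canonical-frame policy, then undo the index permutation
    so the action applies to the robot's physical actuators."""
    canon_state, canon_goal, perm = canonicalize(cable_lengths, goal_dir, symmetries)
    canon_action = policy(np.concatenate([canon_state, canon_goal]))
    action = np.empty_like(canon_action)
    action[perm] = canon_action
    return action

# Usage with a trivial one-element symmetry group and a placeholder policy:
symmetries = [(np.arange(N_CABLES), np.eye(3))]
policy = lambda x: 0.1 * x[:N_CABLES]  # stand-in for a learned GPS policy
a = act(policy, np.ones(N_CABLES), np.array([0.0, 1.0, 0.0]), symmetries)
```

The payoff of this pattern is that training data gathered while rolling in one direction over one face effectively covers every symmetric orientation, which is one way the sample-efficiency requirements described above can be reduced.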