Everyday robotics faces the challenge of autonomous product handling in applications such as logistics and retail, where manipulation may damage the items. Traditionally, most approaches try to minimize physical interaction with the goods. In contrast, this paper proposes to explicitly take unintended object motion into account and to learn damage-minimizing manipulation strategies in a self-supervised way. The presented approach uses a simulation-based planning method to find a manipulation sequence that is optimal with respect to possible damage. The planned manipulation sequences are generalized to new, unseen scenes in the same application scenario using machine learning. The learned manipulation strategy is continuously refined in a self-supervised, simulation-in-the-loop optimization cycle during load-free times of the system, a process commonly known as mental simulation. In parallel, the generated manipulation strategies can be deployed in near real time in an anytime fashion. The approach is validated on an industrial container-unloading scenario and on a retail shelf-replenishment scenario.