Abstract: Creating robots that can act autonomously in dynamic, unstructured environments requires dealing with novel objects. Thus, an off-line learning phase is not sufficient for recognizing and manipulating such objects. Rather, an autonomous robot needs to acquire knowledge through its own interaction with its environment, without using heuristics encoding human insights about the domain. Interaction also elicits information that is not present in static images of a scene. To learn efficiently, a robot must select, out of a potentially large set of possible interactions, the actions that are expected to have the most informative outcomes. In the proposed bottom-up, probabilistic approach, the robot achieves this goal by quantifying the expected informativeness of its own actions in information-theoretic terms. We use this approach to segment a scene into its constituent objects, retaining a probability distribution over segmentations. We show in real-world experiments that this approach is robust to noise and uncertainty. Evaluations show that the proposed information-theoretic approach allows a robot to efficiently determine the composite structure of its environment. We also show that our probabilistic model allows straightforward integration of multiple modalities, such as movement data and static scene features. Learned static scene features allow experience from similar environments to speed up learning in new scenes.
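
As an illustration of what quantifying expected informativeness can look like, the following minimal Python sketch scores each candidate action by its expected reduction in Shannon entropy over a discrete set of segmentation hypotheses and picks the highest-scoring action. It is not the model used in this work: the finite hypothesis set, the tabular outcome models, and the names `entropy`, `expected_information_gain`, `select_action`, `push_a`, and `push_b` are assumptions introduced purely for illustration.

```python
import math


def entropy(p):
    """Shannon entropy in bits of a discrete distribution given as a list."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def expected_information_gain(prior, likelihood):
    """Expected entropy reduction over hypotheses from observing one outcome.

    prior[h]         -- P(h), belief over candidate segmentations (hypothetical)
    likelihood[h][o] -- P(o | h, action), assumed outcome model for one action
    """
    gain = entropy(prior)
    n_outcomes = len(likelihood[0])
    for o in range(n_outcomes):
        # Joint probability of each hypothesis together with outcome o.
        joint = [prior[h] * likelihood[h][o] for h in range(len(prior))]
        p_o = sum(joint)
        if p_o == 0:
            continue  # An impossible outcome contributes nothing.
        posterior = [j / p_o for j in joint]
        gain -= p_o * entropy(posterior)
    return gain


def select_action(prior, outcome_models):
    """Return the action name with the largest expected information gain."""
    return max(
        outcome_models,
        key=lambda a: expected_information_gain(prior, outcome_models[a]),
    )


# Toy example with two hypotheses: two scene parts belong to the same
# object (index 0) or to different objects (index 1).
prior = [0.5, 0.5]
outcome_models = {
    # Outcomes: [parts move together, parts move separately].
    "push_a": [[0.9, 0.1], [0.1, 0.9]],  # outcome depends on the hypothesis
    "push_b": [[0.5, 0.5], [0.5, 0.5]],  # outcome is independent of it
}
best = select_action(prior, outcome_models)
```

In this toy setting `push_b` has zero expected gain because its outcome does not depend on the hypothesis, so `select_action` prefers `push_a`. After the chosen action is executed, the posterior for the observed outcome would replace `prior` and the selection would repeat.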