Scene modeling is crucial for robots that need to perceive, reason about, and manipulate the objects in their environments. In this paper, we adapt and extend Boltzmann Machines (BMs) for contextualized scene modeling. Although there are many models on the subject, ours is the first to bring together objects, relations, and affordances in a highly capable generative model. To this end, we introduce a hybrid version of BMs in which relations and affordances are incorporated into the model through shared, tri-way connections. Moreover, we introduce a dataset for relation estimation and modeling studies. We evaluate our method against several baselines on object estimation, out-of-context object detection, relation estimation, and affordance estimation tasks. Furthermore, to illustrate the generative capability of the model, we show several example scenes that the model is able to generate, and we demonstrate the benefits of the model on a humanoid robot. The code and the dataset are made publicly available at: https://github.com/bozcani/COSMO