Robots are now widely used across the manufacturing, entertainment, and healthcare industries. Robot vision aims to equip robots with the ability to discover information, understand it, and interact with the environment, which requires an agent to effectively understand object affordances and functions in complex visual domains. This literature survey first focuses on “visual affordances” and summarizes current state-of-the-art approaches to the relevant problems, along with open problems and research gaps. It then discusses specific sub-problems, such as affordance detection, categorization, segmentation, and high-level affordance reasoning. It also covers functional scene understanding and the prevalent descriptors used for it in the literature. In addition, the survey provides the necessary background to the problem, sheds light on its significance, and highlights the existing challenges for affordance and functionality learning.