The question "what is on the table?" is normally simple for a human, but difficult for a machine. The problem is that the machine does not know what to search for, as no visual properties of the targets are known. Machine-vision algorithms, in general, need explicit knowledge of visual properties to perform object detection. Moreover, several visual properties must be considered to provide robustness. Such requirements make object detection computationally demanding, and hence common algorithms scale poorly with respect to the number of objects and their visual properties.

To address these problems, a system inspired by findings from experimental psychology has been developed. The system is designed to search for objects at a specified place, e.g. things on a table or obstacles on a road. For such tasks many visual properties need to be processed. The presented system distributes the processing of visual properties and integrates only a relevant subset of the processed data. The relevant subset of data is found by forming object hypotheses from homogeneous regions in the scene. Hence the complexity of integrating a large set of visual properties is reduced.

This thesis first provides a survey of findings from experimental psychology, which give insight into the strategies used by the human visual system. From this survey it is clear that the processing of visual data is distributed across our visual cortex. Attentional mechanisms cooperate to fuse only a relevant subset of the data. One example of such mechanisms is object formation.

The presented system is also inspired by game theory, a field in which distributed computing and cooperation have been studied for quite some time. This thesis provides an overview of game theory and evaluates its applicability to visual attention.

The system is evaluated in the context of a tabletop scenario: detecting objects on a table in a natural environment. The evaluation demonstrates that a sparse set of data is indeed enough for object detection when the visual context is known and the scene is not too cluttered.

Sammanfattning (Summary)

The question "what is on the table?" is usually easy for a human but hard for a computer. The problem for the computer is that it does not know what to look for; it does not know which visual properties the sought objects have. In general, computer-vision algorithms need explicit knowledge of visual properties for object detection. Moreover, many visual properties need to be considered to provide robustness. Such requirements make object-detection algorithms computationally intensive and limit their scalability with respect to the number of objects and their visual properties. A system that addresses these problems has been developed. The system is inspired by experimental psychology and is intended to search for objects at a specified place, e.g. things on a table or obstacles on a road. The computation of visual properties is distributed, and only a relevant subset of the data is integrated. The relevant subset is identified by forming object hypotheses from homogeneous surfaces in ...