VISOR is a large connectionist system that shows how visual schemas can be learned, represented, and used through mechanisms natural to neural networks. Processing in VISOR is based on cooperation, competition, and parallel bottom-up and top-down activation of schema representations. Simulations show that VISOR is robust against noise and variations in the inputs and parameters. It can indicate the confidence of its analysis, pay attention to important minor differences, and use context to recognize ambiguous objects. Experiments also suggest that the representation and learning are stable, and its behavior is consistent with human processes such as priming, perceptual reversal, and circular reaction in learning. The schema mechanisms of VISOR can serve as a starting point for building robust high-level vision systems, and perhaps for schema-based motor control and natural language processing systems as well.