Mammalian visual systems are characterized by their ability to recognize stimuli invariant to various transformations. Here, we investigate the hypothesis that this ability is achieved by the temporal encoding of visual stimuli. By using a model of a cortical network, we show that this encoding is invariant to several transformations and robust with respect to stimulus variability. Furthermore, we show that the proposed model provides a rapid encoding, in accordance with recent physiological results. Taking into account properties of primary visual cortex, the application of the encoding scheme to an enhanced network demonstrates favorable scaling and high performance in a task humans excel at.

Mammals demonstrate highly evolved visual object recognition skills, tolerating considerable changes in images caused by, for instance, different viewing angles and deformations. Elucidating the mechanisms of such invariant pattern recognition is an active field of research in neuroscience (1-5). However, very little is known about the underlying algorithms and mechanisms. A number of models have been proposed that aim to reproduce capabilities of the biological visual system, such as invariance to shifts in position, rotation, and scaling (6-8). Most of these models are based on the "Neocognitron" (9), a hierarchical multilayer network of spatial feature detectors. As a result of a gradual increase of receptive field sizes, translation-invariant representations emerge in the form of activity patterns at the highest level. These models do not consider time as a coding dimension for neural representations. Recently, however, the importance of the temporal dynamics of neuronal activity in representing visual stimuli has gained increased attention (10, 11). Hence, it seems timely to consider the role of temporal coding in the context of tasks like invariant object recognition.
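The principle behind such hierarchical models can be illustrated with a toy sketch (not the Neocognitron itself, and all function names here are our own): each stage pools the maximum response over a local neighborhood, so receptive fields grow stage by stage, and a feature elicits the same top-level activity regardless of where it appears in the input.

```python
import numpy as np

def pool(layer, width=2):
    # Each unit takes the maximum response over a local neighborhood,
    # enlarging receptive fields at every stage (a stand-in for the
    # Neocognitron's complex-cell pooling).
    n = len(layer) // width
    return np.array([layer[i * width:(i + 1) * width].max() for i in range(n)])

def hierarchy(responses, stages=4):
    # Apply pooling repeatedly; the top level sees the whole input.
    out = responses
    for _ in range(stages):
        out = pool(out)
    return out

feature_map = np.zeros(16)
feature_map[2] = 1.0           # a feature detected at one position
shifted_map = np.zeros(16)
shifted_map[9] = 1.0           # the same feature at another position

top_a = hierarchy(feature_map)
top_b = hierarchy(shifted_map)
assert np.array_equal(top_a, top_b)  # identical top-level activity pattern
```

Note that the invariant representation here is a static activity pattern at the highest level; time plays no role, which is precisely the limitation the present work addresses.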
In recent years, several modeling studies have addressed properties of temporal codes (12-14). For instance, Buonomano and Merzenich (15) proposed a model for position-invariant pattern recognition that uses temporal coding. In this model, feed-forward inhibition modulates spike timing such that stimuli are represented by the response latencies of the neurons in the network. This architecture naturally leads to translation-invariant representations. However, it assigns a critical role to inhibitory interactions in the feed-forward path of the visual system (retina-LGN-V1), whereas anatomical studies suggest that these connections are predominantly excitatory (16). Furthermore, the majority of inputs to cortical neurons are excitatory and of cortical origin (17). Indeed, a recent theoretical study has shown that lateral excitatory coupling has pronounced effects on the global network dynamics (18). In particular, the combination of intracortical connectivity and dendritic processing allowed context-dependent representations of different stimuli to be expressed in the temporal dynamics of the network. Here, we build on these previous proposal...
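Why latency coding yields translation invariance can be seen in a minimal sketch (a toy illustration, not the Buonomano-Merzenich model; the encoding rule and function names are our own assumptions): if stronger inputs fire earlier and the readout discards neuron identity, keeping only spike times relative to the first spike, then shifting the stimulus relabels which neurons fire but leaves the temporal pattern unchanged.

```python
import numpy as np

def latency_code(stimulus, t_max=50.0):
    # Toy latency encoding: stronger inputs fire earlier (latency
    # inversely related to drive); silent neurons get infinite latency.
    lat = np.full(stimulus.shape, np.inf)
    active = stimulus > 0
    lat[active] = t_max / stimulus[active]
    return lat

def temporal_signature(latencies):
    # Discard neuron identity: keep the sorted spike times relative
    # to the earliest spike. A positional shift of the stimulus
    # permutes neuron labels but leaves this signature unchanged.
    spikes = np.sort(latencies[np.isfinite(latencies)])
    return spikes - spikes[0]

pattern = np.array([0, 0, 1.0, 2.0, 3.0, 0, 0, 0])
shifted = np.roll(pattern, 3)  # same pattern at another position

sig_a = temporal_signature(latency_code(pattern))
sig_b = temporal_signature(latency_code(shifted))
assert np.allclose(sig_a, sig_b)  # identical temporal signature
```

The sketch captures only the invariance argument; the original model achieves the latency modulation through feed-forward inhibition, which is exactly the anatomically questionable ingredient discussed above.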