A real-time method that automatically creates a visual memory of a scene using the growing neural gas (GNG) algorithm is described. The memory consists of a graph whose nodes encode the visual information of a video stream as a limited set of representative images. GNG nodes are automatically generated and dynamically clustered. This method could be employed by robotic platforms in exploratory and rescue missions.

Introduction: We propose an algorithm to automatically create a visual memory for robots equipped with cameras that survey, monitor or search an area of interest. We wish to encode the visual information into a limited set of representative images in real-time and with limited computational overhead. The idea behind our approach is to provide a flexible graphical representation of visual memory to be subsequently used for the semantic description of a captured scene. Our intention is to generate a graph that can be easily shared among robots or a distributed set of computational nodes and can be grown cooperatively.

The robot's visual memories are incrementally built using a growing neural gas (GNG) algorithm. The GNG was chosen because it provides flexibility and portability, dynamically building a graph from a video sequence of a scene. GNG was originally introduced by Fritzke [1] as an unsupervised learning technique that requires no prior training. GNG has been shown to outperform existing unsupervised methods, such as self-organising (Kohonen) maps and K-means [2,3]. In a GNG graph, nodes can become disconnected while the network evolves, creating a separation between uncorrelated memories. The number of nodes need not be fixed a priori, since nodes are incrementally added during execution. Insertion of new nodes ceases when a set of user-defined performance criteria is met or, alternatively, when the maximum network size is reached. The algorithm iteratively learns to identify similarities in the input data and classifies similar inputs into clusters.
GNG is much more than a simple clustering algorithm: it provides a means both to associate visual memories and to build ontologies of visual concepts.
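To make the behaviour described above concrete, the following is a minimal sketch of Fritzke's GNG update loop over low-dimensional feature vectors. All class and parameter names (`GrowingNeuralGas`, `fit_one`, `eps_b`, `lam`, etc.) are our own illustrative choices, not from the original paper, and the default hyperparameters are common textbook values rather than the ones used in this work; node removal after edge pruning is omitted for brevity.

```python
import random

class GrowingNeuralGas:
    """Minimal sketch of the GNG update loop (after Fritzke, 1995)."""

    def __init__(self, dim=2, eps_b=0.2, eps_n=0.006, age_max=50,
                 lam=100, alpha=0.5, d=0.995, max_nodes=100):
        self.eps_b, self.eps_n = eps_b, eps_n    # winner / neighbour learning rates
        self.age_max, self.lam = age_max, lam    # edge age limit, insertion interval
        self.alpha, self.d = alpha, d            # error redistribution / decay factors
        self.max_nodes = max_nodes
        self.nodes = [[random.random() for _ in range(dim)] for _ in range(2)]
        self.error = [0.0, 0.0]                  # accumulated error per node
        self.edges = {}                          # {frozenset({i, j}): age}
        self.step = 0

    def _dist2(self, a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    def fit_one(self, x):
        """Process one input vector x (one frame descriptor)."""
        self.step += 1
        # Find the nearest (s1) and second-nearest (s2) nodes to x.
        order = sorted(range(len(self.nodes)),
                       key=lambda i: self._dist2(self.nodes[i], x))
        s1, s2 = order[0], order[1]
        # Accumulate the winner's error and move it toward x.
        self.error[s1] += self._dist2(self.nodes[s1], x)
        for k in range(len(x)):
            self.nodes[s1][k] += self.eps_b * (x[k] - self.nodes[s1][k])
        # Age the winner's edges and nudge its topological neighbours.
        for e in list(self.edges):
            if s1 in e:
                self.edges[e] += 1
                n = next(iter(e - {s1}))
                for k in range(len(x)):
                    self.nodes[n][k] += self.eps_n * (x[k] - self.nodes[n][k])
        # Refresh the s1-s2 edge; prune edges older than age_max
        # (this is how uncorrelated memories become disconnected).
        self.edges[frozenset({s1, s2})] = 0
        self.edges = {e: a for e, a in self.edges.items() if a <= self.age_max}
        # Every lam steps, insert a node between the highest-error node q
        # and its highest-error neighbour f, until max_nodes is reached.
        if self.step % self.lam == 0 and len(self.nodes) < self.max_nodes:
            q = max(range(len(self.nodes)), key=lambda i: self.error[i])
            nbrs = [next(iter(e - {q})) for e in self.edges if q in e]
            if nbrs:
                f = max(nbrs, key=lambda i: self.error[i])
                r = len(self.nodes)
                self.nodes.append([(a + b) / 2
                                   for a, b in zip(self.nodes[q], self.nodes[f])])
                self.error[q] *= self.alpha
                self.error[f] *= self.alpha
                self.error.append(self.error[q])
                del self.edges[frozenset({q, f})]
                self.edges[frozenset({q, r})] = 0
                self.edges[frozenset({f, r})] = 0
        # Decay all accumulated errors.
        self.error = [e * self.d for e in self.error]
```

Feeding the network descriptors drawn from two well-separated regions of the input space will, over time, grow two clusters of nodes whose connecting edges age out, yielding the disconnected sub-graphs (separated memories) discussed above.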