Using the Horizon-AGN hydrodynamical simulation and self-organising maps (SOMs), we show how to compress the complex, high-dimensional data structure of a simulation into a two-dimensional grid, which greatly facilitates the analysis of how galaxy observables are connected to intrinsic properties. We first verify the tight correlation between the observed 0.3−5 µm broad-band colours of Horizon-AGN galaxies and their high-resolution spectra. The correlation extends to physical properties such as redshift, stellar mass, and star formation rate (SFR). This direct mapping from colour space to physical parameter space still holds after including photometric uncertainties that mimic those of the COSMOS survey. We then label the SOM grid with a simulated calibration sample and estimate redshift and SFR for COSMOS-like galaxies up to z ∼ 3. Compared with state-of-the-art techniques based on synthetic templates, our method is comparable in performance but less biased at estimating redshifts, and significantly better at predicting SFRs. In particular, our "data-driven" approach, in contrast to model libraries, intrinsically allows for the complexity of galaxy formation and can handle sample biases. We advocate that obtaining the calibration for this method should be one of the goals of next-generation galaxy surveys.
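
As a rough illustration of the workflow summarised above (train a SOM on galaxy colours, label its cells with a calibration sample, then read off estimates for other galaxies), the following minimal sketch uses only the Python standard library. The toy colour-redshift relation, the grid size, the learning schedule, and all function names are assumptions made for illustration; they are not taken from Horizon-AGN or COSMOS and do not reproduce the pipeline used in this work.

```python
import math
import random
from statistics import median


def make_galaxies(n, rng, noise=0.05):
    """Toy catalogue (assumption): two colours that depend linearly on redshift."""
    catalogue = []
    for _ in range(n):
        z = rng.uniform(0.0, 3.0)
        colours = (0.5 * z + rng.gauss(0.0, noise),
                   1.0 - 0.3 * z + rng.gauss(0.0, noise))
        catalogue.append((colours, z))
    return catalogue


def sq_dist(a, b):
    """Squared Euclidean distance between two colour vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def best_cell(weights, x):
    """Index of the cell whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda k: sq_dist(weights[k], x))


def train_som(data, rows, cols, n_iter, rng):
    """Online SOM training with a Gaussian neighbourhood that shrinks over time."""
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    for t in range(n_iter):
        frac = t / n_iter
        rate = 0.5 * (1.0 - frac)                                 # decaying learning rate
        sigma = max(0.5, 0.5 * max(rows, cols) * (1.0 - frac))   # neighbourhood radius
        x = rng.choice(data)
        b_row, b_col = divmod(best_cell(weights, x), cols)
        for k in range(rows * cols):
            row, col = divmod(k, cols)
            grid_d2 = (row - b_row) ** 2 + (col - b_col) ** 2
            h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
            weights[k] = [w + rate * h * (v - w) for w, v in zip(weights[k], x)]
    return weights


def label_cells(weights, calibration):
    """Assign each occupied cell the median redshift of its calibration galaxies."""
    per_cell = {}
    for colours, z in calibration:
        per_cell.setdefault(best_cell(weights, colours), []).append(z)
    return {k: median(zs) for k, zs in per_cell.items()}


def predict(weights, labels, colours):
    """Return the label of the closest labelled cell in colour space."""
    k = min(labels, key=lambda j: sq_dist(weights[j], colours))
    return labels[k]


rng = random.Random(0)
targets = make_galaxies(1000, rng)       # photometric sample; redshifts treated as unknown
calibration = make_galaxies(300, rng)    # smaller sample with known redshifts

weights = train_som([c for c, _ in targets], rows=6, cols=6, n_iter=2000, rng=rng)
labels = label_cells(weights, calibration)
estimates = [predict(weights, labels, c) for c, _ in targets]
```

Here the map is trained on colours alone, and the physical quantity enters only through the labelling step, so the same trained grid could be relabelled with another property (for example SFR) without retraining. Searching only the labelled cells in `predict` is one simple way to handle cells that no calibration galaxy occupies; other choices are possible.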