Abstract-In recent years, Convolutional Neural Networks (ConvNets) have become the quintessential component of several state-of-the-art Artificial Intelligence tasks. Across the spectrum of applications, the performance needs vary significantly, from high-throughput image recognition to the very low-latency requirements of autonomous cars. In this context, FPGAs can provide a potential platform that can be optimally configured based on different performance requirements. However, with the increasing complexity of ConvNet models, the architectural design space becomes overwhelmingly large, asking for principled design flows that address the application-level needs. This paper presents a latency-driven design methodology for mapping ConvNets on FPGAs. The proposed design flow employs novel transformations over a Synchronous Dataflow-based modelling framework together with a latency-centric optimisation procedure in order to efficiently explore the design space targeting low-latency designs. Quantitative evaluation shows large improvements in latency when latency-driven optimisation is in place yielding designs that improve the latency of AlexNet by 73.54× and VGG16 by 5.61× over throughput-optimised designs.