Neuromorphic perception with event-based sensors, asynchronous hardware, and spiking neurons shows promise for real-time, energy-efficient inference in embedded systems. Brain-inspired computing aims to enable adaptation to changes at the edge through online learning. However, the parallel and distributed architectures of neuromorphic hardware, which are based on co-localized compute and memory, impose locality constraints on on-chip learning rules. We propose the Event-based Three-factor Local Plasticity (ETLP) rule, which uses the pre-synaptic spike trace, the post-synaptic membrane voltage, and a third factor in the form of projected labels; these labels require no error calculation and also serve as update triggers. ETLP is applied to visual and auditory event-based pattern recognition using feedforward and recurrent spiking neural networks. Compared to Back-Propagation Through Time (BPTT), eProp and DECOLLE, ETLP achieves competitive accuracy with lower computational complexity. We also show that, when using local plasticity, threshold adaptation in spiking neurons and a recurrent topology are necessary to learn spatio-temporal patterns with a rich temporal structure. Finally, we provide a proof-of-concept hardware implementation of ETLP on FPGA to highlight the simplicity of its computational primitives and how they can be mapped onto neuromorphic hardware for online learning with real-time interaction and low energy consumption.
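
To make the three factors concrete, the following is a minimal sketch of a label-triggered local update, not the exact ETLP formulation. It assumes a boxcar function of the membrane voltage as the post-synaptic factor, a fixed random matrix `proj` that projects a one-hot label onto the post-synaptic neurons, and a positive update sign; the names `boxcar`, `three_factor_update`, `proj`, `lr`, `threshold` and `width` are all hypothetical and chosen for illustration only.

```python
import numpy as np

def boxcar(v, threshold=1.0, width=0.5):
    """Assumed post-synaptic factor: 1 where the voltage is near threshold, else 0."""
    return (np.abs(v - threshold) < width).astype(float)

def three_factor_update(w, pre_trace, v_post, label_onehot, proj, lr=0.01):
    """Apply one update, triggered by the arrival of a label event.

    w:            (n_post, n_pre) synaptic weights
    pre_trace:    (n_pre,)  low-pass filtered pre-synaptic spikes
    v_post:       (n_post,) post-synaptic membrane voltages
    label_onehot: (n_classes,) target label, used directly (no error term)
    proj:         (n_post, n_classes) fixed random projection (assumption)
    """
    third = proj @ label_onehot          # projected label, one value per post neuron
    local = boxcar(v_post) * third       # post-synaptic factor times third factor
    return w + lr * np.outer(local, pre_trace)

rng = np.random.default_rng(0)
n_pre, n_post, n_classes = 4, 3, 2
w = np.zeros((n_post, n_pre))
proj = rng.uniform(-1.0, 1.0, size=(n_post, n_classes))
pre_trace = np.array([0.0, 0.5, 1.0, 0.0])   # inputs 0 and 3 were silent
v_post = np.array([0.9, 0.1, 1.2])           # neuron 1 is far from threshold
label = np.array([1.0, 0.0])                 # label event for class 0

w_new = three_factor_update(w, pre_trace, v_post, label, proj)
```

In this sketch every quantity used to change a weight is available at the synapse or its post-synaptic neuron, and no output error is computed or propagated: synapses from silent inputs and synapses onto neurons far from threshold are left unchanged.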