Photoacoustic (PA) imaging is a hybrid imaging modality with good optical contrast and spatial resolution. Portable, cost-effective light-emitting diodes (LEDs) with a small footprint are rapidly becoming important PA optical sources. However, the key challenge faced by LED-based systems is low light fluence, which is generally compensated for by heavy frame averaging, consequently reducing the acquisition frame rate. In this study, we present a simple deep learning U-Net framework that enhances the signal-to-noise ratio (SNR) and contrast of PA images obtained by averaging a low number of frames. The SNR increased approximately fourfold for both in-class in vitro phantoms (4.39 ± 2.55) and out-of-class in vivo models (4.27 ± 0.87). We also demonstrate the noise invariance of the network and discuss its limitations (blurry output and failure to reduce salt-and-pepper noise). Overall, the developed U-Net framework can provide a real-time image enhancement platform for clinically translatable PA imaging systems based on low-cost, low-energy light sources.
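The trade-off between frame averaging and SNR can be illustrated with a minimal synthetic sketch. It is not the pipeline used in this study: the square target, the additive Gaussian noise model, the frame counts, the SNR definition (mean target amplitude over background standard deviation, in dB) and the helper names `snr_db` and `average_frames` are all assumptions made for illustration. For uncorrelated noise, averaging N frames lowers the noise standard deviation by a factor of about √N, which is why high-SNR LED-based acquisitions need many frames.

```python
import numpy as np


def snr_db(image, signal_mask):
    """SNR in dB, assumed here to be mean target amplitude over background std."""
    signal = image[signal_mask].mean()
    noise = image[~signal_mask].std()
    return 20.0 * np.log10(signal / noise)


def average_frames(frames, n):
    """Average the first n frames of a (num_frames, height, width) stack."""
    return frames[:n].mean(axis=0)


rng = np.random.default_rng(0)

# Hypothetical ground truth: a bright square target on a dark background.
truth = np.zeros((64, 64))
truth[28:36, 28:36] = 1.0
mask = truth > 0

# 256 noisy frames; additive Gaussian noise is an assumed noise model.
frames = truth + rng.normal(0.0, 1.0, size=(256, 64, 64))

snr_low = snr_db(average_frames(frames, 4), mask)     # few frames: fast, noisy
snr_high = snr_db(average_frames(frames, 256), mask)  # many frames: slow, clean

print(f"SNR with   4 frames averaged: {snr_low:.1f} dB")
print(f"SNR with 256 frames averaged: {snr_high:.1f} dB")
```

In this toy setting, going from 4 to 256 averaged frames costs 64 times more acquisitions per displayed image. A learned enhancement step aims to recover part of that SNR gap from the low-average image alone.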