Detection and identification of pathogenic bacteria isolated from biological samples (blood, urine, sputum, etc.) are crucial steps in accelerated clinical diagnosis. However, accurate and rapid identification remain difficult to achieve due to the challenge of having to analyse complex and large samples. Current solutions (mass spectrometry, automated biochemical testing, etc.) propose a trade-off between time and accuracy, achieving satisfactory results at the expense of time-consuming processes, which can also be intrusive, destructive and costly. Moreover, those techniques tend to require an overnight subculture on solid agar medium delaying bacteria identification by 12–48 hours, thus preventing rapid prescription of appropriate treatment as it hinders antibiotic susceptibility testing. In this study, lens-free imaging is presented as a possible solution to achieve a quick and accurate wide range, non-destructive, label-free pathogenic bacteria detection and identification in real-time using micro colonies (10–500 μm) kinetic growth pattern combined with a two-stage deep learning architecture. Bacterial colonies growth time-lapses were acquired thanks to a live-cell lens-free imaging system and a thin-layer agar media made of 20 μl BHI (Brain Heart Infusion) to train our deep learning networks. Our architecture proposal achieved interesting results on a dataset constituted of seven different pathogenic bacteria—Staphylococcus aureus (S. aureus), Enterococcus faecium (E. faecium), Enterococcus faecalis (E. faecalis), Staphylococcus epidermidis (S. epidermidis), Streptococcus pneumoniae R6 (S. pneumoniae), Streptococcus pyogenes (S. pyogenes), Lactococcus Lactis (L. Lactis). At T = 8h, our detection network reached an average 96.0% detection rate while our classification network precision and sensitivity averaged around 93.1% and 94.0% respectively, both were tested on 1908 colonies. Our classification network even obtained a perfect score for E. faecalis (60 colonies) and very high score for S. epidermidis at 99.7% (647 colonies). Our method achieved those results thanks to a novel technique coupling convolutional and recurrent neural networks together to extract spatio-temporal patterns from unreconstructed lens-free microscopy time-lapses.