The detection of earthquake signals is a fundamental yet challenging task in observational seismology. A robust automatic earthquake detection algorithm is strongly demanded in view of the ever‐growing global seismic dataset. Here, we develop an automatic earthquake detection framework based on a deep learning approach (SCALODEEP). It extracts high‐order features embedded in three‐component seismograms by encoding a time‐frequency representation of the data (scalogram) into a deep network with skip connections. The SCALODEEP is trained and validated on an open‐source dataset from North California, and then employed to seismicity detection in four areas, including Arkansas, Japan, Texas, and Egypt. Despite vastly varying characteristics of regional earthquakes (e.g., focal mechanism, duration, and noise level), SCALODEEP successfully detects seismic signals over a broad range of local magnitudes (as low as −1.3normalML) and outperforms conventional algorithms such as STA/LTA, FAST, and template matching. Compared to recently proposed deep learning based frameworks (e.g., CRED and Earthquake transformer), SCALODEEP achieves a superior generalization ability via a sophisticated network architecture. In summary, our study offers a promising new tool to improve existing earthquake detection systems and, as importantly, sheds light on designing an effective deep learning network for generalized earthquake detection.