Multi-signal detection, which refers to jointly detecting the presence of multiple signals in an observed frequency band and estimating their carrier frequencies and bandwidths, is of great significance in civil and military applications such as cognitive radio (CR), spectrum monitoring, and signal reconnaissance. In this work, a deep learning-based framework named SigdetNet is proposed, which takes the power spectrum as the network’s input and localizes the signals in the frequency domain. In the proposed framework, Welch’s periodogram is applied to reduce the variance of the power spectral density (PSD) estimate, followed by a logarithmic transformation for signal enhancement. In particular, an encoder-decoder network with an embedded pyramid pooling module is constructed to extract multi-scale features relevant to signal detection. The influence of the frequency resolution, network architecture, and loss function on the detection performance is investigated. Extensive simulations demonstrate that the proposed multi-signal detection method achieves better detection performance than the benchmark schemes.
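The preprocessing stage described above (Welch averaging followed by a logarithmic transformation) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `welch_log_psd`, the Hann window, the 50% overlap, the segment length, and the small constant `eps` are all assumptions chosen for illustration, and the test signal (two complex tones in white noise) is synthetic.

```python
import numpy as np


def welch_log_psd(x, fs, nperseg=256, overlap=0.5, eps=1e-12):
    """Welch-averaged PSD of a complex baseband signal, returned in dB.

    Hypothetical helper: window, overlap, and eps are illustrative choices,
    not values taken from the paper.
    """
    x = np.asarray(x)
    if len(x) < nperseg:
        raise ValueError("signal is shorter than one segment")

    step = max(1, int(nperseg * (1.0 - overlap)))
    window = np.hanning(nperseg)
    scale = fs * np.sum(window ** 2)  # normalizes to a power spectral density
    n_segments = 1 + (len(x) - nperseg) // step

    # Average the windowed periodograms of overlapping segments;
    # the averaging is what reduces the variance of the estimate.
    psd = np.zeros(nperseg)
    for k in range(n_segments):
        segment = x[k * step : k * step + nperseg] * window
        psd += np.abs(np.fft.fft(segment)) ** 2 / scale
    psd /= n_segments

    # Reorder so that frequencies run from -fs/2 up to just below +fs/2.
    freqs = np.fft.fftshift(np.fft.fftfreq(nperseg, d=1.0 / fs))
    psd = np.fft.fftshift(psd)

    # Logarithmic transformation compresses the dynamic range.
    return freqs, 10.0 * np.log10(psd + eps)


# Synthetic example: two complex tones in complex white Gaussian noise.
rng = np.random.default_rng(0)
n_samples = 8192
t = np.arange(n_samples)
noise = np.sqrt(0.5) * (rng.standard_normal(n_samples)
                        + 1j * rng.standard_normal(n_samples))
signal = np.exp(2j * np.pi * -0.25 * t) + np.exp(2j * np.pi * 0.125 * t) + noise

freqs, log_psd = welch_log_psd(signal, fs=1.0, nperseg=256)
print(freqs.shape, log_psd.shape)
```

In this sketch, `nperseg` sets the frequency resolution (`fs / nperseg` per bin), so a longer segment resolves narrower signals but leaves fewer segments to average for a fixed observation length. The resulting `log_psd` vector is the kind of one-dimensional input a detection network could consume.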