In environments with insufficient light, RGB images cannot provide clear pedestrian information, whereas thermal imaging can still provide accurate localization cues. Multispectral object detection is therefore more reliable and robust in the open world. Most existing fusion methods for multispectral object detection are static: once the network is trained, all modalities of every input are fed into the network for inference. However, when illumination is sufficiently good or extremely poor, multispectral input introduces unnecessary noise and computational redundancy. Therefore, a dynamic fusion network (EDFNet) is proposed to selectively fuse RGB and thermal data, so that the network can perform multispectral fusion efficiently and improve detection accuracy. A gating function within the fusion network makes modality-level decisions based on multimodal features, enabling dynamic inference over the data. Extensive experiments on multiple datasets demonstrate that the proposed fusion method reduces computation cost while achieving comparable detection performance.
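To make the modality-level gating idea concrete, the following is a minimal sketch, not the authors' EDFNet implementation: the module names (`ModalityGate`, `DynamicFusion`), feature sizes, and the use of a straight-through Gumbel-softmax to keep the hard keep/skip decision differentiable are all illustrative assumptions consistent with the description above.

```python
# Hypothetical sketch of modality-level gating for dynamic RGB-thermal fusion.
# NOT the authors' EDFNet code; names, sizes, and the Gumbel-softmax gate are
# assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityGate(nn.Module):
    """Predicts a binary keep/skip decision per modality from pooled features."""
    def __init__(self, channels: int, num_modalities: int = 2, tau: float = 1.0):
        super().__init__()
        self.tau = tau
        # One small linear head per modality, each emitting [skip, keep] logits.
        self.heads = nn.ModuleList(
            nn.Linear(num_modalities * channels, 2) for _ in range(num_modalities)
        )

    def forward(self, feats: list[torch.Tensor]) -> torch.Tensor:
        # feats: list of (B, C, H, W) feature maps, one per modality.
        pooled = torch.cat([f.mean(dim=(2, 3)) for f in feats], dim=1)  # (B, M*C)
        gates = []
        for head in self.heads:
            logits = head(pooled)
            if self.training:
                # Differentiable hard decision via straight-through Gumbel-softmax.
                g = F.gumbel_softmax(logits, tau=self.tau, hard=True)[:, 1]
            else:
                g = logits.argmax(dim=1).float()  # hard decision at inference
            gates.append(g)
        return torch.stack(gates, dim=1)  # (B, M) binary gates


class DynamicFusion(nn.Module):
    """Fuses RGB and thermal features, weighted by the predicted gates."""
    def __init__(self, channels: int):
        super().__init__()
        self.gate = ModalityGate(channels)
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, rgb: torch.Tensor, thermal: torch.Tensor) -> torch.Tensor:
        g = self.gate([rgb, thermal])  # (B, 2)
        g_rgb = g[:, 0].view(-1, 1, 1, 1)
        g_thm = g[:, 1].view(-1, 1, 1, 1)
        # A skipped branch contributes zeros; in a real system its backbone
        # computation could be skipped entirely, which is where the savings come from.
        fused = torch.cat([rgb * g_rgb, thermal * g_thm], dim=1)
        return self.fuse(fused)


if __name__ == "__main__":
    rgb = torch.randn(4, 256, 32, 32)
    thermal = torch.randn(4, 256, 32, 32)
    out = DynamicFusion(256)(rgb, thermal)
    print(out.shape)  # torch.Size([4, 256, 32, 32])
```

In a sketch like this, the computational saving at inference comes from evaluating the gate on cheap pooled features first and only running the backbone branches whose gate is 1; the fusion above shows the decision mechanism rather than that scheduling logic.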