As the core technology of artificial intelligence, salient object detection (SOD) is an important approach to improve the analysis efficiency of remote sensing images by intelligently identifying key areas in images. However, existing methods that rely on a single strategy, convolution or Transformer, exhibit certain limitations in complex remote sensing scenarios. Therefore, we developed a Dual-Stream Feature Collaboration Perception Network (DCPNet) to enable the collaborative work and feature complementation of Transformer and CNN. First, we adopted a dual-branch feature extractor with strong local bias and long-range dependence characteristics to perform multi-scale feature extraction from remote sensing images. Then, we presented a Multi-path Complementary-aware Interaction Module (MCIM) to refine and fuse the feature representations of salient targets from the global and local branches, achieving fine-grained fusion and interactive alignment of dual-branch features. Finally, we proposed a Feature Weighting Balance Module (FWBM) to balance global and local features, preventing the model from overemphasizing global information at the expense of local details or from inadequately mining global cues due to excessive focus on local information. Extensive experiments on the EORSSD and ORSSD datasets demonstrated that DCPNet outperformed the current 19 state-of-the-art methods.