Most existing RGB-D salient object detection (SOD) algorithms suffer from significant information redundancy when fusing depth information across modalities, owing to low-quality and ambiguous depth maps and the difficulty of discriminating complex scenes, which ultimately yields poor-quality saliency maps. This article proposes a Multiple-Attention Refinement Network (MARNet) to address insufficient cross-modal fusion and poor depth-image quality in RGB-D SOD. MARNet adopts an end-to-end structure and fuses cross-modal features through multiple-attention refinement and mutual cross-attention. In particular, this article designs an Attention Interaction Module (AIM), which uses multiple attention and cross-attention to refine and fuse the two modalities, reducing the information redundancy generated during cross-modal interaction and suppressing background noise. It also designs a Multi-Scale Compensation Module (MSCM) that guides multi-scale feature fusion step by step, integrating the local and global contexts of multi-scale features. Extensive experiments on five publicly available datasets demonstrate that MARNet has significant advantages over 16 state-of-the-art RGB-D methods. The code is available at https://github.com/wzxxmj/MARNet.
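The abstract does not detail AIM's internals; purely as an illustration of the mutual cross-attention idea it describes (each modality attending to the other before fusion), the following is a minimal PyTorch sketch. The module name, head count, and tensor shapes are hypothetical assumptions, not taken from MARNet; consult the linked repository for the authors' actual implementation.

```python
# Minimal sketch (NOT the authors' implementation) of mutual cross-attention
# fusion between RGB and depth features of shape (B, C, H, W).
# All names and dimensions here are illustrative assumptions.
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        # Each modality attends to the other: queries come from one
        # stream, keys/values from the other.
        self.rgb_from_depth = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.depth_from_rgb = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor) -> torch.Tensor:
        b, c, h, w = rgb.shape
        # Flatten spatial dimensions into token sequences: (B, H*W, C).
        r = rgb.flatten(2).transpose(1, 2)
        d = depth.flatten(2).transpose(1, 2)
        # RGB queries attend over depth tokens, and vice versa.
        r_ref, _ = self.rgb_from_depth(r, d, d)
        d_ref, _ = self.depth_from_rgb(d, r, r)
        # Restore (B, C, H, W) and fuse with a 1x1 convolution.
        r_ref = r_ref.transpose(1, 2).reshape(b, c, h, w)
        d_ref = d_ref.transpose(1, 2).reshape(b, c, h, w)
        return self.fuse(torch.cat([r_ref, d_ref], dim=1))

# Usage on hypothetical backbone features:
rgb_feat = torch.randn(2, 64, 32, 32)
depth_feat = torch.randn(2, 64, 32, 32)
fused = CrossAttentionFusion(64)(rgb_feat, depth_feat)
print(fused.shape)  # torch.Size([2, 64, 32, 32])
```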