In this work, we propose a novel framework for camouflaged object detection (COD), named D²C-Net, which contains two new modules: dual-branch feature extraction (DFE) and gradually refined cross fusion (GRCF). Specifically, the DFE simulates the two-stage detection process of the human visual mechanism when observing camouflaged scenes. In the first stage, dense concatenation is employed to aggregate multi-level features and expand the receptive field. The first-stage feature maps are then used to extract two-direction guidance information, which benefits the second stage. The GRCF consists of a self-refine attention unit and a cross-refinement unit, which together combine peer-layer features and DFE features for improved COD performance. The proposed framework outperforms 13 state-of-the-art deep learning-based methods on three public datasets in terms of five widely used metrics. Finally, we show evidence for successful applications of the proposed method in surface defect detection and medical image segmentation.
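To make the first-stage idea concrete, the sketch below shows one plausible way to densely concatenate multi-level backbone features and enlarge the receptive field with a dilated fusion convolution. It is a minimal illustration only: the class name `DenseAggregation`, the channel counts, the bilinear upsampling, and the dilation setting are assumptions for exposition and are not taken from the paper's actual DFE implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseAggregation(nn.Module):
    """Illustrative dense concatenation of multi-level features (assumed design)."""

    def __init__(self, in_channels=(256, 512, 1024), out_channels=64):
        super().__init__()
        # Reduce each backbone level to a common channel width.
        self.reduce = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels
        )
        # Fuse the densely concatenated levels; dilation enlarges the receptive field.
        self.fuse = nn.Conv2d(out_channels * len(in_channels), out_channels,
                              kernel_size=3, padding=2, dilation=2)

    def forward(self, feats):
        # feats: multi-level maps from shallow to deep, with decreasing resolution.
        target_size = feats[0].shape[-2:]
        aligned = [
            F.interpolate(r(f), size=target_size, mode="bilinear", align_corners=False)
            for r, f in zip(self.reduce, feats)
        ]
        # Dense concatenation across levels followed by a dilated fusion convolution.
        return F.relu(self.fuse(torch.cat(aligned, dim=1)))

if __name__ == "__main__":
    # Toy feature pyramid standing in for backbone outputs.
    feats = [torch.randn(1, 256, 88, 88),
             torch.randn(1, 512, 44, 44),
             torch.randn(1, 1024, 22, 22)]
    print(DenseAggregation()(feats).shape)  # torch.Size([1, 64, 88, 88])
```

In this reading, the aggregated first-stage map would then supply the two-direction guidance consumed by the second stage, though the exact guidance mechanism is specified in the paper itself rather than here.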