Orbital bones comprise thick cortical bone with high intensity values and thin bone with low intensity values, which makes consistent segmentation difficult. In particular, the medial wall and the orbital floor are thin, so the partial volume effect makes their intensity values hard to distinguish from those of the surrounding tissue. In this paper, we propose MSDA-Net, which improves segmentation performance in facial CT images by accounting for the anatomical structure of orbital bones of varying thickness and for the characteristics of thin bones, which have low intensity values and occupy small areas. A multi-scale module and a dual attention module, which applies channel attention followed by spatial attention, are added to the skip connections of U-Net so that feature maps emphasizing the relevant features are passed to the decoder. We report experiments that evaluate the effect of the multi-scale hierarchical module, single attention, and dual attention on segmentation performance. With the proposed method, the Dice similarity coefficient (DSC) reaches 92%, 86%, and 87% for the global and regional evaluation regions, respectively.
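For readers unfamiliar with the reported metric, the following is a minimal sketch of the standard Dice similarity coefficient, DSC = 2|A ∩ B| / (|A| + |B|), computed on two binary masks flattened to sequences of 0/1 values. The function name `dice_coefficient`, the flat-sequence input format, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; they are not taken from the paper's implementation.

```python
from typing import Iterable


def dice_coefficient(pred: Iterable[int], target: Iterable[int]) -> float:
    """Dice similarity coefficient between two flattened binary masks."""
    pred = [bool(p) for p in pred]
    target = [bool(t) for t in target]
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")

    # |A ∩ B|: voxels labeled as bone in both the prediction and the reference
    intersection = sum(p and t for p, t in zip(pred, target))
    # |A| + |B|: total foreground voxels across both masks
    total = sum(pred) + sum(target)

    if total == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


# Toy example: 2 overlapping voxels, 3 foreground voxels in each mask.
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(pred, truth)  # 2 * 2 / (3 + 3)
```

A regional score of the kind mentioned above could presumably be obtained by restricting both masks to the voxels of one evaluation region before calling the function, although the paper's exact region definitions are not given here.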