With the advent of the multimedia era, identifying sensitive information in the social data of online social network users has become critical to maintaining the security of network communities. Traditional sensitive information identification techniques for online social networks cannot capture the full semantic knowledge of multimodal data, nor can they learn the cross-modal information shared between modalities. There is therefore an urgent need for a multimodal deep learning model that takes semantic relationships into account. This paper presents an improved multimodal dual-channel reasoning mechanism (MDR) that, building on multimodal data fusion, deeply mines the semantic information and implicit associations between modalities. In addition, we propose a multimodal adaptive spatial attention mechanism (MAA) to improve the accuracy and flexibility of the decoder. We manually annotated the real social data of 50 users to train and test our model. The experimental results show that the proposed method significantly outperforms simple multimodal-fusion deep learning models in both sensitive information prediction accuracy and adaptability, verifying the feasibility and effectiveness of semantically aware multimodal deep models for identifying sensitive information in social networks.
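
The abstract does not specify MAA's exact formulation. The following is a minimal sketch, assuming a standard gated ("adaptive") spatial attention over image region features conditioned on the decoder's hidden state; the module name, layer sizes, and the sigmoid gate are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: the exact MAA design is not given in the abstract,
# so every layer, dimension, and the gating choice below is an assumption.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdaptiveSpatialAttention(nn.Module):
    """Attend over image region features, gated by the decoder state.

    The adaptive gate lets the decoder down-weight visual evidence when the
    textual context alone is sufficient, which is one common way to add
    flexibility to a multimodal decoder.
    """

    def __init__(self, region_dim: int, hidden_dim: int, attn_dim: int = 256):
        super().__init__()
        self.proj_region = nn.Linear(region_dim, attn_dim)
        self.proj_hidden = nn.Linear(hidden_dim, attn_dim)
        self.score = nn.Linear(attn_dim, 1)
        self.gate = nn.Linear(hidden_dim, 1)  # adaptive weight on the visual context

    def forward(self, regions: torch.Tensor, hidden: torch.Tensor):
        # regions: (batch, num_regions, region_dim); hidden: (batch, hidden_dim)
        e = self.score(torch.tanh(
            self.proj_region(regions) + self.proj_hidden(hidden).unsqueeze(1)
        )).squeeze(-1)                                            # (batch, num_regions)
        alpha = F.softmax(e, dim=-1)                              # spatial attention weights
        visual_ctx = (alpha.unsqueeze(-1) * regions).sum(dim=1)   # (batch, region_dim)
        beta = torch.sigmoid(self.gate(hidden))                   # how much to trust vision
        return beta * visual_ctx, alpha


if __name__ == "__main__":
    attn = AdaptiveSpatialAttention(region_dim=512, hidden_dim=256)
    ctx, weights = attn(torch.randn(4, 49, 512), torch.randn(4, 256))
    print(ctx.shape, weights.shape)  # torch.Size([4, 512]) torch.Size([4, 49])
```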