Abstract-This paper explores the dynamic information flow among the brain regions involved in perceiving the semantic relationship between a scene and the sound of an object within that scene. Fifteen healthy volunteers viewed scene pictures from 4 categories (32 subclasses) while listening to sounds from 8 categories (64 kinds) corresponding to the scenes, and functional magnetic resonance imaging (fMRI) data were collected and analyzed. Dynamic causal modeling (DCM), as implemented in SPM8, was used to analyze the connectivity among the brain regions engaged by the different experimental tasks. Six candidate models were constructed, and the optimal model was identified by Bayesian model selection. In the optimal model, information flow between the parahippocampal place area (PPA) and the lateral occipital complex (LO) was reciprocal and was modulated by the scene condition, whereas the sound condition influenced activity in the superior temporal sulcus (STS). These results indicate that information flow among these regions is selective for either the object sound or the scene, and that this selectivity is essential for the brain's processing of the semantic relationship between a perceived sound and a scene.
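As an illustration of the model comparison step, the following is a minimal sketch of fixed-effects Bayesian model selection computed from per-subject log model evidences. It is not the SPM8 implementation, and the abstract does not state whether a fixed-effects or random-effects variant was used. The function name `fixed_effects_bms` and the toy log-evidence values are hypothetical and are not data from the study.

```python
import math


def fixed_effects_bms(log_evidence):
    """Posterior model probabilities under a uniform prior over models.

    log_evidence: list of per-subject lists, one log evidence per model
    (hypothetical input layout; in practice these would come from DCM fits).
    """
    n_models = len(log_evidence[0])
    # Fixed effects: subjects are assumed to share one model, so their
    # log evidences add up for each model.
    group = [sum(subject[m] for subject in log_evidence) for m in range(n_models)]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(group)
    weights = [math.exp(g - peak) for g in group]
    total = sum(weights)
    return [w / total for w in weights]


# Toy values for two subjects and three models (placeholders, not study data).
toy_log_evidence = [
    [-120.0, -118.5, -121.2],
    [-98.3, -97.1, -99.0],
]
posterior = fixed_effects_bms(toy_log_evidence)
best_model = max(range(len(posterior)), key=lambda m: posterior[m])
print("posterior:", posterior, "best model index:", best_model)
```

With six candidate models and fifteen subjects, the input would be a 15 x 6 table of log evidences, and the model with the highest posterior probability would be taken as optimal.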