Recent studies on surveillance systems have employed various sensors to recognize and understand outdoor environments. In complex outdoor environments, sensor data acquired under all weather conditions, both day and night, can be applied to robots operating in real environments. Autonomous surveillance therefore requires a sensor system that can acquire diverse types of sensor data and can be easily mounted on fixed and mobile agents. In this study, we propose a method that integrates multiple vision and sound sensors into a single module, extracts data synchronized with a 3D LiDAR sensor, and matches them to collect data from various outdoor environments. The proposed multimodal sensor module can acquire six types of images: RGB, thermal, night vision, depth, fast RGB, and IR. Using the proposed module together with a 3D LiDAR sensor, multimodal sensor data were collected from fixed and mobile agents and tested for more than four years. To further demonstrate its usefulness, the module was deployed for six months to monitor anomalies occurring at a given site. In the future, we expect that the data obtained from such multimodal sensor systems can be used for various applications in outdoor environments.
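
The synchronization and matching step described above can be pictured as nearest-timestamp association between each camera frame and the closest 3D LiDAR scan. The following is a minimal sketch under assumed conditions (software timestamps, a 10 Hz LiDAR, and a 50 ms tolerance); the names `Frame` and `match_to_lidar` are illustrative, not the authors' implementation.

```python
# Minimal sketch: match multimodal camera frames to the nearest 3D LiDAR scan
# by timestamp. All names, rates, and the tolerance value are assumptions.
from dataclasses import dataclass
from bisect import bisect_left
from typing import List, Optional


@dataclass
class Frame:
    timestamp: float   # acquisition time in seconds
    modality: str      # e.g. "rgb", "thermal", "night", "depth", "fast_rgb", "ir"
    data: object       # image payload (placeholder)


def match_to_lidar(frame: Frame,
                   lidar_timestamps: List[float],
                   tolerance: float = 0.05) -> Optional[int]:
    """Return the index of the LiDAR scan closest in time to the frame,
    or None if no scan falls within the tolerance window."""
    i = bisect_left(lidar_timestamps, frame.timestamp)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(lidar_timestamps)]
    if not candidates:
        return None
    best = min(candidates, key=lambda j: abs(lidar_timestamps[j] - frame.timestamp))
    return best if abs(lidar_timestamps[best] - frame.timestamp) <= tolerance else None


if __name__ == "__main__":
    lidar_times = [0.00, 0.10, 0.20, 0.30]        # assumed 10 Hz LiDAR scans
    rgb = Frame(timestamp=0.208, modality="rgb", data=None)
    print(match_to_lidar(rgb, lidar_times))       # -> 2 (scan at t = 0.20 s)
```

In practice, the matched frame-scan pairs from all six image modalities would then be stored together, giving each LiDAR scan a set of co-registered images for downstream use.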