In current intelligent manufacturing scenarios, single-modality perception limits the accuracy and safety of human-robot collaboration and cannot meet the assembly requirements of robots in complex industrial environments. To address these problems, this paper proposes a Human-Robot Collaboration (HRC) framework for a dual-robot intelligent assembly system based on multimodal perception. The framework aims to provide multimodal information interaction for robot control through gesture perception, speech perception, human body perception, and visual perception. To verify its effectiveness, the framework was applied to the assembly of an integrally shrouded blade-rotor system. Experiments show that the multimodal-perception dual-robot intelligent assembly system better realizes HRC assembly of the integrally shrouded blades and improves the intelligence of the assembly process. The results indicate that the framework enables robots to flexibly and accurately perceive information among people, robots, and objects in order to complete intelligent assembly, and that multimodal information interaction effectively improves the operational robustness of the dual-robot intelligent assembly system in complex industrial environments and complex tasks.