At present, rescue firefighting relies mainly on firefighting robots, and robots with perception and decision-making capabilities are key to achieving intelligent firefighting. However, traditional firefighting robots often lack autonomous perception and decision-making when extinguishing multiple fire sources, which lowers rescue efficiency and increases the risk to rescue personnel; making firefighting decisions in extreme fire scenes is especially challenging. To handle firefighting tasks effectively and ensure operational efficiency, we propose a robot firefighting decision-making method based on drone visual guidance. First, we introduce a novel Attention and Scale U-Net (ASUNet) model to accurately capture crucial target information in a fire scene, including fire source location and size. The ASUNet model adopts a multiscale feature-fusion strategy and an attention mechanism to enhance segmentation performance. Subsequently, we cluster the fire pixels segmented by ASUNet and apply a genetic optimization algorithm to the resulting fire sources, yielding the robot's firefighting decisions and guiding it to carry out firefighting operations systematically. Finally, numerical experiments verify the effectiveness and superiority of the proposed ASUNet model, which perceives and extracts important fire-scene information well; the improved genetic optimization further accelerates algorithm convergence. To our knowledge, this study is the first to use drone-based monocular vision guidance for firefighting decision-making, offering significant engineering value.
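The abstract does not specify ASUNet's internals; a minimal PyTorch sketch of the two named ingredients, an attention gate on the skip connections and multiscale feature fusion in the decoder, might look as follows. All module names, channel sizes, and design choices here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionGate(nn.Module):
    """Additive attention gate that reweights encoder skip features using
    the coarser decoder (gating) features. This is a common attention
    U-Net design; ASUNet's actual gate may differ."""
    def __init__(self, skip_ch, gate_ch, inter_ch):
        super().__init__()
        self.w_skip = nn.Conv2d(skip_ch, inter_ch, kernel_size=1)
        self.w_gate = nn.Conv2d(gate_ch, inter_ch, kernel_size=1)
        self.psi = nn.Conv2d(inter_ch, 1, kernel_size=1)

    def forward(self, skip, gate):
        # Upsample the gating signal to the skip connection's resolution.
        gate = F.interpolate(gate, size=skip.shape[2:], mode="bilinear",
                             align_corners=False)
        attn = torch.sigmoid(self.psi(F.relu(self.w_skip(skip) + self.w_gate(gate))))
        return skip * attn  # suppress irrelevant background activations

class MultiscaleFusion(nn.Module):
    """Fuses decoder features from several scales by resizing them to the
    finest resolution and mixing with a 1x1 convolution (one plausible
    reading of the abstract's 'multiscale fusion strategy')."""
    def __init__(self, in_chs, out_ch):
        super().__init__()
        self.proj = nn.Conv2d(sum(in_chs), out_ch, kernel_size=1)

    def forward(self, feats):
        # feats: list of feature maps, finest resolution first.
        target = feats[0].shape[2:]
        resized = [feats[0]] + [
            F.interpolate(f, size=target, mode="bilinear", align_corners=False)
            for f in feats[1:]
        ]
        return self.proj(torch.cat(resized, dim=1))
```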
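Likewise, the decision step is only named, not specified. One plausible reading of "pixel segmentation clustering plus genetic optimization" is to cluster the segmented fire pixels into discrete sources and then evolve a visiting order for the robot that minimizes total travel distance. The sketch below assumes KMeans clustering and a simple permutation GA with order crossover and swap mutation; the fitness function, operators, and parameters are all assumptions, not the paper's method.

```python
import numpy as np
from sklearn.cluster import KMeans

def fire_targets(mask: np.ndarray, n_sources: int) -> np.ndarray:
    """Cluster segmented fire pixels (mask > 0) into n_sources centroids.
    Assumes the mask contains at least n_sources fire pixels."""
    pts = np.argwhere(mask > 0).astype(float)
    return KMeans(n_clusters=n_sources, n_init=10).fit(pts).cluster_centers_

def route_length(order, targets, start):
    """Total path length: robot start -> targets in the given order."""
    path = np.vstack([start, targets[order]])
    return np.linalg.norm(np.diff(path, axis=0), axis=1).sum()

def ox_crossover(p1, p2, rng):
    """Order crossover (OX) variant: copy a slice of p1, fill the
    remaining positions with p2's genes in order."""
    n = len(p1)
    i, j = sorted(rng.choice(n, 2, replace=False))
    child = -np.ones(n, dtype=int)
    child[i:j] = p1[i:j]
    fill = [g for g in p2 if g not in child[i:j]]
    child[child < 0] = fill
    return child

def ga_route(targets, start, pop=60, gens=200, rng=np.random.default_rng(0)):
    """Toy genetic algorithm over permutations of target indices."""
    n = len(targets)
    population = [rng.permutation(n) for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=lambda o: route_length(o, targets, start))
        elite = population[: pop // 4]  # keep the best quarter
        children = []
        while len(elite) + len(children) < pop:
            a, b = rng.choice(len(elite), 2, replace=False)
            child = ox_crossover(elite[a], elite[b], rng)
            if rng.random() < 0.2:  # swap mutation
                i, j = rng.choice(n, 2, replace=False)
                child[i], child[j] = child[j], child[i]
            children.append(child)
        population = elite + children
    return min(population, key=lambda o: route_length(o, targets, start))
```

In practice the number of fire sources and the mapping from image pixels to ground coordinates would come from the drone's monocular viewing geometry, which this sketch leaves out.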