This article proposes a new behavioral modeling method for ultra-broadband millimeter-wave power amplifiers (mmWave PAs) based on augmented long short-term memory (ALSTM) networks. The basic theory and modeling procedure of the technique are presented. Several optimization algorithms are tested as the training core of the model, and the best-performing one is chosen. Tradeoffs are then weighed to fix the topology of the proposed model. To validate the method, a 4-carrier 320 MHz modulated signal is used to excite an mmWave PA with a center frequency of 41 GHz. Experimental results show that the proposed ALSTM model offers better predictive capability than traditional models and other existing machine-learning modeling techniques.