This paper proposes a neural network predistorter based on the bidirectional long short-term memory (BiLSTM) structure. The proposed predistorter was trained so that it captures the full intrinsic behavior of the device under test, including its memory effects and nonlinear distortions. For this purpose, the device under test was characterized while operating at its peak power level with a test signal that emulates strong memory effects. Extensive experimental validation carried out on a commercial gallium nitride power amplifier prototype demonstrated the ability of the proposed predistorter to maintain a standard-compliant adjacent channel leakage ratio over a wide range of operating conditions, including average operating power, signal bandwidth, and carrier configuration. A digital predistorter (DPD) derived from a single training condition was able to linearize the device under test under 72 different test conditions, with signal bandwidths between 10 MHz and 40 MHz and an operating power range of 5 dB. Furthermore, benchmarking results showed that the BiLSTM DPD is unable to maintain satisfactory performance when trained with a sub-optimal signal that does not emulate the full behavior of the device under test. Moreover, using the optimal characterization signal with a generalized memory polynomial predistorter does not lead to satisfactory performance either. Hence, the resilience of the predistorter results from combining a suitable model structure with an appropriate training approach. Such a resilient DPD represents a paradigm shift in predistortion techniques and significantly reduces the need for updates. It is anticipated that this work will pave the way for a new generation of DPDs that are resilient to a wide range of operating conditions.
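For readers unfamiliar with the polynomial baseline mentioned above, the following is a minimal sketch of a plain memory polynomial applied to a complex baseband sequence. It is an illustration under stated assumptions, not the implementation benchmarked in this work: it omits the cross terms that distinguish the generalized memory polynomial, and the function name `memory_polynomial`, the coefficient layout `coeffs[m][k]`, and the zero-padding of past samples are all hypothetical choices made for this example.

```python
def memory_polynomial(x, coeffs):
    """Apply a (non-generalized) memory polynomial to a complex sequence.

    Assumed layout (illustrative only): coeffs[m][k] weights the term
    x[n - m] * |x[n - m]|**k, so len(coeffs) is the memory depth plus one
    and len(coeffs[m]) is the number of envelope powers.
    Samples before the start of x are treated as zero.
    """
    y = []
    for n in range(len(x)):
        acc = 0j
        for m, row in enumerate(coeffs):
            if n - m < 0:
                break  # no earlier samples available; treat them as zero
            sample = x[n - m]
            envelope = abs(sample)
            for k, a in enumerate(row):
                acc += a * sample * envelope ** k
        y.append(acc)
    return y


# A single linear tap with unit gain leaves the signal unchanged.
signal = [1 + 1j, 0.5 - 0.25j, -2j]
unchanged = memory_polynomial(signal, [[1.0]])
```

In a predistortion setting, the coefficients of such a model are typically fitted by least squares on measured input and output data. A recurrent model such as a BiLSTM instead learns the memory and nonlinearity jointly from the training signal.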