Power system dynamic security assessment (DSA) has long been essential for protecting the system against cascading failures and wide‐spread blackouts. Machine learning (ML) based data‐driven strategies are promising because of their real‐time computation speed and knowledge‐discovery capacity. However, ML algorithms are vulnerable to well‐designed malicious input samples that can induce wrong outputs. Adversarial attacks are therefore implemented to measure the vulnerability of the trained ML models. Specifically, attack targets are identified through interpretation analysis: perturbations are assigned to the data features with large SHAP values. The merit of the proposed method is that an instance‐based DSA framework is established with interpretation of the ML models, in which effective adversarial attacks and their mitigation countermeasures are developed by perturbing the features with high importance. The generated adversarial examples are then employed for adversarial training and mitigation. Simulation results show that model accuracy and robustness vary with the number of adversarial examples used, and that there is not necessarily a trade‐off between the two indicators. Furthermore, the attack success rate increases when a larger perturbation bound is permitted. The method clearly characterizes the correlation between model accuracy and robustness, which provides considerable assistance in decision making.
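The importance‐guided attack described above can be sketched as a toy example. The sketch below is illustrative only and makes several simplifying assumptions not taken from the paper: a logistic‐regression classifier stands in for the DSA model, the mean of |w_j · x_j| stands in for SHAP feature importance, and an FGSM‐style gradient‐sign step (restricted to the top‐k important features) stands in for the attack.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "DSA" dataset: secure (1) vs. insecure (0), 8 input features,
# of which only the first 3 actually influence the label.
n, d = 400, 8
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:3] = [2.0, -1.5, 1.0]
y = (X @ w_true + 0.1 * rng.normal(size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Train a logistic-regression classifier by plain gradient descent.
w = np.zeros(d)
for _ in range(500):
    p = sigmoid(X @ w)
    w -= 0.1 * X.T @ (p - y) / n

# Stand-in for SHAP values: mean |w_j * x_j| as a feature-importance score.
importance = np.mean(np.abs(X * w), axis=0)
top_k = np.argsort(importance)[-3:]  # perturb only the k most important features

def attack(x, label, eps=0.5):
    """FGSM-style perturbation, masked to the high-importance features."""
    grad = (sigmoid(x @ w) - label) * w  # gradient of the log-loss w.r.t. x
    delta = np.zeros(d)
    delta[top_k] = eps * np.sign(grad[top_k])
    return x + delta

# Measure the attack success rate on samples the model classified correctly.
correct = ((sigmoid(X @ w) > 0.5).astype(float) == y)
X_adv = np.array([attack(x, lab) for x, lab in zip(X, y)])
flipped = ((sigmoid(X_adv @ w) > 0.5).astype(float) != y) & correct
print(f"attack success rate: {flipped.mean():.2f}")
```

Widening `eps` in this sketch raises the fraction of flipped predictions, mirroring the abstract's observation that a larger permitted perturbation bound increases the attack success rate; the resulting `X_adv` could likewise be folded back into training to emulate the adversarial‐training step.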