Context. The vulnerability of image classification algorithms to destructive perturbations has not yet been definitively resolved and remains highly relevant for safety-critical applications. Therefore, the object of research is the process of training and inference for an image classifier that functions under the influence of destructive perturbations. The subjects of the research are the model architecture and training algorithm of an image classifier that provide resilience to adversarial attacks, fault injection attacks, and concept drift.
Objective. The research goal is to develop an effective model architecture and training algorithm that provide resilience to adversarial attacks, fault injections, and concept drift.
Method. A new training algorithm is proposed that combines self-knowledge distillation, information measure maximization, class distribution compactness and interclass gap maximization, data compression based on discretization of the feature representation, and semi-supervised learning based on consistency regularization.
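As an illustration of the consistency regularization component only, the following minimal sketch penalizes disagreement between the class distributions predicted for two augmented views of the same unlabeled image. The mean squared difference used here is an assumption; the exact loss form is not specified above, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def consistency_loss(logits_a, logits_b):
    """Mean squared difference between the class distributions predicted
    for two augmented views of one unlabeled example.

    Assumption: squared error is only one common choice of divergence.
    """
    if len(logits_a) != len(logits_b):
        raise ValueError("both views must have the same number of classes")
    p, q = softmax(logits_a), softmax(logits_b)
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q)) / len(p)
```

The loss is zero when both views yield the same distribution and grows as the predictions diverge, so minimizing it on unlabeled data encourages predictions that are stable under augmentation.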
Results. The model architecture and training algorithm of the image classifier were developed. The obtained classifier was tested on the CIFAR-10 dataset to evaluate its resilience over an interval of 200 mini-batches, with training and test mini-batch sizes of 128 examples, under the following perturbations: adversarial black-box L∞-attacks with perturbation levels of 1, 3, 5, and 10; inversion of one randomly selected bit in a tensor for 10%, 30%, 50%, and 60% of randomly selected tensors; addition of one new class; and real concept drift between a pair of classes. The effect of the feature space dimensionality on the information criterion of model performance without perturbations and on the integral metric of resilience under perturbations is also considered.
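The fault injection scenario (one inverted bit in a chosen fraction of tensors) can be sketched as follows. This is a minimal illustration under stated assumptions: weights are stored as 32-bit IEEE 754 floats, tensors are represented as flat Python lists, and the helper names `flip_bit_float32` and `inject_faults` are hypothetical rather than taken from the work itself.

```python
import random
import struct


def flip_bit_float32(value, bit):
    """Return `value` with one bit of its float32 encoding inverted."""
    if not 0 <= bit < 32:
        raise ValueError("bit must be in [0, 31]")
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    # Invert the selected bit and reinterpret the result as float32.
    (flipped,) = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))
    return flipped


def inject_faults(tensors, fraction, rng):
    """Flip one random bit in one random element of a random subset of tensors.

    tensors:  list of flat lists of floats (stand-in for weight tensors)
    fraction: share of tensors to corrupt, e.g. 0.1 for 10%
    rng:      random.Random instance, for reproducibility
    Returns (corrupted copies, sorted indices of the corrupted tensors).
    """
    corrupted = [list(t) for t in tensors]  # leave the originals untouched
    n_hit = round(fraction * len(corrupted))
    hit = rng.sample(range(len(corrupted)), n_hit)
    for idx in hit:
        pos = rng.randrange(len(corrupted[idx]))
        corrupted[idx][pos] = flip_bit_float32(corrupted[idx][pos], rng.randrange(32))
    return corrupted, sorted(hit)
```

The damage depends strongly on which bit is hit: flipping the sign bit of 1.0 gives -1.0, and flipping the lowest exponent bit gives 0.5, whereas a flip in the low mantissa bits changes the value only marginally.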
Conclusions. The proposed model architecture and training algorithm provide absorption of part of the disturbing influence, graceful degradation due to hierarchical classes and adaptive computation, and fast adaptation on a limited amount of labeled data. It is shown that adaptive computation saves up to 40% of resources through early decision-making in the lower sections of the model, but a perturbing influence slows it down, which can be regarded as graceful degradation. A multi-section structure trained using self-knowledge distillation principles is shown to provide more than a 5% improvement in the integral metric of resilience compared with an architecture in which the decision is made at the last layer of the model. The dimensionality of the feature space noticeably affects resilience to adversarial attacks and can be chosen as a tradeoff between resilience to perturbations and efficiency without perturbations.
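The adaptive computation idea, in which lower sections of the model may decide early, can be illustrated with the sketch below. It assumes a confidence-threshold exit rule and a hypothetical interface in which each section maps features to a pair of updated features and class probabilities; the actual exit criterion is not specified above.

```python
def early_exit_predict(sections, features, threshold=0.9):
    """Run model sections in order and stop at the first confident prediction.

    sections:  list of callables, each mapping features -> (features, probs)
    threshold: assumed confidence level required for an early exit
    Returns (predicted_class, sections_used).
    """
    if not sections:
        raise ValueError("at least one section is required")
    for used, section in enumerate(sections, start=1):
        features, probs = section(features)
        best = max(range(len(probs)), key=probs.__getitem__)
        # Exit early when confident; the last section always decides.
        if probs[best] >= threshold or used == len(sections):
            return best, used
```

Under this rule, clean inputs that are classified confidently leave through a lower section and skip the remaining computation, while perturbed inputs that reduce confidence travel deeper into the model, which matches the slowdown described above.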