Modern aviation security systems rely heavily on human screening operators, who are prone to fatigue, lapses of attention, and similar limitations. Automated methods for recognizing forbidden items exist, but they are hampered by the specific structure of baggage X-ray images. Furthermore, such systems require significant computational resources as model size grows. Both disadvantages can largely be addressed in hardware, through new introscopes and registration techniques as well as more powerful computing devices. On the processing side, however, it is preferable to improve recognition quality without raising the computational requirements of the recognition system. This can be achieved with traditional neural network architectures at the cost of a more complex training process. This study proposes such a training approach: new augmentations for baggage X-ray images and advanced techniques for training convolutional neural networks and vision transformers. It is shown that using the ArcFace loss function for binary classification of items into forbidden and allowed classes provides a gain of about 3–5% for different architectures. In addition, a softmax activation function with temperature yields more flexible estimates of class membership probability: raising the decision threshold significantly increases the accuracy of forbidden item recognition, while lowering it provides high recall. The developed augmentations, based on doubly stochastic image models, increase the recall of dangerous item recognition by 1–2%. The YOLO detector was modified on the basis of the developed classifier, yielding an mAP gain of 0.72%. The results thus meet the goal of increasing the efficiency of X-ray baggage image processing.
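The two classifier-side ideas named above, softmax with temperature followed by a decision threshold and an ArcFace-style angular margin, can be illustrated with a minimal sketch. This is not the implementation used in the study: the function names, the convention that class index 1 means "forbidden", and the values of the temperature, threshold, scale, and margin are all assumptions chosen for illustration, and the ArcFace part omits the handling that practical implementations add for angles where the margin would push the angle past pi.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits divided by a temperature (T > 1 flattens, T < 1 sharpens)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def is_forbidden(logits, temperature, threshold):
    """Flag an item as forbidden when P(forbidden) reaches the threshold.

    Assumption: logits are ordered [allowed, forbidden].
    """
    p_forbidden = softmax_with_temperature(logits, temperature)[1]
    return p_forbidden >= threshold


def arcface_logits(embedding, class_weights, target, scale=30.0, margin=0.5):
    """ArcFace-style logits: scale * cos(theta + margin) for the target class,
    scale * cos(theta) for the others, where theta is the angle between the
    embedding and each class weight vector. Illustrative values for scale/margin.
    """
    norm_e = math.sqrt(sum(x * x for x in embedding))
    logits = []
    for k, weight in enumerate(class_weights):
        norm_w = math.sqrt(sum(x * x for x in weight))
        cos_theta = sum(a * b for a, b in zip(embedding, weight)) / (norm_e * norm_w)
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard acos against rounding
        if k == target:
            # Add the angular margin only to the ground-truth class.
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    return logits


if __name__ == "__main__":
    example_logits = [1.0, 3.0]  # hypothetical [allowed, forbidden] scores
    for t in (1.0, 4.0):
        probs = softmax_with_temperature(example_logits, t)
        print(t, probs, is_forbidden(example_logits, t, threshold=0.7))
    print(arcface_logits([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], target=0))
```

In this sketch, a higher temperature pulls the forbidden-class probability toward 0.5, so the same threshold flags fewer items; moving the threshold up or down then trades fewer false alarms against higher recall, which is the trade-off the abstract describes. The margin makes the target-class logit smaller during training, which forces embeddings of the same class to cluster more tightly.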