In recent years, bone fracture detection and classification have been widely discussed, and many researchers have proposed methods to tackle the problem. Despite this, a universal approach able to classify all the fractures in the human body has not yet been defined. We analyze and evaluate a selection of papers, chosen because their approaches are representative, in which the authors applied different deep learning techniques to classify bone fractures. Our goal is to identify the strengths of each approach and to outline a generalized strategy. Each study is summarized and evaluated using a radar graph with six values: area under the curve (AUC), test accuracy, sensitivity, specificity, dataset size and labelling reliability. In addition, we define the key points that should be taken into account when pursuing this goal, and we compare each study with our baseline. Deep learning, and in particular the convolutional neural network (CNN), has recently achieved results comparable to those of humans in bone fracture classification. Given an appropriate generalization, we are reasonably confident that a correctly designed computer-aided diagnosis (CAD) system would save doctors a considerable amount of time and reduce the number of wrong diagnoses.
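As an illustration of three of the six radar-graph values named above, the following minimal sketch computes test accuracy, sensitivity and specificity from binary predictions. It assumes the label 1 denotes "fracture" and 0 denotes "no fracture"; the function name `binary_metrics` and the example labels are hypothetical and are not taken from any of the reviewed studies.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels.

    Assumption: 1 = fracture (positive class), 0 = no fracture.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fractures found
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed fractures

    total = tp + tn + fp + fn
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / total if total else nan,
        # Sensitivity: fraction of true fractures that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Specificity: fraction of non-fractures correctly identified.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


if __name__ == "__main__":
    # Made-up labels, for illustration only.
    truth = [1, 1, 1, 0, 0, 0, 0, 1]
    preds = [1, 0, 1, 0, 0, 1, 0, 1]
    print(binary_metrics(truth, preds))
```

Reporting sensitivity and specificity alongside accuracy matters when fracture and non-fracture cases are imbalanced, since accuracy alone can hide a high rate of missed fractures.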