“…The imbalance of sample numbers in the dataset gives rise to the challenge of long-tailed visual recognition. Most previous works assume that head classes are always easier to learn than tail classes; accordingly, class re-balancing [8,14,24,34,37,46,52], information augmentation [23,31,35,38,39,44,56,64,67], decoupled training [10,16,29,30,71,76], and ensemble learning [20,36,57,58,61,72,77] have been proposed to improve the performance of tail classes. However, recent studies [3,50] have shown that classification difficulty is not always correlated with the number of samples, e.g., some tail classes achieve even higher performance than the head classes.…”