“…The case of Hamming balls $\{0,1\}^n_{\le k}$, consisting of all vectors with at most $k$ ones, has received some attention. Long and Servedio [21] gave bounds on the weights of PTFs of degree $d = 1$. Their main motivation for studying this setting comes from learning theory: in scenarios that involve learning categorical data, the common representation of examples is the one-hot encoded vector, which may have an extremely large number of features, of which only a small fraction can be active at the same time.…”