Distinguishing boosted objects in hadronic final states requires a combined understanding of robust distributions for signal and background events. Substructure-based approaches can isolate signal events in hadronic channels, but they tend to distort defining features of the background, such as a smoothly falling invariant mass distribution, so that it looks more signal-like. Getting the most out of experimental efforts requires balancing these two competing effects: signal identification and background distortion. In this work, we perform a systematic study of jet tagging methods that aim for this balance. We explore both single-variable and multivariate approaches. The methods preserve the shape of the background distribution by augmenting either the training procedure or the data itself. We consider multiple quantitative metrics for comparing the methods when tagging 2-, 3-, or 4-prong jets against the QCD background. This is the first study to show that the data augmentation techniques of Planing and PCA-based scaling perform comparably to the augmented training techniques of Adversarial NNs and uBoost, while being both easier to implement and computationally cheaper.
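As an illustration of what augmenting "the data itself" can look like, the following is a minimal sketch of a planing-style reweighting. It assumes planing means weighting each event by the inverse of a binned estimate of the jet mass density, so that the weighted mass distribution is flat and carries no discriminating information. The function name `planing_weights`, the bin count, and the toy mass spectrum are hypothetical choices for this example and are not taken from the study.

```python
import numpy as np

def planing_weights(mass, bins=20):
    """Return per-event weights that flatten the binned mass distribution.

    Assumption: weights are the inverse of the histogram count in each
    event's mass bin, rescaled so that the mean weight is 1.
    """
    mass = np.asarray(mass, dtype=float)
    counts, edges = np.histogram(mass, bins=bins)
    # Interior edges map each event to a bin index in [0, bins - 1];
    # the maximum value lands in the last bin, matching np.histogram.
    idx = np.clip(np.digitize(mass, edges[1:-1]), 0, len(counts) - 1)
    weights = 1.0 / counts[idx]  # every event sits in a non-empty bin
    return weights * (len(mass) / weights.sum())

# Toy example: a smoothly falling "background" mass spectrum (hypothetical).
rng = np.random.default_rng(0)
toy_mass = 50.0 + rng.exponential(50.0, size=5000)
toy_weights = planing_weights(toy_mass, bins=20)

# After reweighting, every populated bin has the same weighted content.
flat_hist, _ = np.histogram(toy_mass, bins=20, weights=toy_weights)
```

A classifier trained with such weights sees the same mass spectrum everywhere, which is one way to discourage it from learning the mass. The binning here is deliberately coarse, and a real analysis would need to choose it with care.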