2017
DOI: 10.1007/jhep10(2017)174

Classification without labels: learning from mixed samples in high energy physics

Abstract: Modern machine learning techniques can be used to construct powerful models for difficult collider physics problems. In many applications, however, these models are trained on imperfect simulations due to a lack of truth-level information in the data, which risks the model learning artifacts of the simulation. In this paper, we introduce the paradigm of classification without labels (CWoLa) in which a classifier is trained to distinguish statistical mixtures of classes, which are common in collider physics. Cr…
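
The idea in the abstract, training a classifier to separate two mixed samples rather than labeled signal and background, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the one-dimensional Gaussian feature, the signal fractions 0.8 and 0.2, the logistic model, and the helper names `make_mixed_sample`, `train_logistic`, and `auc`. Truth labels are generated only so the result can be checked afterwards; the training step never sees them.

```python
import math
import random


def make_mixed_sample(n, signal_fraction, rng):
    """Draw n toy events: signal ~ N(+1, 1) with probability signal_fraction,
    otherwise background ~ N(-1, 1). Returns (feature, is_signal) pairs."""
    events = []
    for _ in range(n):
        is_signal = rng.random() < signal_fraction
        x = rng.gauss(1.0 if is_signal else -1.0, 1.0)
        events.append((x, is_signal))
    return events


def train_logistic(xs, ys, epochs=200, lr=0.1):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by full-batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += p - y
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


rng = random.Random(0)

# Two mixed samples with different (assumed) signal fractions.
sample_1 = make_mixed_sample(1000, 0.8, rng)
sample_2 = make_mixed_sample(1000, 0.2, rng)

# Training labels say only which mixed sample an event came from.
xs = [x for x, _ in sample_1 + sample_2]
ys = [1.0] * len(sample_1) + [0.0] * len(sample_2)
w, b = train_logistic(xs, ys)

# Evaluation only: use the toy truth labels to test signal/background separation.
events = sample_1 + sample_2
signal_scores = [w * x + b for x, is_signal in events if is_signal]
background_scores = [w * x + b for x, is_signal in events if not is_signal]
auc_value = auc(signal_scores, background_scores)

print(f"w = {w:.3f}, b = {b:.3f}, signal-vs-background AUC = {auc_value:.3f}")
```

In this toy setup the classifier trained on sample membership also ranks true signal above true background, because the sample enriched in signal sits at larger feature values. Real analyses would use many features and a more flexible model.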

Cited by 205 publications (203 citation statements)
References: 88 publications
“…Part of these approaches use machine learning techniques, which is another direction into which new physics searches at the LHC can expand, as has been also proposed recently in several other contexts (e.g., refs. [34][35][36][37][38][39][40][41][42][43]).…”
Section: Discussion (mentioning)
confidence: 99%
“…For example, this was used in the γγ channel of the recent tth observation, [19,20] and General Search [21][22][23] strategies are from CMS and ATLAS, respectively. LDA stands for Latent Dirichlet Allocation [37,78], ANOmaly detection with Density Estimation (ANODE) is the method presented in this paper, and CWoLa stands for Classification Without Labels [32,33,77]. Direct density estimation is a form of side-banding where the multidimensional feature space density is learned conditional on the resonant feature (see Sec.…”
Section: BSM Sensitivity (mentioning)
confidence: 99%
“…Instead, CWoLa hunting uses neural networks to identify differences between signal regions and neighboring sideband regions. By turning the problem into a supervised learning task [77], CWoLa is able to effectively find rare resonant signals. However, CWoLa hunting has certain requirements on the independence of the discriminating features and the resonant feature.…”
Section: Introduction (mentioning)
confidence: 99%
“…A large effort is made to address this aspect of phenomenology thanks to new data analysis techniques. For example, new interesting results are obtained thanks to the machine learning approach [39,40]. Figure 5 shows the spin weight histograms for the H and X samples.…”
Section: Spin Dependent Characteristics (mentioning)
confidence: 99%