2019
DOI: 10.1051/epjconf/201921403006

Improving efficiency of analysis jobs in CMS

Abstract: Hundreds of physicists analyze data collected by the Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider using the CMS Remote Analysis Builder and the CMS global pool to exploit the resources of the Worldwide LHC Computing Grid. Efficient use of such an extensive and expensive resource is crucial. At the same time, the CMS collaboration is committed to minimizing time to insight for every scientist, by pushing for fewer possible access restrictions to the full data sample and supports the free …

Cited by 3 publications (3 citation statements)
References 8 publications

“…In studies we found that 99% of centralized production jobs finish within their estimated wall clock time, while analysis jobs, which are subject to much more uncertainty due to user code contributions, were found to finish within three hours of their estimated time with 99% confidence. This 3 hour uncertainty in analysis job run time estimation came after many improvements during the year [7]. These improvements allowed the SI group to shorten the retirement time to 4h from 10h, improving scheduling efficiency by several percent, and allowing >99% of jobs to complete within the pilot lifetime.…”
Section: Pilot Improvements (mentioning)
confidence: 99%
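
The "within three hours of their estimated time with 99% confidence" figure quoted above can be read as a quantile of the job overrun distribution. The sketch below illustrates that reading only; it is not the CMS or Submission Infrastructure tooling. The function names `overrun_quantile` and `fraction_within`, and the sample run times, are hypothetical.

```python
import math


def overrun_quantile(estimated_h, actual_h, q=0.99):
    """Smallest margin (hours) such that at least a fraction q of jobs
    finish no later than their estimate plus that margin.

    Hypothetical helper for illustration; inputs are parallel lists of
    estimated and actual wall clock times in hours.
    """
    overruns = sorted(max(0.0, a - e) for e, a in zip(estimated_h, actual_h))
    if not overruns:
        raise ValueError("at least one job is required")
    # Index of the q-quantile; the small epsilon guards against
    # floating-point noise in q * n.
    k = max(0, math.ceil(q * len(overruns) - 1e-9) - 1)
    return overruns[k]


def fraction_within(estimated_h, actual_h, margin_h):
    """Fraction of jobs whose actual run time is at most estimate + margin."""
    pairs = list(zip(estimated_h, actual_h))
    if not pairs:
        raise ValueError("at least one job is required")
    return sum(a <= e + margin_h for e, a in pairs) / len(pairs)


# Illustrative numbers only, not measured CMS data.
estimated = [4.0, 4.0, 4.0, 4.0]
actual = [4.0, 5.0, 6.0, 9.0]
margin = overrun_quantile(estimated, actual, q=0.75)
coverage = fraction_within(estimated, actual, margin_h=3.0)
```

Under this reading, a smaller high-quantile overrun means less slack has to be reserved at the end of a pilot's lifetime, which is consistent with the shortened retirement time the statement describes.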
“…The workload management system executes payloads in compute nodes provisioned through GlideinWMS [3] and made available as execution slots in a Vanilla Universe HTCondor pool [4]. HTCondor jobs are submitted via specific workload management tools: WMAgent for central data processing and Monte Carlo production jobs, and CRAB for user jobs [5]. The data management system is modular and includes several components: PhEDEx, the data transfer and location system; DBS, the Data Bookkeeping Service, a metadata catalog; and DAS, the Data Aggregation Service designed to aggregate views and provide them to users and services [6].…”
Section: Introduction (mentioning)
confidence: 99%
“…The workload management system executes payloads in compute nodes provisioned through GlideinWMS [3] and thus made available as execution slots in a Vanilla Universe HTCondor [4]. HTCondor jobs are submitted via specific workload management tools: WMAgent for central data processing and Monte Carlo production jobs, and CRAB for user jobs [5]. The data management system is modular.…”
Section: Introduction (mentioning)
confidence: 99%