2019
DOI: 10.1051/epjconf/201921404010

Distributed Data Collection for the Next Generation ATLAS EventIndex Project

Abstract: The ATLAS EventIndex currently runs in production in order to build a complete catalogue of events for experiments with large amounts of data. The current approach is to index all final produced data files at CERN Tier0, and at hundreds of grid sites, with a distributed data collection architecture using Object Stores to temporarily maintain the conveyed information, with references to them sent with a Messaging System. The final backend of all the indexed data is a central Hadoop infrastructure at CERN; an Or…
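
The hand-off described in the abstract (payloads parked temporarily in an object store, with only references travelling through the messaging system) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: `InMemoryObjectStore`, `produce`, `consume`, the key prefix, and the `run`/`event` fields are hypothetical stand-ins, and an in-process queue replaces the real messaging system. It is not the EventIndex implementation.

```python
import json
import queue
import uuid


class InMemoryObjectStore:
    """Hypothetical stand-in for an object store: maps keys to byte payloads."""

    def __init__(self):
        self._objects = {}

    def put(self, key, payload):
        self._objects[key] = payload

    def get(self, key):
        return self._objects[key]

    def delete(self, key):
        del self._objects[key]

    def __len__(self):
        return len(self._objects)


def produce(store, messages, events):
    """Producer side: park the payload in the store, then send only a reference."""
    key = f"eventindex/{uuid.uuid4().hex}"  # hypothetical key layout
    store.put(key, json.dumps(events).encode("utf-8"))
    messages.put(json.dumps({"object_key": key, "n_events": len(events)}))
    return key


def consume(store, messages):
    """Consumer side: read a reference, fetch the payload, release the temporary object."""
    ref = json.loads(messages.get_nowait())
    events = json.loads(store.get(ref["object_key"]).decode("utf-8"))
    if len(events) != ref["n_events"]:
        raise ValueError("payload does not match its reference")
    store.delete(ref["object_key"])
    return events


if __name__ == "__main__":
    store = InMemoryObjectStore()
    messages = queue.Queue()  # stand-in for the messaging system
    produce(store, messages, [{"run": 1, "event": 10}, {"run": 1, "event": 11}])
    print(consume(store, messages))
```

Keeping the messages small and moving the bulk data through the store is the design point of this pattern: the message channel carries only a key and a count, so its load does not grow with payload size.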

Cited by 4 publications (4 citation statements)
References 3 publications

“…This work was first presented in CMMSE22, the International Conference on Computational and Mathematical Methods in Science and Engineering, and was granted the prize for "Best computational applications on line presentation" [21,22].…”
Section: Data Availability (mentioning)
confidence: 99%
“…The global architecture supports the independent evolution of the system components, and indeed some of them have already been substantially improved or replaced by new implementations. The Data Collection system was progressively restructured with the replacement of the messaging system ActiveMQ [10] with the CEPH Object Store [11] as the main data transfer mechanism between Grid jobs and the CERN central servers [12]. An EventIndex Supervisor was introduced at the same time to keep track of the data transfers, the validation procedures and the storage in Hadoop.…”
Section: System Design Evolution (mentioning)
confidence: 99%
“…The latest developments are aimed to optimize storage and operational resources, in order to accommodate the higher amount of data produced by ATLAS, which is expected to increase in the future with a prediction of 35 billion new real events per year in Run3, and 100 billion at the HL-LHC. At IFIC we have improved the data collection system [8], and we are currently developing new storage schemas using HBase/Apache Phoenix for the final data backend with promising results.…”
Section: Event Index Project (mentioning)
confidence: 99%