
Outlining the XENONnT Computing Scheme at the 2nd Rucio Community Workshop in Oslo

Oslo welcomed all 66 participants of the second Rucio Community Workshop with pleasant weather and a venue that offered an astonishing view over the capital of Norway.
The open-source and contribution model of the Rucio data management tool is attracting more and more attention from numerous fields. This year, 21 communities reported on the implementation of Rucio in their current data workflows, discussed possible improvements with the Rucio development team, and chatted with each other during the coffee breaks to learn from others' experiences. The communities giving presentations included the DUNE experiment, Belle II and LSST. The XENON Dark Matter Collaboration presented the computing scheme of the upcoming XENONnT experiment. Two keynote talks by Richard Hughes-Jones (University of Maryland) and Gudmund Høst (NeIC) highlighted the concepts of the upcoming generation of academic networks and the Nordic e-Infrastructure Collaboration.

After the successful XENON1T stage, with two major science runs, a world-leading limit on spin-independent Dark Matter interactions with nucleons and further publications, the XENON1T experiment stopped data taking in December 2018. We aim for two major upgrades for the successor stage, XENONnT: a larger time projection chamber (TPC), which holds ~8,000 kg of liquid xenon with 496 PMTs for signal readout, and an additional neutron veto detector based on gadolinium-doped water in our water tank. These require upgrades to our current data management and processing scheme, which was presented last year at the first Rucio Community Workshop. A fundamental change is the new data processor STRAX, which allows much faster data processing. Starting from the recorded raw data, the final data product will be available at distinct intermediate processing stages that depend on each other. Therefore, we will stop using our “classical” data scheme of raw data, processed data and minitrees, and instead aim for a more flexible data structure. Nevertheless, all stages of the data will be distributed with Rucio to the connected grid computing facilities. STRAX will be able to process data from the TPC, the MuonVeto and the NeutronVeto together to allow coincidence analysis.
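
The idea of processing stages that depend on each other can be pictured as a small directed graph. The following is a minimal sketch in Python and not the STRAX API: the data-kind names and the DEPENDS_ON table are assumptions chosen purely for illustration. It uses the standard-library graphlib module (Python 3.9 or later) to find an order in which the stages could be produced, and to list what is needed for a given intermediate stage.

```python
from graphlib import TopologicalSorter

# Hypothetical data kinds (illustration only): each kind maps to the
# kinds it is built from.
DEPENDS_ON = {
    "raw_records": [],
    "records": ["raw_records"],
    "peaks": ["records"],
    "events": ["peaks"],
}


def processing_order(depends_on):
    """Return the data kinds in an order that respects their dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


def stages_needed(target, depends_on):
    """Return every data kind required to build `target`, including itself."""
    needed = set()
    stack = [target]
    while stack:
        kind = stack.pop()
        if kind not in needed:
            needed.add(kind)
            stack.extend(depends_on[kind])
    return needed


if __name__ == "__main__":
    print(processing_order(DEPENDS_ON))
    print(sorted(stages_needed("peaks", DEPENDS_ON)))
```

In such a picture, an analyst who only needs an intermediate stage would not have to wait for, or download, the later ones.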


The data flow of the XENONnT experiment. A first set of data is already processed at LNGS. All data kinds are distributed with Rucio to the analysts.

Reprocessing campaigns are planned ahead with HTCondor and DAGMan jobs at EGI and OSG, similar to the setup of XENON1T. Because of the faster data processor, it becomes necessary to define a well-established read and write routine with Rucio to guarantee quick data access.
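
To give a flavour of what a DAGMan-based campaign looks like, here is a hedged sketch in Python that writes a DAGMan input file. It is not the collaboration's actual workflow: the run IDs, stage names and submit-file names (for example records.sub) are hypothetical. Only the JOB, VARS and PARENT ... CHILD keywords are standard DAGMan syntax. Each run gets one job per stage, and the stages of a run are chained so that a stage starts only after the previous one has finished.

```python
def build_dag(run_ids, stages):
    """Return DAGMan text with one job per (run, stage), chained per run."""
    lines = []
    for run_id in run_ids:
        previous = None
        for stage in stages:
            job = f"{stage}_{run_id}"
            # JOB <name> <submit file>; the submit files are hypothetical.
            lines.append(f"JOB {job} {stage}.sub")
            # Pass the run ID to the submit file as a macro.
            lines.append(f'VARS {job} run_id="{run_id}"')
            if previous is not None:
                # The stage runs only after the previous stage succeeded.
                lines.append(f"PARENT {previous} CHILD {job}")
            previous = job
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical run IDs and stage names, for illustration only.
    print(build_dag(["000101", "000102"], ["records", "peaks"]), end="")
```

A file written this way would then be handed to HTCondor with condor_submit_dag.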
Another major update in the XENONnT computing scheme concerns the tape backup location. Because of the increased number of disk and tape allocations in the Rucio catalogue, we will abandon the Rucio-independent tape backup in Stockholm and use dedicated Rucio storage elements for storing the raw data. The XENON1T experiment collected ~780 TB of (raw) data during its lifetime, all of which is managed by Rucio. The XENON Collaboration is looking forward to continuing this success story with XENONnT.

XENON1T at the first Rucio Community Workshop at CERN

Everything scales up, even the amount of raw data acquired by XENON1T. To handle data transfers easily, the XENON collaboration decided to let the Rucio Scientific Data Management software do all the work. Rucio is developed at CERN and designed to manage scientific data. Its big advantages are data transfers, bookkeeping, easy data access and protection against data loss.

XENON1T takes about one terabyte of raw data per day. The detector is located at the Laboratori Nazionali del Gran Sasso (LNGS) in Italy, and the data need to be shipped out to dedicated computing centers for data reduction and analysis.

Individual Rucio clients access dedicated GRID disk space at computing facilities distributed worldwide. Everything is controlled by a Rucio server, which keeps track of storage locations, data sizes and transfers within the computing infrastructure. Rucio is written in Python, which makes it very simple to distribute.

The first Rucio Community Workshop was held at CERN on the 1st and 2nd of March. Although Rucio was originally developed for the ATLAS collaboration, other experiments such as XENON and AMS started to use it a while ago. Nowadays, more collaborations such as EISCAT 3D, LIGO and NA62 (just to mention a few) have become interested. The workshop allowed everyone to meet: developers and users discussed several use cases and how to improve Rucio for individual collaborations.

The XENON1T data distribution from https://indico.cern.ch/event/676472/contributions/2905755/

The XENON1T data distribution framework

We presented our integration of Rucio into the existing data handling framework. XENON1T raw data are distributed to five computing centers in Europe and the US. Each one is connected to the European Grid Infrastructure (EGI) or the Open Science Grid (OSG) for data reduction (“processing”). Raw data are processed on the GRID, and the reduced data sets are provided to the analysts at the Research Computing Center (RCC) in Chicago. Beyond this, the XENON collaboration will continue to use Rucio for the upcoming XENONnT upgrade.