LHCBIGDATA | Exploiting big data and machine learning techniques for LHC experiments

Summary
Large international scientific collaborations will face unprecedented computing and data challenges in the near future. The analysis of multi-petabyte datasets at CMS, ATLAS, LHCb and ALICE, the four experiments at the Large Hadron Collider (LHC), requires a global federated infrastructure of distributed computing resources. The HL-LHC, the High-Luminosity upgrade of the LHC, is expected to deliver 100 times more data than the LHC, with a corresponding increase in event size, volume and complexity. Modern techniques for big data analytics and machine learning (ML) are needed to cope with such an unprecedented data stream. Critical areas that will strongly benefit from ML are data analysis, detector operation (including calibration and monitoring), and computing operations. The aim of this project is to provide the LHC community with the necessary tools to deploy ML solutions through open cloud technologies such as the INDIGO-DataCloud services. Heterogeneous technologies (systems based on multi-cores, GPUs, etc.) and opportunistic resources will be integrated. The developed tools will be experiment-independent, to promote the exchange of common solutions among the LHC experiments. The benefits of this approach will be demonstrated in a real-world use case: the optimization of computing operations for the CMS experiment. In addition, once available, the tools to deploy ML as a service can easily be transferred to other scientific domains that need to handle large data streams.
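The fact sheet does not detail how ML would be applied to CMS computing operations. Purely as an illustrative sketch of the kind of task involved, the Python snippet below trains a classifier on synthetic, hypothetical transfer-monitoring features to flag grid data transfers at risk of failure. The feature names, the synthetic data generator and the model choice are all assumptions made here for illustration, not the project's actual method.

```python
# Minimal illustrative sketch (not project code): flag grid data transfers
# likely to fail, using synthetic monitoring-style features.
# All features and the failure model below are hypothetical.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(seed=42)
n = 20_000

# Hypothetical per-transfer features: file size (GB), source-site queue length,
# recent failure rate of the link, and hour of day.
file_size_gb = rng.exponential(scale=5.0, size=n)
queue_length = rng.poisson(lam=40, size=n)
link_fail_rate = rng.beta(a=2, b=20, size=n)
hour_of_day = rng.integers(0, 24, size=n)

X = np.column_stack([file_size_gb, queue_length, link_fail_rate, hour_of_day])

# Synthetic ground truth: failures become more likely for large files,
# congested queues and historically flaky links (illustrative only).
logit = -3.0 + 0.08 * file_size_gb + 0.02 * queue_length + 6.0 * link_fail_rate
y = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = GradientBoostingClassifier(random_state=0)
model.fit(X_train, y_train)

scores = model.predict_proba(X_test)[:, 1]
print(f"ROC AUC on held-out transfers: {roc_auc_score(y_test, scores):.3f}")
```

In an ML-as-a-service setting of the kind the project describes, a trained model like this would presumably be packaged and exposed as a service on cloud resources (e.g. via INDIGO-DataCloud components), so that operations teams could query it without managing the training infrastructure themselves; the exact deployment mechanism is not specified in this fact sheet.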
More information & hyperlinks
Web resources: https://cordis.europa.eu/project/id/799062
Start date: 02-07-2018
End date: 01-07-2020
Total budget: EUR 180,277.20 - Public funding: EUR 180,277.00
Cordis data

Status

CLOSED

Call topic

MSCA-IF-2017

Update Date

28-04-2024
Structured mapping
Horizon 2020
H2020-EU.1. EXCELLENT SCIENCE
H2020-EU.1.3. EXCELLENT SCIENCE - Marie Skłodowska-Curie Actions (MSCA)
H2020-EU.1.3.2. Nurturing excellence by means of cross-border and cross-sector mobility
H2020-MSCA-IF-2017
MSCA-IF-2017