Transfer Learning for Jet Tagging in Particle Physics

Jet experiments at CERN

Introduction

Jet tagging is a very active area of research in particle physics. When two particles collide at high energy, the energy released produces showers of secondary particles, each forming a narrow cone of particles traveling in roughly the same direction: a “particle jet”. Jet tagging is the process of using data about the secondary particles (their masses, energies, relative directions and velocities, and many other features) to determine which fundamental particle initiated the jet. At CERN, particle jets are observed in the detectors at truly unmanageable data rates: during collisions, we can expect several petabytes of data per second. It is impossible to save all of these collision events to disk, so much effort is devoted to developing efficient algorithms that tag jets in real time (or at least as fast as possible!) in order to quickly decide which collision events should be saved for further analysis. Recently, these efforts have focused on Deep Learning, which promises to deliver quicker results without significant loss in accuracy compared to classical algorithms. This will be especially useful when the High-Luminosity Large Hadron Collider (HL-LHC) begins operating (estimated 2029), since this upgrade to the current LHC will generate data at even higher rates than we currently observe.

The goal of our project is to expand upon the current state-of-the-art neural network for jet tagging, presented in the paper “Jet Tagging via Particle Clouds” by Qu and Gouskos. This network is very accurate, but it still doesn’t reach the speeds necessary for real-time analysis of the vast data streams produced at the LHC. To improve its speed without losing accuracy, we would like to use Transfer Learning to transfer the knowledge from this “teacher” model, trained on high resolution data (equivalent to the full data stream coming out of the LHC), to a “student” model trained on lower resolution data (equivalent to the reduced data stream that is actually saved to disk). If trained only on the low resolution data, the student model should naturally perform worse than the teacher model trained on high resolution data. However, if the student can learn from the teacher as well as from the low resolution data, the hope is that it will perform comparably and with a significant speed-up. The optimal student network architecture could be identical to the teacher’s or it could be entirely different; we will have to do some research and experiment!
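One common way to let a student learn from both the teacher and the labels is a knowledge-distillation loss with softened teacher outputs. The sketch below is only an illustration of that idea, not necessarily the loss used in this project: the function names, the temperature of 2.0, and the mixing weight `alpha` are all assumptions.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Row-wise softmax of `logits` after dividing by `temperature`."""
    scaled = logits / temperature
    scaled = scaled - scaled.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)


def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend of a hard-label loss and a soft teacher-matching loss.

    alpha weights the cross-entropy against the true labels;
    (1 - alpha) weights the KL divergence between the softened
    teacher and student distributions (hypothetical default values).
    """
    eps = 1e-12
    n = len(labels)

    # Hard term: ordinary cross-entropy against the true class labels.
    p_student = softmax(student_logits)
    hard = -np.log(p_student[np.arange(n), labels] + eps).mean()

    # Soft term: KL(teacher || student) at the chosen temperature.
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    soft = (p_t * (np.log(p_t + eps) - np.log(p_s + eps))).sum(axis=-1).mean()

    # The temperature**2 factor keeps the soft-term gradients on a
    # comparable scale when the temperature changes.
    return alpha * hard + (1.0 - alpha) * temperature ** 2 * soft
```

When the student reproduces the teacher's logits exactly, the soft term vanishes and only the weighted cross-entropy remains, which is a convenient sanity check when wiring this into a training loop.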

Related Work

Our project is inspired by the paper "Jet Tagging via Particle Clouds" by Qu and Gouskos, which presents a new state-of-the-art model for jet tagging that improves upon its predecessors by representing jets as unordered "clouds" of particles, and tagging them using a Graph Neural Network. Our goal is to recreate this model and transfer its knowledge to a smaller scale model.
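To make the "particle cloud" idea concrete, here is a minimal NumPy sketch of one edge-convolution style layer: each particle is connected to its k nearest neighbours, a shared linear map plus ReLU is applied to each edge, and the results are averaged per particle. This is a simplified illustration, not the architecture from the paper; the single weight matrix, the mean aggregation, and all names are assumptions made for brevity.

```python
import numpy as np


def knn_indices(coords, k):
    """Indices of the k nearest neighbours of each point (self excluded)."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(dist2, np.inf)  # a particle is not its own neighbour
    return np.argsort(dist2, axis=1)[:, :k]


def edge_conv(features, coords, weight, k=3):
    """One simplified edge-convolution layer over an unordered particle set.

    features: (n, f) per-particle features
    coords:   (n, d) coordinates used to build the neighbour graph
    weight:   (2 * f, out) shared edge weights (hypothetical single layer)
    """
    idx = knn_indices(coords, k)                      # (n, k)
    neighbours = features[idx]                        # (n, k, f)
    centre = np.repeat(features[:, None, :], k, axis=1)
    # Edge input combines the central particle with the relative features.
    edge_in = np.concatenate([centre, neighbours - centre], axis=-1)
    edge_out = np.maximum(edge_in @ weight, 0.0)      # shared linear map + ReLU
    return edge_out.mean(axis=1)                      # aggregate over neighbours


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    coords = rng.normal(size=(6, 2))   # e.g. positions in an angular plane
    feats = rng.normal(size=(6, 3))
    weight = rng.normal(size=(6, 4))
    print(edge_conv(feats, coords, weight).shape)
```

Because the neighbour graph and the aggregation depend only on the set of particles, reordering the input rows simply reorders the output rows, which is the property that makes an unordered cloud representation workable. Note that this sketch uses plain Euclidean distance and ignores the periodicity of the azimuthal angle.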

Data

We will train the model on the following "high resolution" data, which consists of Monte Carlo (MC) simulations of jet events at CERN: https://zenodo.org/records/2603256
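Per-particle features such as transverse momentum and direction can be derived from four-momenta with standard kinematic formulas. The sketch below assumes each jet constituent is available as a row (E, px, py, pz) with non-zero transverse momentum; that layout and the function names are assumptions, so check the dataset documentation for the actual column format and remove any zero-padded rows first.

```python
import numpy as np


def kinematic_features(four_momenta):
    """Return (pt, eta, phi) for constituents given as rows (E, px, py, pz)."""
    p = np.asarray(four_momenta, dtype=float)
    px, py, pz = p[:, 1], p[:, 2], p[:, 3]
    pt = np.hypot(px, py)        # transverse momentum
    eta = np.arcsinh(pz / pt)    # pseudorapidity (requires pt > 0)
    phi = np.arctan2(py, px)     # azimuthal angle in (-pi, pi]
    return pt, eta, phi


def jet_mass(four_momenta):
    """Invariant mass of the summed constituent four-momenta."""
    total = np.asarray(four_momenta, dtype=float).sum(axis=0)
    mass2 = total[0] ** 2 - (total[1:] ** 2).sum()
    # Rounding can push mass2 slightly below zero for massless inputs.
    return float(np.sqrt(max(mass2, 0.0)))


if __name__ == "__main__":
    constituents = np.array([[5.0, 3.0, 4.0, 0.0],
                             [5.0, -3.0, -4.0, 0.0]])
    pt, eta, phi = kinematic_features(constituents)
    print(pt, eta, phi, jet_mass(constituents))
```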

Methodology

This project will consist of two main steps. First, we will implement a basic transfer-learning scenario using a linear regression model. We will train a "teacher" dense network on high resolution data and transfer its knowledge to a smaller "student" dense network that receives lower resolution data. The purpose of this step is simply to verify that we can implement transfer learning without issues. Next, we will recreate the state-of-the-art particle cloud jet tagging model (this GNN will be the "teacher" model), build a smaller-scale student GNN, and implement the transfer of knowledge from teacher to student.
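A minimal sketch of the first step, under strong simplifying assumptions: both teacher and student are plain least-squares linear models rather than dense networks, "low resolution" is modelled as keeping only the first few input features, and the student is fit to a blend of the true targets and the teacher's predictions. The function names, the blending weight `alpha`, and the synthetic data are all hypothetical.

```python
import numpy as np


def fit_linear(features, targets):
    """Least-squares fit of targets ~ features @ weights (no intercept)."""
    weights, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return weights


def train_teacher_student(x_high, y, n_low, alpha=0.5):
    """Fit a teacher on all features and a student on the first n_low.

    The student regresses onto alpha * y + (1 - alpha) * teacher_prediction,
    so it learns from both the labels and the teacher.
    """
    w_teacher = fit_linear(x_high, y)
    teacher_pred = x_high @ w_teacher

    x_low = x_high[:, :n_low]  # assumed stand-in for low resolution input
    blended = alpha * y + (1.0 - alpha) * teacher_pred
    w_student = fit_linear(x_low, blended)
    return w_teacher, w_student


def mse(pred, target):
    """Mean squared error between predictions and targets."""
    return float(np.mean((pred - target) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(500, 8))                       # synthetic inputs
    y = x @ np.ones(8) + 0.1 * rng.normal(size=500)     # noisy linear target
    w_t, w_s = train_teacher_student(x, y, n_low=4)
    print("teacher MSE:", mse(x @ w_t, y))
    print("student MSE:", mse(x[:, :4] @ w_s, y))
```

In this toy setup the student cannot match the teacher, because half of the informative features are removed outright. It is only meant to check that the teacher-to-student plumbing works before moving to the graph networks.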

Metrics

We will define success as obtaining a student model whose tagging accuracy on the low resolution data is similar to that of the teacher model trained on high resolution data.
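This criterion could be checked along the following lines. The sketch assumes binary tagging with one score per jet, a decision threshold of 0.5, and a tolerance of one percentage point for "similar"; all of these values and names are assumptions rather than choices fixed by the project.

```python
import numpy as np


def accuracy(scores, labels, threshold=0.5):
    """Fraction of jets whose thresholded score matches the true label."""
    predictions = (np.asarray(scores) >= threshold).astype(int)
    return float((predictions == np.asarray(labels)).mean())


def accuracy_gap(teacher_scores, student_scores, labels):
    """Teacher accuracy minus student accuracy on the same jets."""
    return accuracy(teacher_scores, labels) - accuracy(student_scores, labels)


def is_success(teacher_scores, student_scores, labels, tolerance=0.01):
    """True if the student is within `tolerance` of the teacher's accuracy."""
    return accuracy_gap(teacher_scores, student_scores, labels) <= tolerance
```

The teacher scores would come from evaluating the teacher on the high resolution version of each jet, and the student scores from the low resolution version of the same jets, so that both are compared against the same labels.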

Ethics

Why is Deep Learning a good approach to this problem?

Due to the sheer volume and complexity of the data, traditional data analysis techniques can no longer keep up. Deep learning gives us the chance to analyze these vast amounts of data much faster and with similar, if not better, accuracy.

Who are the major “stakeholders” in this problem, and what are the consequences of mistakes made by your algorithm?

The major stakeholders in jet tagging for particle physics include physicists and researchers, engineering and technical staff, funding agencies, the international scientific community, and the general public. Errors in jet tagging algorithms can lead to significant consequences such as data loss, resource wastage, scientific inaccuracies, and potential reputational damage for research institutions. These impacts highlight the importance of rigorous development, testing, and refinement of algorithms to ensure they are reliable and efficient in handling the vast data produced by experiments like those conducted at the Large Hadron Collider.

Division of labor

Jade and Egor contributed equally to preparing the slides and writing the final report. As for the code, Egor took care of acquiring the data, preprocessing, and designing the experiments. Jade took care of writing all models and training loops, as well as running the experiments on Oscar.