Inspiration

This project was inspired by the growing field of Quantum Machine Learning (QML) and its potential applications in High Energy Physics (HEP). It explores how well Quantum Convolutional Neural Networks (QCNNs) can discriminate between two types of particles, electrons and photons, which is a crucial task in HEP research. The goal was to use quantum computing principles to improve classification on particle physics datasets.

About the project

This project implements and trains a Quantum Convolutional Neural Network (QCNN) to classify electron and photon images.

How it was built

The development of the Quantum CNN involved several key stages:

  1. Environment Setup and Dependencies: The project requires specific software versions: TensorFlow 2.4.1, TensorFlow Quantum 0.5.1, Cirq 0.11.0, SymPy 1.5, and NumPy 1.19.5. It was run in an environment with a GPU accelerator (Tesla P100) and a high-RAM runtime (27.3 GB of available RAM).
  2. Dataset Loading and Preprocessing:
     ◦ Electron and photon datasets were loaded from HDF5 files (photon.hdf5 and electron.hdf5). Each dataset initially contained 249,000 samples, each a 32x32 pixel image with 2 channels.
     ◦ A sample size of 10,000 images was selected from the combined electron and photon datasets.
     ◦ The 32x32 images were then cropped to an 8x8 size, ensuring the maximum value pixel was maintained at the centre. This reduced the input image dimensions for the QCNN.
     ◦ A train-test split was applied, allocating 8,500 samples for training and 1,500 samples for testing (a 15% test size). The labels were converted to categorical format.
     ◦ Graph-Convolution Preprocessing was also explored, involving the creation of an adjacency matrix and visualising its effect on average images.
  3. Building the Quantum Neural Network:
     ◦ The core quantum circuit components were defined:
        ▪ one_qubit_rotation: Applies X, Y, and Z rotations to a single qubit, parameterized by symbols.
        ▪ entangling_layer: Implements a layer of Controlled-Z (CZ) entangling gates arranged in a circular topology for a given set of qubits.
     ◦ The generate_circuit function constructed a data re-uploading circuit composed of multiple layers. Each layer included an entangling layer and one-qubit rotations. Input features were encoded as Ry and Rz rotations.
     ◦ A custom Keras layer, ConvKernelPQC, was implemented to serve as the quantum convolutional kernel. This layer contains the quantum circuit and defines the trainable parameters (thetas) initialized randomly between $-\pi/2$ and $\pi/2$. It takes flattened input images, transforms them using arctan(x) and arctan(x^2), and feeds them into the quantum circuit. The output is obtained by measuring the Z observable of the first qubit.
     ◦ The QConv_layer function was designed to apply the ConvKernelPQC across the input image in a convolutional manner, producing an output feature map.
     ◦ The overall QCNN model architecture consists of two QConv_layer instances (with filter sizes 3 and 2, and stride 1), followed by classical Dense layers (8 neurons with ReLU activation, and a final 2-neuron output layer with softmax activation for classification). The model takes a (None, 8, 8) input tensor.
  4. Training and Evaluation:
     ◦ The model was compiled using categorical_crossentropy as the loss function, the Adam optimizer, and accuracy and AUC (Area Under the Curve) as metrics.
     ◦ Training was set for 200 epochs with a batch size of 128.
     ◦ A learning rate schedule was implemented that reduces the learning rate at epochs 80, 120, 160, and 180 (e.g., from $10^{-3}$ to $10^{-4}$ after epoch 80, with further reductions thereafter).
     ◦ Callbacks were used for learning rate adjustment (ReduceLROnPlateau and LearningRateScheduler) and for saving model checkpoints after each epoch.

What was learned

Through this project, valuable insights were gained into the practical implementation and performance of QCNNs:

  • QCNN Architecture and Hybrid Models: The project demonstrated how to integrate quantum circuits, built with Cirq and TensorFlow Quantum, into a classical Keras model to form a hybrid quantum-classical neural network. This provided hands-on experience in defining PQC layers and composing them into a larger model.
  • Quantum Circuit Design: A key learning was the role of one-qubit rotations (Rx, Ry, Rz gates) and entangling layers (CZ gates) in building a trainable quantum circuit for data processing. The concept of data re-uploading for encoding input features into the quantum circuit was also implemented.
  • Data Encoding: A specific technique learned was transforming classical input data (pixel values) into quantum-interpretable forms (e.g., arctan(x) and arctan(x^2)) before feeding them into the quantum circuit layers.
  • Performance Characteristics: The training logs provided empirical data on the model's convergence and performance. The model's accuracy and AUC generally improved over the 200 epochs, albeit gradually. For instance, the validation accuracy increased from approximately 50-55% initially to around 62.7% by epoch 200, with the AUC increasing from around 0.52 to 0.654. This shows that QCNNs are capable of learning patterns in complex datasets, though achieving high accuracy might require extensive training and hyperparameter tuning.
  • Computational Demands: The project underscored the computational resources required for training QCNNs. Each epoch took approximately 8-11 seconds on the specified GPU hardware, indicating that quantum machine learning models can be computationally intensive and benefit from GPU acceleration.

Challenges faced

  • Computational Time: Training the QCNN was time-consuming. Each epoch typically took 8-11 seconds, so the full 200-epoch run took roughly 27 to 37 minutes (200 × 8-11 s). This reflects a common challenge in quantum machine learning, where simulations of quantum circuits can be computationally expensive.
  • Resource Management: Ensuring the runtime had sufficient GPU acceleration and high RAM was crucial for the project's execution. An incorrect setup could lead to significant performance degradation or an inability to run the code.
  • Model Convergence: While the model showed improvement, the accuracy plateaued around 62-63%. Achieving higher accuracy might require more advanced quantum circuit designs, different data encoding strategies, or further hyperparameter optimization, which could present additional challenges in terms of model complexity and training stability.
  • Debugging Quantum Circuits: Debugging issues within quantum circuits can be complex due to the abstract nature of quantum operations and the interaction between classical and quantum layers.
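
The cropping step described under Dataset Loading and Preprocessing (32x32 down to 8x8, keeping the brightest pixel at the centre) could look roughly like the sketch below. This is not the project's code: the function name crop_around_max is hypothetical, the input is assumed to be a single 2-D channel, and because an 8x8 window has no exact centre pixel, the sketch assumes the peak should land at index (4, 4) and that the window is shifted inward when the peak lies near an edge.

```python
import numpy as np

def crop_around_max(image, size=8):
    """Return a size x size window of `image` positioned around its brightest pixel."""
    height, width = image.shape
    row, col = np.unravel_index(np.argmax(image), image.shape)
    half = size // 2
    # Clamp the window so it stays inside the image when the peak is near an edge
    # (an assumption; the original write-up does not say how edges were handled).
    top = int(np.clip(row - half, 0, height - size))
    left = int(np.clip(col - half, 0, width - size))
    return image[top:top + size, left:left + size]

# Synthetic 32x32 image with one bright pixel (illustrative, not detector data).
image = np.zeros((32, 32), dtype=np.float32)
image[10, 20] = 1.0

cropped = crop_around_max(image)
peak = tuple(int(i) for i in np.unravel_index(np.argmax(cropped), cropped.shape))
print(cropped.shape, peak)  # (8, 8) (4, 4)
```

For a peak far from the border the brightest pixel ends up at (4, 4); for a peak in a corner the window is clamped, so the peak is no longer centred but the output is still 8x8.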
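
The way QConv_layer slides a kernel over the image, and the arctan(x) / arctan(x^2) transform applied to each flattened patch, can be illustrated with a purely classical sketch. Everything here is an assumption made for illustration: the helper names are hypothetical, no padding is assumed, and stand_in_kernel is a classical placeholder that only mimics the bounded range of a Z expectation value. It is not the ConvKernelPQC circuit.

```python
import numpy as np

def conv_output_size(input_size, filter_size, stride=1):
    """Spatial size after a convolution without padding (assumed)."""
    return (input_size - filter_size) // stride + 1

def encode_patch(patch):
    """Flatten a patch and compute the two angle sets named in the text."""
    x = patch.reshape(-1)
    return np.arctan(x), np.arctan(x ** 2)

def sliding_kernel(image, filter_size, kernel, stride=1):
    """Apply `kernel` to every filter_size x filter_size patch of a square image."""
    out = conv_output_size(image.shape[0], filter_size, stride)
    feature_map = np.empty((out, out))
    for i in range(out):
        for j in range(out):
            r, c = i * stride, j * stride
            patch = image[r:r + filter_size, c:c + filter_size]
            feature_map[i, j] = kernel(*encode_patch(patch))
    return feature_map

def stand_in_kernel(angles, angles_sq):
    # Hypothetical classical placeholder, NOT the quantum circuit:
    # it returns one number in [-1, 1], like a Z expectation value would.
    return float(np.mean(np.cos(angles + angles_sq)))

image = np.random.default_rng(0).random((8, 8))      # synthetic 8x8 input
first = sliding_kernel(image, 3, stand_in_kernel)    # filter size 3, stride 1
second = sliding_kernel(first, 2, stand_in_kernel)   # filter size 2, stride 1
print(first.shape, second.shape)                     # (6, 6) (5, 5)
```

Under the no-padding assumption, the two layers with filter sizes 3 and 2 would shrink an 8x8 input to 6x6 and then 5x5 before the Dense layers.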
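
The step-wise learning rate schedule from the Training and Evaluation stage can be sketched as a plain function of the epoch index. Only the epoch boundaries (80, 120, 160, 180) and the first drop from $10^{-3}$ to $10^{-4}$ come from the description above; the later factors are assumptions chosen for illustration, and the name lr_schedule is hypothetical.

```python
BASE_LR = 1e-3  # initial learning rate stated in the text

def lr_schedule(epoch):
    """Return the learning rate for a given 0-based epoch index."""
    if epoch > 180:
        factor = 0.5e-3  # assumed value
    elif epoch > 160:
        factor = 1e-3    # assumed value
    elif epoch > 120:
        factor = 1e-2    # assumed value
    elif epoch > 80:
        factor = 1e-1    # from the text: 1e-3 -> 1e-4 after epoch 80
    else:
        factor = 1.0
    return BASE_LR * factor

# Print the rate in each of the five phases of a 200-epoch run.
for epoch in (0, 81, 121, 161, 181):
    print(epoch, lr_schedule(epoch))
```

A function of this shape can be passed to tf.keras.callbacks.LearningRateScheduler, which calls it at the start of each epoch. It takes only the epoch index on purpose, so the rate is always derived from BASE_LR rather than from the optimizer's current value.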

Built With

  • python
  • sympy
  • tensorflow
  • tensorflow-quantum