Inspiration

We know that a larger network can fit a dataset more easily than a smaller one: the more connections a network has, the better its chances of finding a good solution. But large networks are ill-suited for deployment on devices with limited compute. So we create a smaller network by trimming the larger one and train it with knowledge distillation, using the larger network as the teacher and the smaller network as the student.

What it does

  1. Input: a trained classification network and a network size reduction factor

  2. Output: a faster, lighter version of the input network with performance comparable to the input network
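The input/output contract above can be illustrated with a toy width-trimming sketch. The helper names `shrink_widths` and `make_mlp` are ours for illustration, not the project's actual API:

```python
import torch.nn as nn

def shrink_widths(widths, factor):
    # Scale each hidden width down by `factor`, keeping at least 1 unit.
    return [max(1, w // factor) for w in widths]

def make_mlp(in_dim, hidden, out_dim):
    layers, prev = [], in_dim
    for h in hidden:
        layers += [nn.Linear(prev, h), nn.ReLU()]
        prev = h
    layers.append(nn.Linear(prev, out_dim))
    return nn.Sequential(*layers)

teacher = make_mlp(784, [512, 256], 10)
# A reduction factor of 4 shrinks every hidden layer 4x.
student = make_mlp(784, shrink_widths([512, 256], 4), 10)
```

The student keeps the teacher's depth and input/output shapes, so it can be trained on the same data with the teacher's guidance.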

How we built it

The end-to-end solution is built on PyTorch. We implemented a network trimmer module that creates a faster, lighter version of the input network, and a custom forward method that automatically rewrites each network's forward pass to expose intermediate layer outputs for knowledge distillation. For the distillation itself we use SemCKD.
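SemCKD matches intermediate features across teacher and student layers; its full loss is beyond a short sketch, but the classic logit-distillation term that such methods build on can be written as follows (the temperature `T` and mixing weight `alpha` are illustrative defaults, not the project's settings):

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Soft-target term: KL divergence between temperature-softened
    # distributions, scaled by T*T to keep gradient magnitudes comparable.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-label term: ordinary cross-entropy on the ground truth.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```

During training, a batch is passed through both networks and this loss is backpropagated through the student only.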

Challenges we ran into

  1. Automatically trimming the network.

  2. Automatically updating the networks' forward passes to extract intermediate layer outputs for knowledge distillation.

  3. Returning the complete architecture to the user rather than just a state_dict (we solved this with TorchScript).
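A minimal sketch of the TorchScript approach to the third challenge: scripting bundles the forward code together with the weights, so the resulting file can be reloaded without the original Python class definition (the model and file name here are illustrative):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))

scripted = torch.jit.script(model)      # captures architecture + weights
torch.jit.save(scripted, "student.pt")  # one self-describing file
restored = torch.jit.load("student.pt") # no Python class needed to load
```

By contrast, `torch.save(model.state_dict(), ...)` stores only the tensors, so the user would need the model's source code to reconstruct it.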

Accomplishments that we're proud of

  1. Network trimming module.

  2. Network forward update module.

What we learned

  1. Use of PyTorch hooks.

  2. Knowledge Distillation techniques.
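A minimal example of the PyTorch forward hooks mentioned above, capturing an intermediate layer's output without modifying the model (the model and layer name are illustrative):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))
features = {}

def save_output(name):
    # Forward hooks receive (module, inputs, output) on every forward pass.
    def hook(module, inputs, output):
        features[name] = output.detach()
    return hook

handle = model[0].register_forward_hook(save_output("fc1"))
_ = model(torch.randn(3, 8))
handle.remove()  # detach the hook when it is no longer needed
```

The captured `features["fc1"]` tensor is exactly what intermediate-layer distillation losses consume.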

What's next for Breed-Net

  1. More trimming strategies targeted at specific requirements, such as reducing latency, on-disk size, or peak GPU memory usage.

  2. Use the PyTorch FX module to give the user more network-related information, such as printing the Python code of the network's forward pass.

  3. More use cases, such as detection and segmentation.
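The torch.fx idea in the list above can be sketched as follows: symbolic tracing turns a module into a GraphModule whose generated forward source can be printed directly (the model here is illustrative):

```python
import torch.nn as nn
from torch.fx import symbolic_trace

model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))

graph_module = symbolic_trace(model)  # trace into an editable graph
print(graph_module.code)              # Python source of the forward pass
```

The same graph can also be rewritten before code generation, which is what makes FX attractive for network surgery.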
