Inspiration
We know that a larger network can fit a dataset more easily than a smaller one, because more connections give a better chance of finding an optimal solution. But such large networks are not suitable for deployment on devices with limited compute resources. So we create a smaller network by trimming the larger one, then train it with knowledge distillation techniques, using the larger network as the teacher and the smaller network as the student.
What it does
Input: a trained classification network and a network-size reduction factor
Output: a faster, lighter version of the input network with performance comparable to the input network
How we built it
The end-to-end solution is built on PyTorch. We implemented a network trimmer module that creates a faster, lighter version of the input network, and a custom forward method that automatically rewrites each network's forward pass to extract intermediate layer outputs for knowledge distillation. For the distillation itself we use SemCKD.
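SemCKD itself calibrates semantics across intermediate layer pairs; as a minimal sketch of the classic soft-target distillation loss that such methods build on (the function name and hyperparameters here are illustrative, not Breed-Net's actual API):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Soften both distributions with temperature T and match them with KL
    # divergence; the T*T factor keeps gradient magnitudes comparable.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Standard cross-entropy against the ground-truth labels
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

During training, the teacher runs in eval mode with gradients disabled, and only the student's parameters are updated against this combined loss.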
Challenges we ran into
Automatically trimming the network.
Automatically updating each network's forward pass to extract intermediate layer outputs for knowledge distillation.
Returning complete architecture information to the user rather than just a state_dict (we used TorchScript to solve this problem).
Accomplishments that we're proud of
Network trimming module.
Network forward update module.
What we learned
Use of PyTorch hooks.
Knowledge Distillation techniques.
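The hook mechanism we learned about is what makes intermediate-output extraction possible without editing a model's source. A small example of registering forward hooks on a toy model (the layer sizes here are arbitrary, chosen only for illustration):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Conv2d(8, 16, 3))
features = {}

def save_output(name):
    # register_forward_hook calls this after the module computes its output
    def hook(module, inputs, output):
        features[name] = output
    return hook

# Register a hook on each layer whose activations we want for distillation
handles = [m.register_forward_hook(save_output(n))
           for n, m in model.named_children()]

x = torch.randn(1, 3, 32, 32)
_ = model(x)  # features now maps layer names to intermediate activations

for h in handles:
    h.remove()  # detach the hooks once extraction is done
```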
What's next for Breed-Net
More model-trimming strategies targeted at specific requirements, such as reducing latency, disk footprint, or the model's peak GPU memory usage.
Use the PyTorch fx module to give the user more network-related information, such as printing the network's forward pass as Python code.
More use cases, such as detection and segmentation.