Inspiration

When exploring recent deep learning research papers, I found it striking that Fisher Information Matrices and Neural Tangent Kernels appear in many projects across several subdomains of deep learning, yet each project ships its own implementation, which is often buggy and limited to that single use case.

Rather than reinventing the wheel every time, I believe there is a need for a library that does exactly this: make it easy for researchers and practitioners to implement algorithms that use these matrices, and give them access to recent advances in approximating them.

What it does

NNGeometry lets you quickly define and evaluate most linear algebra operations involving Fisher Information Matrices and Neural Tangent Kernels.

As motivation, consider a continual learning technique called Elastic Weight Consolidation (EWC). In EWC, we need to compute the simple formula dwᵀ F dw. The problem is that this simple formula turns out to be surprisingly difficult to implement in practice, for the following reasons:

  • F is a very large matrix: it is d × d, where d is the number of parameters of the neural network, up to 10^8 in recent architectures.
  • On paper I can simply write dw, but in practice this vector is a collection of scalar parameters split across several layers. Similarly, F must be computed over all scalar parameters, and when computing dwᵀ F dw we need to make sure that parameters are correctly mapped between dw and F.
  • In short, it is not as simple as writing torch.dot(torch.mv(F, dw), dw).
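To make the difficulty concrete, here is a minimal sketch of computing dwᵀ F dw in plain PyTorch, using a diagonal approximation of F so that the full d × d matrix is never materialized. This is not NNGeometry's API; the tiny model, the sampled data, and the perturbation dw are all illustrative, and even this simplified version already has to track the layer-wise structure of the parameters by hand.

```python
import torch
import torch.nn.functional as F_

torch.manual_seed(0)
model = torch.nn.Linear(4, 3)   # toy model standing in for a real network
data = torch.randn(8, 4)        # toy inputs standing in for a dataset

# Diagonal Fisher: average, over inputs, of the squared gradients of the
# log-likelihood, with labels sampled from the model's own predictions.
fisher_diag = [torch.zeros_like(p) for p in model.parameters()]
for x in data:
    logits = model(x)
    y = torch.distributions.Categorical(logits=logits).sample()
    loss = F_.cross_entropy(logits.unsqueeze(0), y.unsqueeze(0))
    model.zero_grad()
    loss.backward()
    for f, p in zip(fisher_diag, model.parameters()):
        f += p.grad.detach() ** 2
fisher_diag = [f / len(data) for f in fisher_diag]

# dw is not one flat vector but a list of tensors, one per parameter tensor,
# and it must stay aligned with fisher_diag layer by layer.
dw = [torch.randn_like(p) for p in model.parameters()]

# dw^T F dw under the diagonal approximation, accumulated layer by layer.
penalty = sum((f * d ** 2).sum() for f, d in zip(fisher_diag, dw))
```

Even here, the bookkeeping (matching each slice of dw to the right block of F) is exactly what a library can take care of once and for all, while also offering richer representations than the diagonal one.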

How I built it

Challenges I ran into

Accomplishments that I'm proud of

What I learned

What's next for NNGeometry

Built With
