Column Swarm Reinforcement Learning (CSRL)

What it does

Column Swarm Reinforcement Learning (CSRL) is a swarm intelligence model composed of a bi-directional hierarchy of "swarmling" minicolumns, built on HTM technology. Each column can solve many simple tasks on its own, but together they can solve large-scale problems.

CSRL is, in other words, a lot of little reinforcement learners that work together to create one big reinforcement learner, where the resulting hierarchical model looks astoundingly like HTM.

How I built it

I came up with the idea of using a swarm intelligence model with HTM after seeing Jeff's 6-layer model of the neocortex. I thought about ways that such a hierarchy could be constructed automatically. I had previously had some success using swarms of simpler learning automata, and with the development of SDRRL (an ultra-efficient "swarmling") and its similarity to the cortical minicolumns used in HTM, I decided to build a full-blown hierarchy with them.

I wrote it in C++, first on the CPU. I will move it to the GPU using OpenCL soon.

Details

CSRL is composed of a lot of small reinforcement learners that work together. I wrote a tutorial for a single one of those reinforcement learners (called SDRRL, sparse distributed representation reinforcement learning). At least, this is what it will be composed of in the future; currently it is composed of simple OLPOMDP learners.

Here is a tutorial on how to create SDRRL: link. A tutorial on the full swarm will come soon.

These SDRRL units are organized into a hierarchy, where the actions of a unit are fed in as inputs to other units. Each unit optimizes locally, but the end result is that the whole swarm optimizes globally as well.
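As a rough illustration of that wiring, here is a minimal C++ sketch in which the actions of one layer of units become the inputs of the next. Everything in it is an assumption made for illustration: `Unit`, `Layer`, `makeHierarchy`, and `forward` are hypothetical names, the unit is a plain squashed weighted sum rather than an actual SDRRL or OLPOMDP learner, and only the upward direction of the hierarchy is shown, with no learning step.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for one swarmling; NOT the actual SDRRL or OLPOMDP unit.
struct Unit {
    std::vector<float> weights; // one weight per input

    // The unit's action: a squashed weighted sum of its inputs, in (-1, 1).
    float act(const std::vector<float>& inputs) const {
        assert(inputs.size() == weights.size());
        float sum = 0.0f;
        for (std::size_t i = 0; i < inputs.size(); ++i)
            sum += weights[i] * inputs[i];
        return std::tanh(sum);
    }
};

using Layer = std::vector<Unit>;

// Builds layerSizes.size() layers. Each unit reads every output of the layer
// below it (an assumed, fully connected wiring); the first layer reads the
// external inputs.
std::vector<Layer> makeHierarchy(std::size_t numInputs,
                                 const std::vector<std::size_t>& layerSizes,
                                 float initWeight) {
    std::vector<Layer> layers;
    std::size_t fanIn = numInputs;
    for (std::size_t size : layerSizes) {
        layers.push_back(Layer(size, Unit{std::vector<float>(fanIn, initWeight)}));
        fanIn = size;
    }
    return layers;
}

// Feeds the actions of each layer in as the inputs of the next layer and
// returns the actions of the top layer.
std::vector<float> forward(const std::vector<Layer>& layers,
                           std::vector<float> signal) {
    for (const Layer& layer : layers) {
        std::vector<float> actions;
        actions.reserve(layer.size());
        for (const Unit& unit : layer)
            actions.push_back(unit.act(signal));
        signal = std::move(actions);
    }
    return signal;
}
```

Calling `forward(makeHierarchy(3, {4, 2}, 0.1f), {1.0f, 0.5f, -0.5f})` would return two actions from the top layer. In a real swarm each unit would also carry its own local learning rule, which this sketch leaves out.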

Challenges I ran into

Optimizing SDRRL. SDRRL is insanely fast by default, but I need to run potentially millions of SDRRL swarmlings at once. So I came up with several mechanisms to exploit the SDRRL sparsity to reduce computational costs.
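To give a feel for why sparsity helps, here is a minimal C++ sketch of one generic trick: storing only the indices of the active cells, so that a weighted sum and a weight update touch a handful of entries instead of the whole vector. This is not necessarily one of the mechanisms used in CSRL; `SparseState`, `denseValue`, `sparseValue`, and `sparseUpdate` are hypothetical names, and the state is assumed to be binary.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sparse binary state: only the indices of active cells are kept.
struct SparseState {
    std::size_t size;                // total number of cells
    std::vector<std::size_t> active; // indices of the cells that are 1
};

// Dense baseline: visits every cell, so the cost grows with the state size.
float denseValue(const std::vector<float>& weights,
                 const std::vector<float>& state) {
    assert(weights.size() == state.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < weights.size(); ++i)
        sum += weights[i] * state[i];
    return sum;
}

// Sparse version: visits only the active cells, so the cost grows with the
// number of active cells rather than the state size.
float sparseValue(const std::vector<float>& weights, const SparseState& state) {
    assert(weights.size() == state.size);
    float sum = 0.0f;
    for (std::size_t index : state.active)
        sum += weights[index]; // an active cell contributes weight * 1
    return sum;
}

// Gradient-style update that touches only the weights of active cells;
// inactive cells have zero input, so their weights would not change anyway.
void sparseUpdate(std::vector<float>& weights, const SparseState& state,
                  float alpha, float error) {
    assert(weights.size() == state.size);
    for (std::size_t index : state.active)
        weights[index] += alpha * error;
}
```

With, say, 2% of cells active, both the sparse sum and the sparse update do roughly one fiftieth of the work of their dense counterparts while giving the same result for a binary state.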