Inspiration
I did a previous project where I trained an arbitrageur to trade in a Constant Product Market (CPM) and in a reference market (which can be a CEX, a LOB, ...) in order to take advantage of the mismatch between the price in the CPM and the price in the reference market. We saw that, when the arbitrageur uses an optimal strategy, they can always beat the CPM with a constant fee, and even more so when the price of the asset in the reference market is highly volatile. See here for more details.
In this project I take the opposite perspective. I train the Liquidity Provider (LP) to set a variable fee in order to protect herself against an intelligent arbitrageur. The arbitrageur is assumed to be rational and always executes the trade that maximises their profit. Under some simplifying assumptions, this trade has a closed form, which can be found here.
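To make the arbitrageur's side concrete, here is a sketch of the closed-form optimal trade in one common simplified setup (a CPM with invariant k = x*y and a proportional fee applied to the input amount). The function name and parametrisation are mine, not the project's:

```python
import math

def optimal_arb_trade(x, y, p, gamma):
    """Optimal amount of token X to swap into a CPM with reserves (x, y).

    gamma = 1 - fee, and p is the external (reference) price of Y in units
    of X. Swapping dx of X in returns dy = y - k / (x + gamma*dx), so the
    arbitrageur's profit p*dy - dx is maximised where the first-order
    condition p*gamma*k / (x + gamma*dx)**2 = 1 holds, giving the closed
    form below (clipped to 0 when no trade in this direction is profitable).
    """
    k = x * y
    dx = (math.sqrt(p * gamma * k) - x) / gamma
    return max(dx, 0.0)
```

The trade is nonzero only when the reference price p exceeds the pool's marginal price x/y by more than the fee wedge, which is exactly why a well-chosen variable fee can blunt the arbitrageur's edge.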
What it does
I use Deep Learning and Reinforcement Learning techniques to train an agent (the LP) to find an optimal variable fee that depends on the available data: the reserves in the CPM pool, the price in a reference market, the volatility of the asset, ...
How we built it
My code is just a prototype. I have coded an environment whose trading rules replicate those of a constant product market. Moreover, the reference price of one of the coins in terms of the other follows a stochastic volatility model.
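The post does not specify which stochastic volatility model the environment uses; as an illustration, a Heston-type model under a simple Euler discretisation (all parameter values here are hypothetical) could generate reference-price paths like this:

```python
import numpy as np

def simulate_sv_path(s0=100.0, v0=0.04, mu=0.0, kappa=2.0, theta=0.04,
                     xi=0.3, rho=-0.7, dt=1 / 252, n_steps=252, seed=0):
    """Euler scheme for a Heston-style reference price:
        dS = mu*S dt + sqrt(v)*S dW1
        dv = kappa*(theta - v) dt + xi*sqrt(v) dW2,  corr(dW1, dW2) = rho
    Returns the price path s and the variance path v.
    """
    rng = np.random.default_rng(seed)
    s = np.empty(n_steps + 1)
    v = np.empty(n_steps + 1)
    s[0], v[0] = s0, v0
    for t in range(n_steps):
        z1, z2 = rng.standard_normal(2)
        w1 = z1
        w2 = rho * z1 + np.sqrt(1 - rho**2) * z2  # correlated Brownian shocks
        vt = max(v[t], 0.0)  # full truncation keeps the variance nonnegative
        s[t + 1] = s[t] * (1 + mu * dt + np.sqrt(vt * dt) * w1)
        v[t + 1] = v[t] + kappa * (theta - vt) * dt + xi * np.sqrt(vt * dt) * w2
    return s, v
```

Each simulated path provides the reference price the arbitrageur trades against, while the CPM pool state evolves through the swaps themselves.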
Everything is coded in Python, and I use PyTorch to parametrise the agent's policy with a neural network and to train it. More specifically, I use an actor-critic algorithm in which the agent's policy is 'soft' (i.e. it is a random variable rather than being deterministic), so that exploration-exploitation is built into the policy itself. The algorithm is inspired by the one described here.
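A minimal sketch of such a soft (stochastic) fee policy in PyTorch might look as follows; the network architecture, the state features, and the fee cap are illustrative assumptions, not the project's exact choices:

```python
import torch
import torch.nn as nn

class FeePolicy(nn.Module):
    """Stochastic LP policy: maps the observed state (e.g. pool reserves,
    reference price, a running volatility estimate) to a Gaussian over a
    latent variable, squashed into a fee in (0, fee_max)."""

    def __init__(self, state_dim=4, hidden=64, fee_max=0.05):
        super().__init__()
        self.fee_max = fee_max
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.mu = nn.Linear(hidden, 1)
        self.log_std = nn.Linear(hidden, 1)

    def forward(self, state):
        h = self.net(state)
        mu = self.mu(h)
        log_std = self.log_std(h).clamp(-5.0, 2.0)
        dist = torch.distributions.Normal(mu, log_std.exp())
        z = dist.rsample()                     # reparameterised sample
        fee = torch.sigmoid(z) * self.fee_max  # squash into (0, fee_max)
        # log-prob with the change-of-variables correction for the squashing
        sig = torch.sigmoid(z)
        log_prob = dist.log_prob(z) - torch.log(sig * (1 - sig) * self.fee_max + 1e-8)
        return fee, log_prob
```

The reparameterised sample keeps the fee differentiable with respect to the network weights, which is what allows the actor loss to be backpropagated through the sampled action.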
Challenges we ran into
My laptop has not been enough to reach a fully optimal strategy. Reinforcement Learning algorithms need lots of data and, ideally, GPUs to perform all the matrix operations involved in algorithmic differentiation. However, my results are decent: I find a strategy that avoids impermanent loss in a fair number of scenarios.
Motivation and Future steps
I believe this type of project, scaled up using DeFi data, can really help in building safer protocols. Agent-based simulation using techniques from Mathematical Finance and Machine Learning can help find and deal with extreme events, and at the same time find the equilibrium settings at which all the actors involved in a protocol obtain an optimal profit.
In a future project I plan to train my models on relevant DeFi data, which at the moment is proving a little elusive to obtain.