Inspiration
The Dual Mandate of balancing price stability with full employment is one of the most complex control problems in finance, and it falls specifically to the Federal Reserve. I was inspired by the massive impact of policy lags: the momentum of a recession trickles down to everyone and hits the most vulnerable hardest. I wanted to see whether an agent with memory (through LSTMs) and sentiment awareness (via LLMs) could tighten the feedback loop and reduce the Taylor Gap.
How I Built It
I constructed a Partially Observable Markov Decision Process (POMDP) in which the agent operates under the same constraints as the real Federal Reserve, and trained it with a PPO + LSTM approach. Unlike a standard memoryless reinforcement learning policy, the LSTM allows the agent to maintain a belief state about hidden economic variables such as the natural rate of interest and the natural rate of unemployment. The goal was for the agent to process news headlines from off-weeks using NVIDIA NIM Llama-3-70B hosted on Modal. This translates market noise into a numerical credibility score so the agent can perceive whether its policy is trusted by the market.
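To make the partial observability concrete, here is a minimal sketch of a toy environment, not the project's actual code. The class name `ToyFedEnv`, every constant, and the transition dynamics are illustrative assumptions. The `credibility` argument is a hypothetical stand-in for the LLM-derived score, assumed here to be a plain number supplied from outside. The point it shows is that the natural rate and natural unemployment drive the economy but never appear in the observation, which is why a recurrent policy needs memory to infer them.

```python
import random


class ToyFedEnv:
    """Toy partially observable economy; every constant here is illustrative."""

    def __init__(self, seed=0, horizon=48):
        self.seed = seed
        self.horizon = horizon          # steps per episode (assumed)
        self.inflation_target = 2.0     # percent (assumed)

    def reset(self):
        self.rng = random.Random(self.seed)
        self.t = 0
        # Hidden state: drives the dynamics but is never returned to the agent.
        self.natural_rate = self.rng.uniform(0.5, 2.5)
        self.natural_unemployment = self.rng.uniform(4.0, 5.0)
        # Visible state (observed with noise).
        self.inflation = 3.0
        self.unemployment = 5.0
        self.policy_rate = 2.0
        return self._observe(0.0)

    def _observe(self, credibility):
        # Noisy inflation and unemployment, the exact policy rate, and the
        # externally supplied credibility score.
        return (
            self.inflation + self.rng.gauss(0.0, 0.1),
            self.unemployment + self.rng.gauss(0.0, 0.1),
            self.policy_rate,
            credibility,
        )

    def step(self, rate_change, credibility=0.0):
        self.t += 1
        self.policy_rate = max(0.0, self.policy_rate + rate_change)
        # Policy is "tight" when the real rate exceeds the hidden natural rate.
        tightness = (self.policy_rate - self.inflation) - self.natural_rate
        self.unemployment = max(0.0, self.unemployment + 0.1 * tightness)
        slack = self.unemployment - self.natural_unemployment
        self.inflation += -0.1 * tightness - 0.05 * slack
        # Dual-mandate style loss: squared inflation gap plus squared slack.
        reward = -((self.inflation - self.inflation_target) ** 2 + slack ** 2)
        done = self.t >= self.horizon
        return self._observe(credibility), reward, done


# Roll out one episode with a "hold rates" policy.
env = ToyFedEnv(seed=1)
obs = env.reset()
done = False
while not done:
    obs, reward, done = env.step(rate_change=0.0, credibility=0.0)
```

In a full setup, the observation tuples would be fed step by step to a recurrent policy (for example `RecurrentPPO` from sb3-contrib, wrapped in a Gymnasium-compatible interface), so that the LSTM hidden state can act as the belief over the hidden variables.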
Challenge: I faced a time constraint for the agent to converge. Training a Recurrent PPO agent in a high-dimensional, non-linear environment takes over 4 hours per training session, so I could not reach the optimal policy within this proof of concept. The framework is fully functional, and I would love to speak more about it (please feel free to reach out). I hope to continue this work to provide a high-fidelity decision support tool for the Federal Reserve, with the potential to extend it to other banks within the system, as a way to support our institutions in the face of uncertainty and promote the economic wellbeing of the people. Note: The Demo Link is a synthetic proof-of-concept replica, as optimal convergence could not be achieved at this time.
Built With
- llama-nvidia
- modal
- stablebaselines