What It Does

Bitcoin, the first widely used cryptocurrency, has been hugely successful: as of March 2021 there were over 18 million Bitcoins in circulation with a market capitalization of nearly US$1 trillion, and many competing cryptocurrencies have entered the market, together making up a total global market capitalization of about $1.67T (https://coinmarketcap.com/). As the cryptocurrency market grows, demand for these currencies rises, and their values swing dramatically from day to day. With more than 5,000 virtual currencies in existence today, investors need to weigh many factors before deciding to invest in one, including the coin's utility in real life, its position relative to other currencies, and its valuation. Intensive research is essential before buying such a volatile asset. This raises an interesting question: if a machine, or agent, is given a fixed amount of money but no labeled or historical data, can it learn to trade profitably purely by observing the underlying patterns in the rewards it receives from the market environment?

How I Built It

The agent, or bot, that performs reinforcement learning with the n-step TD approach was implemented as a Python class. All the code was written in Python on Google Colab. The n-step TD pseudocode was taken from the book "Reinforcement Learning".
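To make the approach concrete, here is a minimal sketch of tabular n-step TD prediction in the style of the book's pseudocode, applied to a toy random-walk environment rather than market data. This is my own illustration, not the project's actual trading bot: the class names, the environment, and the hyperparameters (n=3, alpha=0.1) are assumptions.

```python
import random

class NStepTDAgent:
    """Tabular n-step TD prediction: estimates the state-value function V
    under a fixed policy (here, the environment's own random dynamics)."""

    def __init__(self, n_states, n=3, alpha=0.1, gamma=1.0):
        self.n = n            # number of steps to look ahead before bootstrapping
        self.alpha = alpha    # step size
        self.gamma = gamma    # discount factor
        self.V = [0.0] * n_states

    def run_episode(self, env):
        states = [env.reset()]
        rewards = [0.0]       # rewards[i] is the reward received entering states[i]
        T = float("inf")      # episode length, unknown until termination
        t = 0
        while True:
            if t < T:
                s_next, r, done = env.step()
                states.append(s_next)
                rewards.append(r)
                if done:
                    T = t + 1
            tau = t - self.n + 1          # the time whose estimate is updated
            if tau >= 0:
                # n-step return: discounted rewards, plus a bootstrapped tail
                G = sum(self.gamma ** (i - tau - 1) * rewards[i]
                        for i in range(tau + 1, min(tau + self.n, T) + 1))
                if tau + self.n < T:
                    G += self.gamma ** self.n * self.V[states[tau + self.n]]
                s_tau = states[tau]
                self.V[s_tau] += self.alpha * (G - self.V[s_tau])
            if tau == T - 1:
                break
            t += 1


class RandomWalk:
    """Five-state random walk: start in the middle, move left or right at
    random; exiting on the right yields reward 1, on the left reward 0."""

    def __init__(self, n_states=5):
        self.n_states = n_states

    def reset(self):
        self.pos = self.n_states // 2
        return self.pos

    def step(self):
        self.pos += random.choice([-1, 1])
        if self.pos < 0:
            return 0, 0.0, True                      # left terminal
        if self.pos >= self.n_states:
            return self.n_states - 1, 1.0, True      # right terminal
        return self.pos, 0.0, False
```

After enough episodes the learned values should increase from left to right, approximating the true values (i+1)/6 for this walk. Swapping the toy environment for a market simulator with a buy/sell/hold action set is what turns this prediction skeleton into a trading agent.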

Challenges I ran into

At times, the reinforcement learning approach was limited by the size of the state space, which slowed the rate of convergence.
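One common remedy for state-space blow-up in tabular methods is to discretize continuous market features into a few coarse bins, trading resolution for faster convergence. The sketch below is only an illustrative assumption, not the project's actual state encoding; the function name and bin thresholds are made up.

```python
def discretize_return(pct_change, bins=(-0.05, -0.01, 0.01, 0.05)):
    """Map a continuous percent price change to a small discrete state index.

    With these (illustrative, untuned) thresholds the price signal collapses
    into just 5 states: big drop, small drop, flat, small rise, big rise.
    Fewer states means each table entry is visited and updated more often.
    """
    for i, edge in enumerate(bins):
        if pct_change < edge:
            return i
    return len(bins)
```

Combining such a bin index with, say, a binary holding/not-holding flag keeps the full state space at 5 x 2 = 10 entries, small enough for a tabular n-step TD agent to converge quickly.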

Accomplishment I am proud of

Code for a reinforcement learning agent is not something commonly found on GitHub, and it was even harder because I made mine specific to n-step on-policy TD. So what I am proudest of is that I studied the concept, followed the pseudocode in the book, and got it all done by myself.

What I have learned

Reinforcement learning may not be feasible for every dataset, or with too simple an environment, since the agent needs to take several factors into consideration.

What next to learn

I plan to improve my agent by building a more complex simulated environment so that it can learn better. I also want to use more fine-grained data, for example per-minute price data.

Built With

Python, Google Colab
