Inspiration
The recent AI hype got us curious about how we can implement machine learning algorithms without mindlessly using machine learning libraries in Python.
What it does
The Tic-tac-toe bot implements a Q-learning agent (a form of reinforcement learning) and, after sufficient training, can play at a perfect level.
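For readers new to Q-learning, here is a minimal sketch of the standard tabular update rule the method is built on. It is not the project's actual code: the names `Q`, `q_update`, `ALPHA`, and `GAMMA`, along with their values, are assumptions chosen for illustration.

```python
from collections import defaultdict

ALPHA = 0.1  # learning rate (assumed value, not from the project)
GAMMA = 0.9  # discount factor (assumed value, not from the project)

# Q-table: maps (state, action) to an estimated value; unseen pairs start at 0.0
Q = defaultdict(float)


def q_update(state, action, reward, next_state, next_actions):
    """Apply one tabular Q-learning update and return the new Q-value."""
    # Value of the best follow-up move; 0.0 when the game is over (no moves left)
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + GAMMA * best_next
    # Move the current estimate a fraction ALPHA of the way toward the target
    Q[(state, action)] += ALPHA * (target - Q[(state, action)])
    return Q[(state, action)]
```

In a two-player game, `next_state` would typically be the board the agent sees after the opponent has replied, so that the reward is credited to the agent's own move.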
How we built it
With Python, mostly NumPy for data manipulation and tkinter for the UI.
Challenges we ran into
At first we tried to train the agent against itself, but because of the way we calculated the Q-values, the rewards got mixed up. After some modification we fixed the bug and trained the agents against random moves.
Also, finding the optimal hyperparameters proved time-consuming.
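Training against random moves can be sketched roughly as follows. This is a simplified illustration under our own assumptions, not the project's implementation: the agent always plays 'X' and moves first, the reward values (1.0 win, 0.5 draw, -1.0 loss) and the default hyperparameters are placeholders, and the helper names `winner`, `free_cells`, and `train` are hypothetical.

```python
import random
from collections import defaultdict

# All eight winning lines on a 3x3 board indexed 0..8
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if a line is complete, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None


def free_cells(board):
    """Return the indices of all empty cells."""
    return [i for i, cell in enumerate(board) if cell == ' ']


def train(episodes, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Train an 'X' agent against a uniformly random 'O' opponent."""
    rng = random.Random(seed)
    Q = defaultdict(float)
    for _ in range(episodes):
        board = [' '] * 9
        done = False
        while not done:
            state = ''.join(board)
            moves = free_cells(board)
            # Epsilon-greedy: explore with probability epsilon, else exploit
            if rng.random() < epsilon:
                action = rng.choice(moves)
            else:
                action = max(moves, key=lambda a: Q[(state, a)])
            board[action] = 'X'
            reward = 0.0
            if winner(board) == 'X':
                reward, done = 1.0, True
            elif not free_cells(board):
                reward, done = 0.5, True  # draw (assumed reward)
            else:
                # The opponent replies with a random legal move
                board[rng.choice(free_cells(board))] = 'O'
                if winner(board) == 'O':
                    reward, done = -1.0, True
            # The agent's next state is the board after the opponent's reply
            next_state = ''.join(board)
            best_next = 0.0 if done else max(
                Q[(next_state, a)] for a in free_cells(board))
            Q[(state, action)] += alpha * (
                reward + gamma * best_next - Q[(state, action)])
    return Q
```

Calling `train(50000)` returns a Q-table that can then be queried greedily during play. The values of `alpha`, `gamma`, `epsilon`, and the episode count are exactly the kind of hyperparameters that need tuning.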
Accomplishments that we're proud of
The bot can now play perfectly after sufficient training!
What we learned
How Q-learning works and how it can be applied to games with a small state space.
What's next for Tic-tac-toe
On the front end, we'd love to make the UI more polished, maybe by making a web app. On the back end, we'd like to add a PvP mode and 4x4 and 5x5 Tic-tac-toe.