Inspiration
Our idea was based on JARVIS from the Marvel Universe. We wanted to produce some form of augmented reality glasses that could help you with day-to-day activities and provide information through a display.
What it does
Our program takes the video output from a phone (eventually a camera in AR glasses), identifies a chessboard, and then identifies the pieces using a neural network. The network was trained on data we collected ourselves and identifies pieces with very high accuracy even at a low resolution (37px × 37px). We send the board state to Stockfish, which calculates the optimal move; the move is then sent back to the device and displayed as an arrow overlaid on the chessboard. Eventually this would be projected inside the glasses rather than shown on a separate display. The move is also spoken to the user via a text-to-speech engine.
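For illustration, the arrow-overlay step boils down to mapping the engine's move (UCI notation, e.g. e2e4) onto pixel coordinates of the normalised board image. A minimal sketch, assuming 37 px squares with the a1 corner at the bottom-left (the function names are ours, not from the actual codebase):

```python
def uci_to_squares(move: str):
    """Convert a UCI move string such as 'e2e4' into two (col, row)
    pairs, with (0, 0) at the a1 corner of the board."""
    files = "abcdefgh"
    frm = (files.index(move[0]), int(move[1]) - 1)
    to = (files.index(move[2]), int(move[3]) - 1)
    return frm, to

def arrow_endpoints(move: str, square_px: int = 37):
    """Map the two squares to pixel centres on the normalised board
    image (rank 8 is at the top, so rows are flipped)."""
    (fc, fr), (tc, tr) = uci_to_squares(move)
    centre = lambda c, r: (c * square_px + square_px // 2,
                           (7 - r) * square_px + square_px // 2)
    return centre(fc, fr), centre(tc, tr)
```

The two returned points are then simply passed to the drawing layer to render the arrow on top of the live video frame.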
How we built it
We originally built a prototype that plays checkers, training the model to recognise tokens of different colours. We used fiducial markers to locate the board in the frame, then split the board into 64 squares to find each piece.
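The board-splitting step can be sketched as follows. Marker detection itself uses OpenCV's ArUco module, but once the board has been warped to a top-down square image, the split is plain array slicing (the 296 px board size below is an assumption chosen to give 37 px tiles):

```python
import numpy as np

def split_board(board_img: np.ndarray, squares_per_side: int = 8):
    """Split a normalised (top-down, square) board image into 64 tiles,
    ordered rank by rank from the top-left corner."""
    side = board_img.shape[0] // squares_per_side
    tiles = []
    for row in range(squares_per_side):
        for col in range(squares_per_side):
            tiles.append(board_img[row * side:(row + 1) * side,
                                   col * side:(col + 1) * side])
    return tiles

# e.g. a 296×296 warped board yields 64 tiles of 37×37 px
board = np.zeros((296, 296), dtype=np.uint8)
tiles = split_board(board)
```

Each tile is then fed to the classifier independently, so the network never needs to see the whole board at once.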
We then attempted to train the network on photos of chess pieces on the board. However, we initially did not have enough training data, so we manually sorted a relatively large dataset (1,200 images) and increased the number of epochs until we reached high accuracy. The board position is then sent to a chess engine to find the best next move, and an arrow is displayed on the chessboard.
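A small convolutional network of roughly this shape, sketched in Keras, is enough for 37 px tiles. The layer sizes and the 13-class output (6 white pieces, 6 black pieces, empty square) below are illustrative assumptions, not our exact architecture:

```python
import tensorflow as tf

# Small CNN for classifying 37×37 greyscale board tiles.
# Layer sizes are illustrative, not the exact architecture used.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(37, 37, 1)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(13, activation="softmax"),  # 13 classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(x_train, y_train, epochs=...)  # more epochs once data is sorted
```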
Challenges we ran into
Integration was an issue since each program had different input and output types. For example, the second program output a list of integers representing the pieces (e.g. white bishop = 5), which then had to be converted into a board state to be passed to the chess engine. This was difficult, but good teamwork let us combine our code smoothly.
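The integer-list-to-board-state conversion looks roughly like this. Aside from white bishop = 5, the mapping below is a hypothetical stand-in for our actual encoding, and the castling/en-passant fields of the FEN string are stubbed out:

```python
# Hypothetical integer-to-piece mapping; the write-up only documents
# white bishop = 5, the rest is illustrative.
PIECES = {0: "", 1: "P", 2: "N", 3: "R", 4: "Q", 5: "B", 6: "K",
          7: "p", 8: "n", 9: "r", 10: "q", 11: "b", 12: "k"}

def board_to_fen(board, white_to_move=True):
    """Convert a 64-element list (rank 8 first, file a first) into a
    FEN string the chess engine can load."""
    ranks = []
    for r in range(8):
        rank, empties = "", 0
        for c in range(8):
            piece = PIECES[board[r * 8 + c]]
            if piece == "":
                empties += 1
            else:
                if empties:
                    rank += str(empties)
                    empties = 0
                rank += piece
        if empties:
            rank += str(empties)
        ranks.append(rank)
    side = " w" if white_to_move else " b"
    return "/".join(ranks) + side + " - - 0 1"  # castling/en passant stubbed
```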
A major issue we faced on the second day was that pieces were not being correctly identified due to the small amount of training data (8 images per piece). This gave a per-piece accuracy of 70%, but the chess engine requires valid board states to run, so we needed to push this close to 100%. We produced more training data by partially automating the sorting: typing a single letter representing the piece filed each image automatically. Once the model became more accurate, we automated this even further by using the existing model to propose an initial classification, only requiring human input to check and correct it. We also increased the training time; because the training data closely matches the real board, overfitting was not a significant concern.
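The keyboard-driven sorter can be sketched like this; the class keys, folder layout, and the empty-key-accepts-suggestion convention are all assumptions for illustration:

```python
import shutil
from pathlib import Path

# One key per class; an empty key press accepts the model's suggestion.
KEYS = {"p": "pawn", "n": "knight", "b": "bishop",
        "r": "rook", "q": "queen", "k": "king", "e": "empty"}

def sort_image(img_path, key, out_dir="dataset", suggestion=None):
    """File one tile image into its class folder. If the model's
    suggested label is accepted (empty key), use it; otherwise use
    the label for the typed key."""
    label = suggestion if (key == "" and suggestion) else KEYS[key]
    dest = Path(out_dir) / label
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / Path(img_path).name
    shutil.move(str(img_path), str(target))
    return target
```

In the later, model-assisted version, `suggestion` comes from running the current classifier on the tile, so most images only need a single confirming key press.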
Accomplishments that we're proud of
We gained many new skills, such as machine learning with TensorFlow and producing large training and testing datasets efficiently. In addition, we learnt how to use computer vision with ArUco fiducial markers and how to split the board into squares that the network can process. We are very proud that we produced a functional prototype as well as a final product with all the features we wanted, such as voice output of the move being made, in such a short amount of time. We are also proud of how we worked as a team, building different parts of the project and then successfully integrating them; this gave us a highly efficient workflow and allowed us to complete the project in a short time frame.
What we learned
We learnt how to break a complex problem down into discrete steps that each person could code independently, allowing for efficient development. We also agreed early on how we would format our data so we could quickly integrate the separate subsystems. Finally, we learnt how to use computer vision to produce a normalised chessboard image; without this, the program would not be possible, since warped squares could not easily be sorted into 64 positions, which would have made many other parts of the program significantly harder, if not impossible.
What's next for chARvis
We want to expand into producing AR glasses with a more discreet camera. Since this requires some processing power, the glasses will most likely connect to a phone for processing. The glasses will display the arrow as currently shown on the computer screen, and the move will be spoken through integrated headphones instead of out loud.