
Introduction

We still want to build a Minesweeper solver, but after reading several papers we decided to change our benchmark. We will cite those papers as related work later. Most of our training and testing is done on 4x4, 5x5, and 6x6 grids, where we track the winning rate against the training iteration. We have seen some winning rates above 90%. Our 32x32 grids might be too large for an agent to learn anything, but we will still try them.

We would also like to give a brief introduction to Minesweeper. More importantly, deciding whether a partially revealed board is consistent with at least one mine configuration is known to be NP-complete, and counting all consistent configurations is #P-complete. Our solver tries to predict the covered tile of the grid that has the lowest probability of being a mine when uncovered. We are designing a policy network and a value network with CNN layers, and we will also try a purely convolutional network as a baseline for prediction.
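To make the prediction target concrete, here is a minimal brute-force sketch of "the tile least likely to be a mine" on a tiny board. It is not our solver, only an illustration: the board encoding (an integer for a revealed count, `None` for a covered tile), the function names, and the assumption that the total number of mines is known are all hypothetical choices made for this example. It enumerates every placement of mines, which is exactly why it only works on very small grids.

```python
from itertools import combinations

def neighbors(r, c, rows, cols):
    """Yield the in-bounds cells adjacent to (r, c), diagonals included."""
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr or dc) and 0 <= r + dr < rows and 0 <= c + dc < cols:
                yield r + dr, c + dc

def mine_probabilities(board, n_mines):
    """Map each covered cell to the fraction of consistent layouts that mine it.

    Assumed encoding: board[r][c] is an int (revealed neighbor count)
    or None (covered). n_mines is the assumed-known total mine count.
    """
    rows, cols = len(board), len(board[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    covered = [cell for cell in cells if board[cell[0]][cell[1]] is None]
    revealed = [cell for cell in cells if board[cell[0]][cell[1]] is not None]
    counts = {cell: 0 for cell in covered}
    total = 0
    # Exhaustive enumeration: exponential in the number of covered cells.
    for mines in combinations(covered, n_mines):
        mine_set = set(mines)
        consistent = all(
            sum(n in mine_set for n in neighbors(r, c, rows, cols)) == board[r][c]
            for r, c in revealed
        )
        if consistent:
            total += 1
            for cell in mines:
                counts[cell] += 1
    if total == 0:
        raise ValueError("no mine configuration is consistent with this board")
    return {cell: counts[cell] / total for cell in covered}

def safest_tile(board, n_mines):
    """Return the covered cell with the lowest mine probability."""
    probs = mine_probabilities(board, n_mines)
    return min(probs, key=probs.get)
```

For example, on the 2x3 board `[[1, None, None], [None, None, None]]` with one mine, the three tiles next to the revealed `1` each get probability 1/3 and the two tiles in the last column get 0. A learned network is meant to approximate this kind of ranking on boards where enumeration is no longer feasible.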

Challenges

We are reading papers to make sure our idea has not already been implemented by others. This is not difficult, but it is somewhat time-consuming.

Insights

We are still designing the model. We plan to use residual connections and batch normalization layers, but we may need to change the design if we find that someone has already implemented such a model.

Plan

We are currently reading papers, which may take several more days. We plan to set up the game environment soon, and we expect both tasks to finish at about the same time. We may then spend more time tuning our model. Since we just changed our benchmark, our goal for now is simply to achieve a higher winning rate rather than a high score.
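As a rough idea of what the game environment and the winning-rate measurement could look like, here is a small sketch. Everything in it is an assumption for illustration: the `Minesweeper` class, the `win_rate` helper, and the `policy(game, rng)` signature are hypothetical names, and the rules are simplified (no flood fill on zero tiles, no guaranteed-safe first click). A uniformly random policy stands in for the trained agent.

```python
import random

class Minesweeper:
    """Simplified square board: uncover tiles until a mine is hit or none are left."""

    def __init__(self, size, n_mines, rng):
        self.size = size
        cells = [(r, c) for r in range(size) for c in range(size)]
        self.mines = set(rng.sample(cells, n_mines))
        self.revealed = set()

    def covered(self):
        """List the tiles that have not been uncovered yet, in row-major order."""
        return [(r, c) for r in range(self.size) for c in range(self.size)
                if (r, c) not in self.revealed]

    def count(self, r, c):
        """Number of mines adjacent to (r, c); what an agent would observe."""
        return sum((r + dr, c + dc) in self.mines
                   for dr in (-1, 0, 1) for dc in (-1, 0, 1) if dr or dc)

    def step(self, r, c):
        """Uncover (r, c) and return 'lost', 'won', or 'ongoing'."""
        if (r, c) in self.mines:
            return "lost"
        self.revealed.add((r, c))
        if len(self.revealed) == self.size ** 2 - len(self.mines):
            return "won"
        return "ongoing"

def random_policy(game, rng):
    """Placeholder agent: pick any covered tile uniformly at random."""
    return rng.choice(game.covered())

def win_rate(policy, size, n_mines, n_games, seed=0):
    """Play n_games with the given policy and return the fraction won."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_games):
        game = Minesweeper(size, n_mines, rng)
        status = "ongoing"
        while status == "ongoing":
            status = game.step(*policy(game, rng))
        wins += status == "won"
    return wins / n_games
```

Calling `win_rate(random_policy, size=4, n_mines=3, n_games=1000)` gives a random-play reference point for a 4x4 grid; evaluating a trained agent at regular checkpoints with the same function would produce the winning rate versus training iteration curve.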
