limestone

The training graphs for the network
one of the few correctly estimated points :(

Inspiration

Osu! players often use drawing tablets instead of a normal mouse and keyboard setup because a tablet gives more precision than a mouse can provide. These tablets also provide a better way to input to devices. Overuse of conventional keyboards and mice can lead to carpal tunnel syndrome, and can be difficult to use for those that have specific disabilities. Tablet pens can provide an alternate form of HID, and have better ergonomics reducing the risk of carpal tunnel. Digital artists usually draw on these digital input tablets, as mice do not provide the control over the input as needed by artists. However, tablets can often come at a high cost of entry, and are not easy to bring around.

What it does

Limestone is an alternate form of tablet input, allowing you to input using a normal pen and using computer vision for the rest. That way, you can use any flat surface as your tablet

How we built it

Limestone is built on top of the neural network library mediapipe from google. mediapipe hands provide a pretrained network that returns the 3D position of 21 joints in any hands detected in a photo. This provides a lot of useful data, which we could probably use to find the direction each finger points in and derive the endpoint of the pen in the photo. To safe myself some work, I created a second neural network that takes in the joint data from mediapipe and derive the 2D endpoint of the pen. This second network is extremely simple, since all the complex image processing has already been done. I used 2 1D convolutional layers and 4 hidden dense layers for this second network. I was only able to create about 40 entries in a dataset after some experimentation with the formatting, but I found a way to generate fairly accurate datasets with some work.

Dataset Creation

I created a small python script that marks small dots on your screen for accurate spacing. I could the place my pen on the dot, take a photo, and enter in the coordinate of the point as the label.

Challenges we ran into

It took a while to tune the hyperparameters of the network. Fortunately, due to the small size it was not too hard to get it into a configuration that could train and improve. However, it doesn't perform as well as I would like it to but due to time constraints I couldn't experiment further. The mean average error loss of the final model trained for 1000 epochs was around 0.0015 Unfortunately, the model was very overtrained. The dataset was o where near large enough. Adding noise probably could have helped to reduce overtraining, but I doubt by much. There just wasn't anywhere enough data, but the framework is there.

Whats Next

If this project is to be continued, the model architecture would have to be tuned much more, and the dataset expanded to at least a few hundred entries. Adding noise would also definitely help with the variance of the dataset. There is still a lot of work to be done on limestone, but the current code at least provides some structure and a proof of concept.