Limited labelled data was a continual challenge for friends and colleagues researching and training machine learning models. It either limited their models' accuracy or stalled their progress while the data was labelled manually.
We also observed people on public transit playing monotonous games for extended periods of time.
If this data has to be labelled manually, we thought it would be faster and more fun to turn the labelling into a mobile game.
What it does
It currently handles data sets that require image classification. Unlabelled images are loaded onto a GCP server. When the Android game starts, it retrieves the images from the server and compiles them into a swiping game. Players face a continuous stream of images that must be swiped into particular regions of the screen. Each region corresponds to one of a predefined set of labels, and the labels are returned to the server after each round of images.
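To illustrate the swipe-to-label step, here is a minimal sketch in Python. The actual game runs on Android, so this is for illustration only and is not the app's code. The label set, the region names, the `classify_swipe` and `build_round_payload` helpers, and the JSON payload shape are all assumptions made for this example.

```python
import json

# Hypothetical regions and labels; the real game's label set and layout may differ.
REGION_LABELS = {"left": "cat", "right": "dog", "up": "neither"}


def classify_swipe(dx, dy):
    """Map a swipe displacement (in pixels) to a region name, or None.

    Screen coordinates are assumed to grow downward, so an upward swipe has dy < 0.
    """
    if dx == 0 and dy == 0:
        return None  # no movement, no label
    if abs(dx) >= abs(dy):
        return "left" if dx < 0 else "right"
    # Downward swipes are not mapped to any label in this sketch.
    return "up" if dy < 0 else None


def build_round_payload(player_id, swipes):
    """Collect one round of swipes into the JSON body returned to the server.

    swipes is a list of (image_id, dx, dy) tuples.
    """
    labels = []
    for image_id, dx, dy in swipes:
        region = classify_swipe(dx, dy)
        if region is not None:
            labels.append({"image_id": image_id, "label": REGION_LABELS[region]})
    return json.dumps({"player_id": player_id, "labels": labels})
```

Sending the labels once per round, as described above, keeps the number of requests low compared with one request per swipe.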
Post processing
After the labels are received, they go through a filtration process in which the label with the most player votes is chosen for each image. Players who labelled the image differently are penalized by lowering their "ranks". There is also an experimental unsupervised convolutional neural network algorithm that clusters images by visual similarity; the clusters can then be used to estimate the probability that a label is accurate.
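The vote filtration and rank penalty could be sketched roughly as follows. This is an illustrative sketch, not the project's actual script: the function name `aggregate_votes`, the fixed `penalty` of one rank point, and the choice to skip tied images are all assumptions.

```python
from collections import Counter, defaultdict


def aggregate_votes(votes, ranks, penalty=1):
    """Pick the most-voted label per image and penalize dissenting players.

    votes: iterable of (player_id, image_id, label) tuples.
    ranks: dict mapping player_id to a numeric rank.
    Returns (final_labels, new_ranks); the input ranks dict is not modified.
    """
    by_image = defaultdict(list)
    for player_id, image_id, label in votes:
        by_image[image_id].append((player_id, label))

    final_labels = {}
    new_ranks = dict(ranks)
    for image_id, entries in by_image.items():
        counts = Counter(label for _, label in entries)
        top = counts.most_common(2)
        # Assumption: when the top two labels are tied, leave the image undecided
        # and penalize nobody.
        if len(top) > 1 and top[0][1] == top[1][1]:
            continue
        winner = top[0][0]
        final_labels[image_id] = winner
        for player_id, label in entries:
            if label != winner:
                new_ranks[player_id] = new_ranks.get(player_id, 0) - penalty
    return final_labels, new_ranks
```

For example, if two players label an image "cat" and one labels it "dog", the image is recorded as "cat" and only the third player's rank drops.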
How we built it
We used Google Cloud Platform to spin up a Debian development instance and configured Flask on it to handle the databases and images and to integrate the post-processing scripts. The mobile swiping game was developed in Android Studio, and the post-processing of submitted labels was written in Python.
Challenges we ran into
Figuring out where to start, and where to even do the development. Mentors guided us toward a GCP VM instance.
What we learned
How to delegate tasks according to team members' capabilities and learning interests. Coordinating backend and frontend integration. Planning a development workflow and working with conflicting capabilities and limitations.
What's next for labelLOH
Handling other data sets