Unlabeled data is hard for people to classify with existing tools. Yet labels are critical for the machine learning and data science that answer questions we otherwise can’t. Without human input, these algorithms are inaccurate. So we are turning data classification into a game.

Until now, labeling such data has been time-consuming. Our technology enables anyone, anywhere in the world, to help train the models that will make the world of tomorrow a better place.

What it does

While our platform was designed for general use, we developed it with an eye on emerging problems such as misinformation and access to reliable sources during times of crisis. The current COVID-19 pandemic has shown the need to rapidly bring together public health experts, policy-makers, and a general public that is spread out and isolated, in order to vet information. Our platform gives everyone a voice, and a chance to push back against the darkness in which this crisis grew.

Beyond labeling data, our platform can also be used to improve machine learning algorithms. Human intuition is often more powerful than even state-of-the-art classification algorithms. Results from our platform can be used to re-train these algorithms, thereby improving their performance.
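As a rough illustration of how player answers might be turned into training labels, here is a minimal sketch that takes a majority vote per item. The function name `aggregate_labels`, the `min_votes` threshold, and the label strings are assumptions made for this example, not part of our actual codebase.

```python
from collections import Counter, defaultdict


def aggregate_labels(votes, min_votes=2):
    """Combine player votes into one consensus label per item.

    votes: iterable of (item_id, label) pairs, one per player answer.
    Returns {item_id: (label, agreement)} where agreement is the share
    of votes that went to the winning label. Items with fewer than
    min_votes answers are skipped (hypothetical threshold).
    """
    by_item = defaultdict(list)
    for item_id, label in votes:
        by_item[item_id].append(label)

    consensus = {}
    for item_id, labels in by_item.items():
        if len(labels) < min_votes:
            continue  # not enough answers to trust a label yet
        # On a tie, most_common keeps the label that was seen first.
        label, count = Counter(labels).most_common(1)[0]
        consensus[item_id] = (label, count / len(labels))
    return consensus


if __name__ == "__main__":
    sample = [
        ("t1", "reliable"),
        ("t1", "reliable"),
        ("t1", "misleading"),
        ("t2", "misleading"),
    ]
    print(aggregate_labels(sample))
```

The agreement score could be used to keep only high-confidence labels when re-training a model.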

How we built it

We used Google App Engine to expose the ML model as an API. A Flask backend then compares the model's accuracy on a dataset with the user's intuitive accuracy on the same dataset. We used UiPath data scraping, with parameters from user input, to scrape platforms such as Twitter and Reddit.
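The comparison between the model and a player can be sketched in plain Python as follows. This assumes a reference set with known labels is available; the function names and the result keys are hypothetical, and in our setup logic like this would sit behind a Flask route rather than run as a script.

```python
def accuracy(predictions, truth):
    """Fraction of predictions that match the reference labels."""
    if len(predictions) != len(truth):
        raise ValueError("predictions and truth must have the same length")
    if not truth:
        raise ValueError("cannot score an empty dataset")
    correct = sum(p == t for p, t in zip(predictions, truth))
    return correct / len(truth)


def compare_model_and_user(model_preds, user_preds, truth):
    """Score the model and the player on the same labeled items."""
    model_acc = accuracy(model_preds, truth)
    user_acc = accuracy(user_preds, truth)
    return {
        "model_accuracy": model_acc,
        "user_accuracy": user_acc,
        "user_beats_model": user_acc > model_acc,
    }


if __name__ == "__main__":
    truth = [1, 0, 1, 1]
    model_preds = [1, 0, 0, 0]
    user_preds = [1, 0, 1, 0]
    print(compare_model_and_user(model_preds, user_preds, truth))
```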

Challenges we ran into

We had a tough time integrating the ML model with our web app. We had to use a Google Cloud Storage bucket and App Engine to generate an API for the model. We also used MongoDB to store the results.

Accomplishments that we are proud of

We were able to build sophisticated software that uses virtual machines and Google Cloud Platform to run machine learning models while simultaneously scraping data with UiPath on a virtual machine.

What we have learned

We learned a lot about Google Cloud Platform, the Twilio API, UiPath, and how to deploy machine learning models in web apps.

What's next for Datafighter

We plan to add more scraping sources, such as websites, articles, and news outlets.
