Unlabeled data is hard for people to classify with existing tools. Yet labels are critical: machine learning and data science depend on them to answer questions that we otherwise can’t. Without human input, these algorithms are often inaccurate. So we are turning data classification into a game.
Until now, labeling such data has been time-consuming. Our technology enables anyone, anywhere in the world, to help train the models that will make the world of tomorrow a better place.
What it does
While our platform was designed for general use, we developed it with an eye on emerging problems such as misinformation and access to reliable sources during times of crisis. The current COVID-19 pandemic has shown the need to rapidly bring together public health experts, policy-makers, and a general public that is spread out and isolated, so that they can vet information. Our platform gives everyone a voice, and a chance to push back against the darkness in which this crisis grew.
Besides labeling data, our platform can also be used to improve machine learning algorithms. Human intuition can often outperform state-of-the-art classification algorithms. Results from our platform can be used to re-train these algorithms, thereby improving their performance.
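One way results from many players could be turned into training labels is a simple majority vote per item. The sketch below is illustrative only and is not the project's actual pipeline: the function name `aggregate_labels`, the `min_agreement` threshold, and the sample votes are all assumptions made for this example.

```python
from collections import Counter


def aggregate_labels(votes_by_item, min_agreement=0.6):
    """Collapse several player votes per item into one label.

    Hypothetical helper: an item is kept only when the most common
    label reaches the (assumed) agreement threshold.
    """
    aggregated = {}
    for item_id, votes in votes_by_item.items():
        if not votes:
            continue  # nothing to aggregate for this item
        label, count = Counter(votes).most_common(1)[0]
        if count / len(votes) >= min_agreement:
            aggregated[item_id] = label
    return aggregated


# Made-up votes, for illustration only.
votes = {
    "post-1": ["reliable", "reliable", "misleading"],
    "post-2": ["misleading", "reliable"],
    "post-3": ["misleading", "misleading", "misleading"],
}
training_labels = aggregate_labels(votes)
```

Items on which players disagree too much (here `post-2`) are left out rather than given a guessed label, so only labels with reasonable agreement would feed back into re-training.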
How we built it
We used UiPath data scraping, driven by parameters from user input, to scrape platforms such as Twitter and Reddit. We used Google App Engine to expose the ML model through an API. Using a Flask backend, we compared the model's accuracy on a dataset with the user's intuitive accuracy on the same dataset.
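The comparison between the model and a player can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the code that runs in our Flask backend: it assumes reference labels are available for the dataset, and the names `accuracy` and `compare_model_to_user`, along with the sample labels, are hypothetical.

```python
from typing import Dict, Sequence


def accuracy(predicted: Sequence[str], truth: Sequence[str]) -> float:
    """Fraction of items whose predicted label matches the reference label."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    if not truth:
        raise ValueError("cannot score an empty dataset")
    matches = sum(p == t for p, t in zip(predicted, truth))
    return matches / len(truth)


def compare_model_to_user(
    model_labels: Sequence[str],
    user_labels: Sequence[str],
    truth: Sequence[str],
) -> Dict[str, object]:
    """Score the model and the player on the same dataset (hypothetical helper)."""
    model_acc = accuracy(model_labels, truth)
    user_acc = accuracy(user_labels, truth)
    return {
        "model_accuracy": model_acc,
        "user_accuracy": user_acc,
        "user_beats_model": user_acc > model_acc,
    }


# Made-up labels, for illustration only.
truth = ["reliable", "misleading", "reliable", "misleading"]
model_labels = ["reliable", "reliable", "reliable", "misleading"]
user_labels = ["reliable", "misleading", "reliable", "misleading"]
result = compare_model_to_user(model_labels, user_labels, truth)
```

A dictionary like `result` could be returned as JSON from a backend route, which is one way the game front end might show players how they fared against the model.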
Challenges we ran into
We had a tough time integrating the ML model with our web app. We had to use a Google Cloud Storage bucket and App Engine to generate an API for the model. We also used MongoDB to store the results.
Accomplishments that we're proud of
We successfully scraped the data using UiPath and built sophisticated software that uses virtual machines and Google Cloud Platform to run machine learning models and download the generated spreadsheets, all while letting users label data by playing a game.
What we learned
We learned a lot about Google Cloud Platform, MongoDB, UiPath, and how to deploy machine learning models in web apps.
What's next for DataFighter
We plan to add more scraping sources, such as websites, articles, and news outlets.
Work in this hackathon
We started ideating and thinking about the hack a few days before the hackathon. During the hackathon, we built the front end that gamifies our idea, hosted the project on a domain from domain.com, and built the entire backend, including the integration of UiPath into our hack.