Training machine learning models takes a lot of data. Unfortunately, the datasets that are available are usually 1) too hard for ordinary people to find, 2) encumbered by copyright restrictions, or 3) plain inaccurate.

What it does

crowdBit crowdsources data from ordinary users. The process starts with data scientists, who submit a request for data on a particular topic. Users see the request and upload files of any type to crowdBit, which runs each file through a trained machine learning model to estimate its accuracy. Inaccurate files are discarded. Somewhat accurate files are sent to other users for additional verification. Files that score a very high accuracy are automatically accepted into the dataset. Once the dataset grows large enough, the data scientist can download it for their work.
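The three-way routing described above can be sketched roughly as follows. This is only an illustration, not crowdBit's actual code: the `triage` function, the `Decision` names, and both threshold values are hypothetical, since the write-up does not say where the cutoffs sit or how the model reports its score. The sketch assumes the model returns a confidence score between 0 and 1.

```python
from enum import Enum


class Decision(Enum):
    DISCARD = "discard"          # inaccurate: drop the file
    PEER_REVIEW = "peer_review"  # somewhat accurate: ask other users to verify
    ACCEPT = "accept"            # very high accuracy: add to the dataset


# Hypothetical cutoffs; the real values are not stated in the write-up.
REVIEW_THRESHOLD = 0.5
ACCEPT_THRESHOLD = 0.9


def triage(score: float) -> Decision:
    """Map a model confidence score in [0, 1] to one of the three outcomes."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    if score >= ACCEPT_THRESHOLD:
        return Decision.ACCEPT
    if score >= REVIEW_THRESHOLD:
        return Decision.PEER_REVIEW
    return Decision.DISCARD
```

With these assumed cutoffs, `triage(0.95)` returns `Decision.ACCEPT`, `triage(0.7)` returns `Decision.PEER_REVIEW`, and `triage(0.1)` returns `Decision.DISCARD`. Keeping the thresholds as named constants makes it easy to tune how much work is pushed to human reviewers.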

How I built it

The front end is an Android app built in Android Studio, with Firebase for user authentication. The server side is Node.js with Express. Clarifai handles the machine learning file verification, and MongoDB Stitch and Atlas store user files.

Challenges I ran into

  • Outdated documentation
  • Software bugs
  • Sleepiness

What's next for crowdBit

  • Adding a rating system for users to improve accuracy
  • Expanding the server's capabilities so it becomes more than just an API endpoint
  • Making the front-end experience more interactive and accessible through mobile applications
