Intellidata grew out of a pain point we kept running into as startup founders: access to clean, affordable, and usable datasets. Our research showed the same pattern again and again. Whether it was a founder trying to test a prototype or a data scientist swamped by missing fields, the data was wrong, too expensive, or missing altogether. Inspired by our own experience and the growing demand for a fix for incomplete data, we decided to build a platform that not only fills in missing data but also generates synthetic datasets tailored to the needs of a specific industry. The opportunity we saw was a platform where innovators, whether working with a small budget or a large one, can access high-quality data to drive their ideas.

Building Intellidata was tough but rewarding. We used Python as our main language, Django for the backend, and Streamlit for the user interface. For the AI and data generation engine we used PyTorch and scikit-learn, with PostgreSQL as our database and our infrastructure running on AWS. Balancing the accuracy of the synthetic data against the efficiency of generating it was a major challenge, on top of making sure the output was useful across different fields such as healthcare, education, and finance. The other big challenge was creating an interface that is easy to use even for people with average technical knowledge who need accurate data quickly.
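To make the two core steps concrete (filling in missing fields, then generating synthetic rows), here is a minimal sketch using scikit-learn and NumPy. It is not Intellidata's actual engine: the function name `impute_and_synthesize`, the toy age/income table, and the use of mean imputation plus a fitted Gaussian in place of a trained PyTorch model are all assumptions made for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer


def impute_and_synthesize(data, n_synthetic, seed=0):
    """Fill missing values, then sample synthetic rows from a fitted Gaussian.

    Hypothetical helper for illustration only.
    """
    # Replace each NaN with the mean of its column.
    imputer = SimpleImputer(strategy="mean")
    completed = imputer.fit_transform(data)

    # Fit a multivariate normal to the completed table. This is a deliberately
    # simple stand-in for a learned generative model.
    mean = completed.mean(axis=0)
    cov = np.cov(completed, rowvar=False)

    # Draw new rows that follow the same means and correlations.
    rng = np.random.default_rng(seed)
    synthetic = rng.multivariate_normal(mean, cov, size=n_synthetic)
    return completed, synthetic


# Hypothetical toy table: columns are age and income, with two missing fields.
raw = np.array([
    [25.0, 40000.0],
    [32.0, np.nan],
    [np.nan, 52000.0],
    [41.0, 61000.0],
])
completed, synthetic = impute_and_synthesize(raw, n_synthetic=5)
```

A sketch like this also shows the accuracy versus efficiency trade-off in miniature: a Gaussian is cheap to fit and sample, but it only captures means and linear correlations, which is why a real system would likely need a more expressive model.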

We are excited about how our solution will help businesses build MVPs quickly and test AI models without relying on expensive proprietary datasets. Our experience taught us that good algorithms alone are not enough to generate good data. It is just as important to understand the real-world problems users face and how they will use the data. Going forward, we plan to give users more customization options and make our generation models smarter. Our vision is to provide the most usable tool for ethical, realistic, and user-friendly synthetic data, so that more people can bring their ideas to life.

Built With

  • amazon-web-services (ec2 for compute, rds for database, s3 for storage)
  • custom-rest-apis
  • django
  • django-rest-framework
  • github (version control)
  • github-actions (ci/cd)
  • postgresql
  • python
  • pytorch
  • scikit-learn
  • streamlit