Datafy is a website that provides web-scraped data
Effortlessly download CSV and JSON data for companies listed in the Fortune 500


Ever browsed websites such as LinkedIn, Fortune500, or Forbes and wished you could get all the data on the site in an easy-to-use format, preferably CSV or JSON, and then run your own code or statistical analysis on it?

The answer is yes for many data scientists and business analysts, who often spend their time finding suitable data and converting it into a usable format before they can apply data visualization and other analytics.

Datafy began when my internship company challenged me to collect data on the top retail companies in the world. After a day of googling, I decided to use the Fortune500 website as my data source. But why stop at one domain when the same approach could be applied to many others? And that's how it all started.

Visit the website

GITHUB code repository

Demo video

What it does

Datafy is a website that provides web-scraped data: effortlessly download CSV and JSON data for companies listed in the Fortune 500.

Datafy for:

  • Developers: The CSV and JSON formats make the data convenient to handle and to use with various data science tools and PowerBI, which keeps it developer-friendly

  • Procurement & Spend Analytics: Datafy lets businesses discover suitable vendors or clients that can help their company reach the right target audience and be profitable

  • Investors: Datafy shows revenue change, profit rate, and other historical statistics of companies, which helps investors and stockholders gain sound insight from the data

  • Seekers: Datafy provides records of the prestigious organizations listed in the Fortune 500, where many students and job seekers aspire to work

How I built it

  • Web Scraping: I used Selenium, ChromeDriver, and Python to scrape the data from the Fortune500 website.

GitHub link to web scraping tool

  • Database: The scraped data is converted into JSON and CSV formats and stored in Firebase Cloud Storage

  • Website: I developed a beautiful web application (best viewed on a laptop or desktop) that sends a request to download the data from Firebase Cloud Storage and saves it on the user's local system.

Check out the website
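
As a rough illustration of the conversion step described above, here is a minimal sketch that turns a list of scraped records into CSV and JSON text using only the Python standard library. The field names (rank, company, revenue_musd) and the sample rows are hypothetical placeholders, not the actual columns Datafy collects, and the Selenium scraping and Firebase upload steps are left out.

```python
import csv
import io
import json

# Hypothetical sample records; the real field names depend on what the scraper collects.
records = [
    {"rank": 1, "company": "Example Corp", "revenue_musd": 1000.0},
    {"rank": 2, "company": "Sample, Inc.", "revenue_musd": 750.5},
]


def to_json(rows):
    """Serialize the records as a pretty-printed JSON array."""
    return json.dumps(rows, indent=2)


def to_csv(rows):
    """Serialize the records as CSV, using the keys of the first record as the header."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv(records)
json_text = to_json(records)
```

The csv module quotes values that contain commas (such as the second company name here), so the output should load cleanly into spreadsheet and analytics tools. In a setup like the one described, the two strings would then be uploaded to cloud storage as files.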

Challenges I ran into

  • Deployment: Heroku does not allow any request that lasts longer than 30 seconds, and Selenium web scraping is a time-consuming process, so I needed to switch to Firebase as my server and send a download request to the database rather than generating the data at request time

  • UI: The UI is not very compatible with mobile screen resolutions

  • Download: Creating a download request in JSON and CSV format caused an internal server error once the app was hosted
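
The deployment and download points above come down to serving files that were generated ahead of time instead of scraping during the request. A minimal sketch of that idea follows, assuming a hypothetical in-memory dictionary in place of the real cloud storage bucket. The function name build_download, the dataset name, and the file contents are illustrative assumptions, not Datafy's actual code.

```python
# Hypothetical stand-in for files produced earlier by an offline scraping job.
PREGENERATED = {
    "fortune500.csv": b"rank,company\n1,Example Corp\n",
    "fortune500.json": b'[{"rank": 1, "company": "Example Corp"}]',
}

# Content types for the two supported download formats.
CONTENT_TYPES = {"csv": "text/csv", "json": "application/json"}


def build_download(dataset, fmt):
    """Return (headers, body) for a pre-generated file; no scraping happens here."""
    if fmt not in CONTENT_TYPES:
        raise ValueError(f"unsupported format: {fmt}")
    filename = f"{dataset}.{fmt}"
    body = PREGENERATED.get(filename)
    if body is None:
        raise KeyError(f"no pre-generated file named {filename}")
    headers = {
        "Content-Type": CONTENT_TYPES[fmt],
        # Ask the browser to save the response as a file instead of displaying it.
        "Content-Disposition": f'attachment; filename="{filename}"',
        "Content-Length": str(len(body)),
    }
    return headers, body


headers, body = build_download("fortune500", "csv")
```

Because the handler only looks up bytes that already exist, its response time does not depend on how long the scrape took, which is what keeps it under a short request timeout.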

Accomplishments that I'm proud of

  • I learned to use Firebase Cloud Storage and how to send download and insert requests

  • Making a dark-themed UI that is compatible with most devices is a great achievement for me

  • Putting my web scraping skills to use to help other data scientists is a commendable achievement

What I learned

  • Flask Cloud storage

  • HTML, CSS scroll button feature and cards system

What's next for DATAFY

Currently, Datafy focuses on extracting data from the Fortune500 website, but LinkedIn lists more than 50 million companies, which I aim to extract information from and present to data scientists all over the world.
