Inspiration

We wanted a tool that could understand online communities on different platforms and surface the kind of insights that are usually hidden in comments, posts, and profiles. Every platform has different data, different signals, and different vibes, but nobody pulls it together in one place. So we built Identivibe to explore that idea.

What it does

Identivibe takes a social media profile from platforms like Instagram, Reddit, LinkedIn, or YouTube and automatically collects public data: comments, posts, profile details, and topics. It then organizes everything into a clean, structured JSON payload. From there, you can build analytics, dashboards, or AI features on top of it.
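To make the "structured JSON payload" idea concrete, here is a minimal sketch of what a normalized result might look like. The field names are illustrative assumptions, not Identivibe's actual schema:

```python
import json

# Illustrative shape of a normalized payload (field names are assumptions,
# not the project's real schema).
payload = {
    "platform": "reddit",
    "profile": {"username": "example_user", "followers": 1280},
    "posts": [
        {"id": "p1", "text": "First post", "topics": ["python"]},
    ],
    "comments": [
        {"post_id": "p1", "text": "Nice writeup!"},
    ],
}

print(json.dumps(payload, indent=2))
```

Because every platform lands in the same shape, downstream analytics code only has to understand one format.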

How we built it

We built a unified scraping system in Python with lazy-loaded platform modules. Reddit uses public JSON endpoints, LinkedIn and Instagram run on Apify actors, and YouTube uses the official Data API. We also added a storage layer that saves results as JSON documents in MongoDB so the data can be reused later for analysis or demos. We focused on keeping the output format consistent across platforms so it stays predictable.

Challenges we ran into

Every platform is different and has its own rate limits, formats, and rules. We ran into issues with API quotas, differences in how platforms structure comments and profiles, and cases where comments are disabled. We also struggled to organize everything cleanly so developers don’t need a token for every platform just to run one of them; lazy loading fixed that. Getting LinkedIn and Instagram working through Apify actors took some tinkering with input formats and dataset handling.

Accomplishments that we're proud of

We’re proud that we got full multi-platform scraping working end-to-end with one entry point. We also made the code clean enough that anyone can extend it and build features on top. The Instagram and YouTube pipelines were pretty complex, but we got them to return consistent, usable data. We’re also happy that the project wasn’t just about scraping: it actually structures data in a useful way.

What we learned

We learned how different social platforms handle data, rate limits, and access. We learned how to work with Apify actors, how to manage API keys, how to avoid crashing when comments are disabled, and how to make a modular Python system that doesn’t require everything to be installed at once. We also learned how useful it is to have standardized data when you want to run analysis or build ML tools.
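The standardization lesson can be sketched as a small per-platform normalizer that maps raw fields into one shared comment shape. The raw field names below are assumptions about what the platforms return, used only for illustration:

```python
def normalize_comment(platform: str, raw: dict) -> dict:
    """Map platform-specific comment fields into one shared shape.
    The raw keys below are illustrative assumptions, not exact API fields."""
    if platform == "reddit":
        return {"platform": platform,
                "author": raw.get("author"),
                "text": raw.get("body", "")}
    if platform == "youtube":
        return {"platform": platform,
                "author": raw.get("authorDisplayName"),
                "text": raw.get("textOriginal", "")}
    raise ValueError(f"Unknown platform: {platform}")
```

Once every comment passes through a function like this, sentiment analysis or clustering code never needs to know which platform the data came from.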

What's next for Identivibe

Next, we want to add more platforms (TikTok, Twitter/X), build a simple dashboard to visualize the data, and add features like sentiment analysis and clustering. We also want to make the setup more plug-and-play so non-developers can run it or feed it into AI tools directly. There’s a lot of room to explore how different communities behave online, and we think this is just the starting point.

Built With

Python, MongoDB, Apify actors, YouTube Data API
