Inspiration
We found that people expose a surprising amount of personal information through something as simple as their Instagram account, so we built a tool that helps them identify where they are leaking information before others do.
What it does
NoDoxx uses a 5-step process to thoroughly identify all public information associated with your Instagram account and assist you with removing all information that attackers could use against you.
1. Identity Cross-Reference: Your Instagram username is queried across more than 400 social media platforms and websites to locate other real public accounts that belong to you.
2. Geolocation Via Posts: All of your public Instagram posts are crawled and run through a geolocation algorithm that identifies the general location and coordinates by scraping image metadata, Instagram location tags, and public city landmarks.
3. Digital Footprint Locator: Your Instagram handle is converted into an estimated name, which is then searched with Google dorks across over 200 public data sources and APIs (including LinkedIn) to pinpoint your true identity, email address, and phone number.
4. Email Data Scraping: Your email is run through the IntelBase API, locating all public accounts created with your email, all online reviews left under your email, and all database breaches your email was involved in.
5. Digital Anonymity Restoration: NoDoxx provides step-by-step guides on how to remove or edit any posts and public information related to you that compromise your digital safety, and restores a clean digital footprint across both your Instagram and overall digital identity.
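As an illustration of the "estimated name" conversion in step 3, here is a minimal sketch of how a handle could be turned into a name guess. The function `estimate_name` and its splitting rules are our assumptions for this example, not the exact logic NoDoxx ships; a real handle often needs more context than this.

```python
import re


def estimate_name(handle: str) -> str:
    """Guess a display name from a handle (hypothetical helper, illustration only)."""
    # Split on common separators and digit runs: "jane.doe_92" -> ["jane", "doe", ""]
    parts = re.split(r"[._\-\d]+", handle)
    words = []
    for part in parts:
        # Also split camelCase boundaries: "JaneDoe" -> ["Jane", "Doe"]
        words.extend(re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", part))
    return " ".join(word.capitalize() for word in words)


print(estimate_name("jane.doe_92"))
print(estimate_name("JaneDoe"))
```

A guess like this is only a starting point for the search step; handles without separators (for example `jdoe`) stay as a single word.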
How we built it
Frontend: React 18 + TypeScript on Vite, styled with Tailwind in a Kali-terminal style. The 3D globe is custom-built in three.js, and the Obsidian-style knowledge web is rendered with react-force-graph-2d.
Backend: A FastAPI app on uvicorn that runs the entire audit in-memory. Findings stream to the dashboard over Server-Sent Events. We also implemented a cost tracker that tracks and caps LLM spend.
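To make the streaming and spend cap concrete, here is a small standard-library sketch. The names `CostTracker`, `BudgetExceeded`, and `format_sse`, as well as the per-million-token prices passed in, are assumptions for illustration and not our actual backend code; only the `event:`/`data:` frame layout comes from the Server-Sent Events format itself.

```python
import json


class BudgetExceeded(RuntimeError):
    """Raised when a charge would push total spend past the cap."""


class CostTracker:
    """Tracks cumulative LLM spend in USD and refuses charges over the cap."""

    def __init__(self, cap_usd: float) -> None:
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def charge(self, input_tokens: int, output_tokens: int,
               usd_per_mtok_in: float, usd_per_mtok_out: float) -> float:
        # Prices are per one million tokens (hypothetical values supplied by the caller).
        cost = (input_tokens * usd_per_mtok_in
                + output_tokens * usd_per_mtok_out) / 1_000_000
        if self.spent_usd + cost > self.cap_usd:
            raise BudgetExceeded(f"cap of ${self.cap_usd:.2f} would be exceeded")
        self.spent_usd += cost
        return cost


def format_sse(event: str, payload: dict) -> str:
    """Serialize one finding as a Server-Sent Events frame (blank line ends the frame)."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


tracker = CostTracker(cap_usd=1.00)
tracker.charge(100_000, 0, usd_per_mtok_in=3.0, usd_per_mtok_out=15.0)
print(format_sse("finding", {"step": 1}), end="")
```

Checking the cap before recording the charge means a rejected call leaves the running total untouched, so the audit can skip the LLM step and continue with the non-LLM signals.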
Reasoning & search: The LLM layer uses Anthropic Claude. Web search is powered by Serper.dev.
Identity sweep/Crawling: Instaloader pulls the public Instagram profile, then a Sherlock-style sweep probes 400+ platforms in parallel. Matches are scored with rapidfuzz, imagehash, and sentence-transformers.
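The scoring idea can be sketched with the standard library alone. This example swaps in `difflib.SequenceMatcher` for rapidfuzz and plain 64-bit integers for imagehash perceptual hashes, and it leaves out the sentence-transformers bio comparison entirely. The function names and the 0.6/0.4 weights are assumptions for illustration, not the values we tuned.

```python
import re
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two handles, ignoring case and punctuation."""
    def normalize(s: str) -> str:
        return re.sub(r"[^a-z0-9]", "", s.lower())
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def hash_similarity(hash_a: int, hash_b: int, bits: int = 64) -> float:
    """1.0 for identical perceptual hashes, 0.0 when every bit differs."""
    hamming = bin(hash_a ^ hash_b).count("1")
    return 1.0 - hamming / bits


def match_score(handle_a: str, handle_b: str, avatar_a: int, avatar_b: int,
                w_name: float = 0.6, w_avatar: float = 0.4) -> float:
    """Weighted blend of handle and avatar similarity (weights are hypothetical)."""
    return (w_name * name_similarity(handle_a, handle_b)
            + w_avatar * hash_similarity(avatar_a, avatar_b))


print(match_score("jane.doe", "Jane_Doe", 0x0F0F, 0x0F0F))
```

Blending independent signals keeps a common username from being treated as a match on its own: a candidate account needs a similar avatar as well as a similar handle to score highly.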
Digital footprint: Serper-powered Google dorks feed trafilatura for full-page extraction. Discovered emails are run through the IntelBase API and HaveIBeenPwned.
Geolocation: Every post image goes through Pillow EXIF, GeoCLIP (CLIP-L → GPS coords), and Claude vision for landmark recognition. An algorithm then clusters the signals onto the densest geographic region.
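One simple way to resolve many location signals to the densest region is a radius-based neighbor count, sketched below. This is an assumed stand-in, not necessarily the clustering we run: `densest_region`, the 25 km radius, and the sample coordinates are all illustrative.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(a: tuple, b: tuple) -> float:
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def densest_region(points: list, radius_km: float = 25.0) -> tuple:
    """Return ((lat, lon) centroid, member count) of the most crowded neighborhood."""
    if not points:
        raise ValueError("need at least one point")
    best = []
    for center in points:
        # Every signal within radius_km of this candidate center, itself included.
        members = [p for p in points if haversine_km(center, p) <= radius_km]
        if len(members) > len(best):
            best = members
    # A plain average is fine for a small cluster away from the antimeridian.
    lat = sum(p[0] for p in best) / len(best)
    lon = sum(p[1] for p in best) / len(best)
    return (lat, lon), len(best)


signals = [(48.8584, 2.2945), (48.8606, 2.3376), (48.8530, 2.3499), (40.7128, -74.0060)]
print(densest_region(signals))
```

With the sample signals, the three points in Paris outvote the single point in New York, so an occasional travel photo does not drag the estimate away from where most posts were taken. The pairwise loop is quadratic, which is acceptable for the number of posts on a single profile.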
What's next for NoDoxx
NoDoxx will implement an Instagram OAuth system immediately after the hackathon to ensure that nobody can use it to search for data on Instagram accounts they do not own. NoDoxx will also be hosted online, will implement multiple web security practices, and will be penetration tested against the OWASP Top 10 to ensure that attackers cannot steal session IDs or sensitive data while you are using the platform.
Built With
- api
- cybersecurity
- intelligence
- osint
- python
- webscraping