Inspiration
The Problem is Real: Every IT department faces the same nightmare—employees secretly sign up for unauthorized SaaS tools. You discover Slack integrations storing customer data, shadow expense accounts, rogue HR software with access to SSNs. No one in IT knows about them. No audit trail. No control.
One organization we researched had $1,500/month in duplicate tools silently draining budget—Asana + Monday.com + Notion all doing the same job in different teams. But the real pain? The security exposure: An unauthorized password manager with access to 200+ employee passwords. A random recruiting app with SSNs. A free analytics tool processing customer data.
IT managers waste 20+ hours monthly manually hunting through spreadsheets and expense reports. When they finally find something, they have no playbook—just hope that revoking access doesn't break critical workflows.
Shadow SaaS Detector solves this: Find every unauthorized tool in minutes. Score the risk. Get a remediation playbook. See the savings. Act with confidence.
What it does
Shadow SaaS Detector is an enterprise-grade shadow IT discovery platform that finds unauthorized SaaS applications and tells you exactly what to do about them.
How it works:
- Upload your organization's data (expense reports, browser history, employee roster)
- Detect all unauthorized SaaS apps in seconds using AI-powered keyword matching
- Score each app's risk: CRITICAL (immediate revocation), HIGH (review required), MEDIUM (consolidation candidate), LOW (monitor)
- Analyze compliance violations (GDPR/CCPA/SOC 2/HIPAA), identify duplicate tools, calculate savings
- Act using step-by-step playbooks, simulation tools, and audit logs
Core Features:
- 🔍 SaaS Detection Engine: Matches expenses + browser history against 500+ SaaS database
- 🤖 AI Risk Scoring: Google Gemini analyzes data access, security reputation, compliance risk (with rule-based fallback)
- 📊 Executive Dashboard: Real-time view of all shadow apps by risk level, department, category
- ⚠️ Threat Ticker: Live feed of critical risks as they're detected
- 🗺️ Attack Surface Map: Visualize which departments access risky tools and what data is exposed
- 💡 AI Insights: Get AI-powered analysis on risk, consolidation opportunities, and compliance violations
- 📋 Smart Consolidation: Identifies 3 identical tools where you could eliminate 2 (with ROI)
- 📜 Compliance Reports: Auto-generated GDPR/CCPA audit reports (exportable as HTML/PDF)
- 💰 Cost Simulator: Test impact of revoking apps before you do it
- 📋 Playbooks: Step-by-step remediation guides for each app
- 📄 Executive Brief: C-level one-pager with KPIs (shadow spend, critical risks, consolidation savings)
How we built it
Frontend (React 19 + Vite + TypeScript):
- Dashboard: Multi-tab interface (Dashboard, Threat Map, Simulator, AI Insights, Demo)
- Upload Wizard: Multi-step form for expenses, browser history, roster, Slack apps
- Interactive Visualizations: Recharts for cost breakdown, risk distribution, department heat maps
- Rich Components: Playbook modals, threat ticker, executive brief (with print support)
- 708 lines of intentional CSS: Dark theme, smooth animations, form styling
- Code: src/components/ (Dashboard.tsx, AIInsights.tsx, Simulator.tsx, etc.)
Backend (Express.js + TypeScript + Google Generative AI):
- Detection Engine (detector.ts): CSV parsing, SaaS database matching, browser history analysis
- AI Risk Scoring (ai-risk-scorer.ts): Multi-prompt Gemini integration with rule-based fallback
- AI Compliance (ai-compliance.ts): GDPR/CCPA/SOC 2/HIPAA violation detection
- AI Consolidation (ai-consolidator.ts): Identifies category duplicates, calculates savings
- Simulators (simulator.ts): Risk modeling, impact prediction
- Playbook Engine (playbook.ts): Generates step-by-step remediation guides + email drafts
- Audit Trails (playbook.ts): Logs all simulated revocations to JSON
- API Routes: /api/upload, /api/ai/risk-assessment, /api/ai/consolidation, /api/ai/compliance, /api/playbook, /api/simulate
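The consolidation step listed above (find category duplicates, then calculate savings) could be sketched roughly as follows. This is not the code in ai-consolidator.ts: the DetectedApp and ConsolidationPlan shapes, the planConsolidation name, and the "keep the tool with the most users" rule are all assumptions made for illustration.

```typescript
// Hypothetical shape of a detected app; field names are assumptions.
interface DetectedApp {
  name: string;
  category: string;
  monthlyCost: number; // USD per month
  users: number;
}

// Hypothetical output shape for one duplicated category.
interface ConsolidationPlan {
  category: string;
  keep: string;
  retire: string[];
  annualSavings: number; // USD per year if the retired tools are cancelled
}

function planConsolidation(apps: DetectedApp[]): ConsolidationPlan[] {
  // Group apps by category.
  const byCategory = new Map<string, DetectedApp[]>();
  for (const app of apps) {
    const group = byCategory.get(app.category) ?? [];
    group.push(app);
    byCategory.set(app.category, group);
  }

  const plans: ConsolidationPlan[] = [];
  byCategory.forEach((group, category) => {
    if (group.length < 2) return; // no duplicates in this category

    // Assumption: keep the most widely used tool, retire the rest.
    const sorted = [...group].sort((a, b) => b.users - a.users);
    const keep = sorted[0];
    const retire = sorted.slice(1);

    plans.push({
      category,
      keep: keep.name,
      retire: retire.map((a) => a.name),
      annualSavings: retire.reduce((sum, a) => sum + a.monthlyCost, 0) * 12,
    });
  });
  return plans;
}
```

A real implementation would likely weigh contract terms and feature overlap as well; user count is only a convenient tie-breaker for the sketch.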
Data Processing:
- SaaS Database: 500+ entries (saas_database.json) with keywords, risk levels, data permissions
- Expenses: CSV parsing with expense → SaaS matching
- Browser History: JSON parsing, domain extraction, SaaS correlation
- Slack Apps: Real Slack workspace integration (OAuth flow ready)
- Caching: In-memory cache for AI assessments (avoid API waste)
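One way the in-memory cache for AI assessments might look is a small memoizer keyed by app name. The cachedAsync helper below is a hypothetical sketch, not the project's code. It stores the pending promise, so concurrent requests for the same app share one API call, and it evicts failed calls so they can be retried.

```typescript
// Wraps an async function so each key is computed at most once per process.
// The helper name and generic shape are assumptions for illustration.
function cachedAsync<K, V>(fn: (key: K) => Promise<V>): (key: K) => Promise<V> {
  const cache = new Map<K, Promise<V>>();
  return (key: K): Promise<V> => {
    const hit = cache.get(key);
    if (hit !== undefined) return hit;

    // Cache the promise itself so in-flight requests are de-duplicated.
    const pending = fn(key).catch((err: unknown) => {
      cache.delete(key); // do not cache failures; allow a later retry
      throw err;
    });
    cache.set(key, pending);
    return pending;
  };
}
```

The cache lives only as long as the server process, which matches the in-memory approach described above.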
Testing & Quality:
- Unit Tests: Vitest covering detector, simulator, playbook logic
- E2E Tests: Playwright test suite for full user workflows
- Type Safety: Full-stack TypeScript, no any types in core logic
- Code Linting: ESLint configured for React + TypeScript
Challenges we ran into
Risk Scoring Without ML: Needed to score apps intelligently without training ML models. Solution: Multi-factor weighting (data access 40%, risk_level 30%, user attribution 20%, compliance violations 10%) as the rule-based baseline, with Gemini scoring used when the API is available.
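A minimal sketch of the weighted score, using the four weights stated above. Everything else is an assumption: the RiskFactors shape, the normalization of each factor to 0..1, and the tier cut-offs (80/60/40) are illustrative choices, not values taken from the project.

```typescript
// Each factor is assumed to be normalized to the range 0..1.
interface RiskFactors {
  dataAccess: number;
  riskLevel: number;
  userAttribution: number;
  complianceViolations: number;
}

type RiskTier = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";

// Weights from the write-up: 40% / 30% / 20% / 10%.
const WEIGHTS: RiskFactors = {
  dataAccess: 0.4,
  riskLevel: 0.3,
  userAttribution: 0.2,
  complianceViolations: 0.1,
};

const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

// Returns an integer score from 0 to 100.
function riskScore(f: RiskFactors): number {
  const raw =
    clamp01(f.dataAccess) * WEIGHTS.dataAccess +
    clamp01(f.riskLevel) * WEIGHTS.riskLevel +
    clamp01(f.userAttribution) * WEIGHTS.userAttribution +
    clamp01(f.complianceViolations) * WEIGHTS.complianceViolations;
  return Math.round(raw * 100);
}

// Tier thresholds are hypothetical cut-offs chosen for the example.
function riskTier(score: number): RiskTier {
  if (score >= 80) return "CRITICAL";
  if (score >= 60) return "HIGH";
  if (score >= 40) return "MEDIUM";
  return "LOW";
}
```

Keeping the weights in one constant makes the rule-based path easy to audit and tune without touching the scoring logic.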
Handling Unstructured Expense Data: Expense descriptions vary ("Slack $100", "Monthly subscription: Slack", "Slack Pro Team Plan"). Solution: Keyword-based matching against SaaS database keywords field (case-insensitive, substring match).
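The case-insensitive substring match described above can be sketched in a few lines. The SaasEntry shape and the matchExpense name are assumptions; the real saas_database.json entries also carry risk levels and data permissions, which are omitted here.

```typescript
// Simplified database entry; only the fields needed for matching.
interface SaasEntry {
  name: string;
  keywords: string[];
}

// Returns the first entry whose keyword appears anywhere in the description.
function matchExpense(description: string, db: SaasEntry[]): SaasEntry | undefined {
  const haystack = description.toLowerCase();
  return db.find((entry) =>
    entry.keywords.some((keyword) => {
      const needle = keyword.trim().toLowerCase();
      // Skip empty keywords, which would otherwise match every description.
      return needle !== "" && haystack.includes(needle);
    }),
  );
}

// Example with the three descriptions from the text.
const sampleDb: SaasEntry[] = [{ name: "Slack", keywords: ["slack"] }];
const sampleMatches = [
  "Slack $100",
  "Monthly subscription: Slack",
  "Slack Pro Team Plan",
].map((d) => matchExpense(d, sampleDb)?.name);
```

Plain substring matching is simple but can produce false positives (a keyword embedded in an unrelated word), so short keywords may need word-boundary checks.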
Browser History Correlation: Browser history can run to millions of URLs, so checking every entry against the SaaS database is expensive. Solution: Extract the domain, match it against known SaaS domains, and cache the result.
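A rough sketch of the extract, match, and cache approach. The two domain entries, the KNOWN_SAAS_DOMAINS map, and the lookupSaas name are placeholders for illustration; the project's actual lookup table and function names are not shown in this write-up.

```typescript
// Illustrative domain table; the real list would come from the SaaS database.
const KNOWN_SAAS_DOMAINS = new Map<string, string>([
  ["slack.com", "Slack"],
  ["notion.so", "Notion"],
]);

// Per-hostname cache: a string is a hit, null is a remembered miss.
const domainCache = new Map<string, string | null>();

function lookupSaas(rawUrl: string): string | null {
  let host: string;
  try {
    host = new URL(rawUrl).hostname.toLowerCase();
  } catch {
    return null; // not a parseable URL
  }

  const cached = domainCache.get(host);
  if (cached !== undefined) return cached;

  // Walk up the subdomains: app.slack.com, then slack.com.
  const labels = host.split(".");
  let result: string | null = null;
  for (let i = 0; i < labels.length - 1; i++) {
    const name = KNOWN_SAAS_DOMAINS.get(labels.slice(i).join("."));
    if (name !== undefined) {
      result = name;
      break;
    }
  }

  domainCache.set(host, result);
  return result;
}
```

Because most history entries repeat a small set of hostnames, caching per hostname means the subdomain walk runs once per distinct host rather than once per URL.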
Compliance Framework Accuracy: GDPR/CCPA violations sound similar but have different requirements. Solution: Built separate checkers for each framework, rules-based with AI enhancement.
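The "separate checker per framework" structure might look like the sketch below. The two rules are deliberately simplified illustrations (personal data without a data processing agreement for GDPR, health data without a business associate agreement for HIPAA), and the AppProfile fields and function names are hypothetical rather than taken from ai-compliance.ts.

```typescript
type Framework = "GDPR" | "CCPA" | "SOC 2" | "HIPAA";

// Hypothetical profile of a detected app.
interface AppProfile {
  name: string;
  dataTypes: string[]; // e.g. "personal", "health"
  hasDpa: boolean;     // data processing agreement on file
  hasBaa: boolean;     // business associate agreement on file
}

interface Violation {
  framework: Framework;
  app: string;
  reason: string;
}

type Checker = (app: AppProfile) => Violation[];

const handles = (app: AppProfile, type: string): boolean =>
  app.dataTypes.some((t) => t === type);

// Simplified illustrative rule, not a full GDPR assessment.
const checkGdpr: Checker = (app) =>
  handles(app, "personal") && !app.hasDpa
    ? [{ framework: "GDPR", app: app.name, reason: "Personal data without a data processing agreement" }]
    : [];

// Simplified illustrative rule, not a full HIPAA assessment.
const checkHipaa: Checker = (app) =>
  handles(app, "health") && !app.hasBaa
    ? [{ framework: "HIPAA", app: app.name, reason: "Health data without a business associate agreement" }]
    : [];

// Each framework keeps its own checker; new ones are added to this list.
const checkers: Checker[] = [checkGdpr, checkHipaa];

function checkCompliance(app: AppProfile): Violation[] {
  return checkers.reduce<Violation[]>((all, check) => all.concat(check(app)), []);
}
```

Keeping each framework in its own function avoids the cross-contamination the challenge describes, since a rule change for one framework cannot alter another's results.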
Graceful AI Fallback: The Google API might time out or be unavailable. Solution: Implemented a rule-based fallback for every AI feature (risk, consolidation, compliance) that returns output in the same JSON shape as the AI path.
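The fallback pattern can be sketched as a small wrapper that tries the AI call with a timeout and, on any failure, returns the rule-based result in the same shape. The names withTimeout, withFallback, and RiskAssessment, the 5-second default, and the placeholder score are assumptions, and the Gemini call itself is left abstract as a function argument.

```typescript
// Hypothetical result shape shared by the AI path and the rule-based path.
interface RiskAssessment {
  app: string;
  score: number;
  source: "ai" | "rules";
}

// Rejects if the promise does not settle within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("AI call timed out")), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Tries the primary (AI) path; any error or timeout yields the fallback value.
async function withFallback<T>(
  primary: () => Promise<T>,
  fallback: () => T,
  timeoutMs = 5000,
): Promise<T> {
  try {
    return await withTimeout(primary(), timeoutMs);
  } catch {
    return fallback();
  }
}

// Placeholder rule-based scorer; the score of 50 is a stand-in value.
function ruleBasedAssessment(app: string): RiskAssessment {
  return { app, score: 50, source: "rules" };
}
```

Because both paths return the same type, callers such as the dashboard never need to know which one produced the result; the source field is there only for transparency.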
Data Privacy: Handling employee emails, SSNs, browser history safely. Solution: In-memory processing only, no logging of sensitive data, audit trail focuses on app decisions not employee details.
Accomplishments we're proud of
✅ Full-stack AI integration - Detection + Risk Scoring + Consolidation + Compliance all AI-augmented
✅ Graceful degradation - AI features work with or without API key (fallback logic for each)
✅ Real data processing - Handles messy CSVs, JSON, browser history, OAuth app lists
✅ Multi-framework compliance - GDPR, CCPA, SOC 2, HIPAA violation detection built-in
✅ Production-ready code - Full TypeScript, proper error handling, modular services
✅ Comprehensive testing - 17+ unit tests + Playwright E2E suite with auto-screenshots
✅ Beautiful, functional UI - Custom CSS animations, dark theme, responsive design
✅ Executive-grade reports - Exportable HTML/PDF compliance reports with styled tables & charts
✅ Simulation engine - Test revocation impact before executing (cost modeling, workflow impact)
✅ Audit trails - Every simulated action logged to JSON (audit_log.json, revokes_demo.json)
✅ Clean commits - 11 commits with clear messages showing iterative development
What we learned
- Data-driven security decisions beat guessing: Exposing shadow IT as CSV/dashboard makes decision-making fast and evidence-based.
- Graceful AI fallback is essential: Never rely solely on LLM responses; always keep a deterministic fallback for reliability.
- IT managers care about consolidation ROI more than risk scores: Show "$50k/year savings possible" and they act faster than "CRITICAL RISK."
- Compliance frameworks are rules engines: GDPR/CCPA can be modeled as rule sets, with AI enhancement for edge cases.
- Full-stack TypeScript eliminates integration bugs: Type safety in data transformations (CSV → detection → risk → export) prevents silent failures.
- Simulations reduce executive anxiety: "Here's what happens when we revoke access" is more persuasive than "This is risky."
What's next for shadow-saas-detector
- Live integrations: OAuth connections to Google Workspace, Microsoft 365, Okta, GitHub for real-time SaaS discovery (currently CSV upload only)
- Continuous monitoring: Recurring browser history sync instead of one-time uploads
- Automated enforcement: Network-level app blocking + SSO controls + cloud access broker integration
- Custom policies: IT team defines their own risk rules ("No AI tools without contract review")
- Slack bot: @SaaSDetector check [app-name] for instant risk lookup
- Industry benchmarking: "Compare your shadow spend to similar companies in your industry"
- Cost tracking: Timeline view of shadow spend month-over-month
- API for integrators: Partner ecosystem (MDR, ITSM, DLP vendors can leverage detection)
Built With
- aiplaywright-(e2e)
- express.js
- generative
- json-based
- railway
- react
- recharts-(data-visualization)
- typescript
- vite
- vitest