TL;DR
TriageTap is an evidence-cited AI SOC copilot. It turns phishing reports, hybrid logs (web, SSH auth, CloudTrail), and scanned files into one incident case with a merged timeline, deterministic detections, and an AI triage summary whose verbatim evidence quotes you can verify.
Inspiration
I applied for an on-campus SOC Analyst role and got rejected. Instead of stopping there, I used it as fuel to build the hands-on skills I was missing. The fastest way to learn real SOC work (triage → investigate → prove → report) was to build a SOC-style tool that forces me to analyze real artifacts and produce explainable, evidence-grounded results.
What’s working now (demo-ready)
1) PhishGuard: phishing triage → case creation
- Paste an email/SMS/URL
- Deterministic IOC extraction (URLs/domains + emails/phones when present) and red-flag heuristics
- AI classification + safe reply + recommended next steps
- One click: Create Phish Case → stores the report + IOCs as a case event
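To make "deterministic IOC extraction" concrete, here is a minimal sketch of how URLs, domains, and email addresses could be pulled from a pasted message with plain regular expressions. The class name `IocExtractor` and both patterns are illustrative assumptions, not TriageTap's actual code, and the patterns are deliberately narrower than the full URL and email grammars.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: regex-based IOC extraction (illustrative only). */
final class IocExtractor {
    // Simplified patterns; real-world URL and email grammars are broader.
    private static final Pattern URL =
            Pattern.compile("https?://[^\\s<>\"']+", Pattern.CASE_INSENSITIVE);
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    static Set<String> urls(String text) {
        return findAll(URL, text);
    }

    static Set<String> emails(String text) {
        return findAll(EMAIL, text);
    }

    /** Returns the lowercased host part of every URL found in the text. */
    static Set<String> domains(String text) {
        Set<String> out = new LinkedHashSet<>();
        for (String url : urls(text)) {
            String rest = url.substring(url.indexOf("://") + 3);
            // Cut at the first path, port, query, or fragment delimiter.
            String host = rest.split("[/:?#]", 2)[0];
            if (!host.isEmpty()) {
                out.add(host.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    // Collects unique matches in the order they appear.
    private static Set<String> findAll(Pattern pattern, String text) {
        Set<String> out = new LinkedHashSet<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }
}
```

Because the same input always yields the same IOCs, this step can run before any AI call and its output can be stored with the case as-is.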
2) Hybrid log ingestion: web + auth + CloudTrail
- Upload real log sources (web access logs, SSH auth logs, CloudTrail JSON)
- Normalize them into a unified incident view (single case timeline)
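Normalizing these three sources largely comes down to getting every timestamp onto one clock. The sketch below shows one way to do that with `java.time`, assuming common/combined-format web access logs, traditional syslog-style SSH auth lines, and CloudTrail's ISO 8601 `eventTime`. The class name `TimestampNormalizer` is hypothetical, and the year and zone passed to the syslog parser are assumptions the caller must supply, since that format records neither.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;
import java.util.Locale;

/** Hypothetical helper: converts per-source timestamp strings to UTC instants. */
final class TimestampNormalizer {
    // Access-log style, e.g. "10/Oct/2023:13:55:36 +0000".
    private static final DateTimeFormatter WEB =
            DateTimeFormatter.ofPattern("dd/MMM/yyyy:HH:mm:ss Z", Locale.ENGLISH);

    static Instant fromWebAccessLog(String ts) {
        return OffsetDateTime.parse(ts, WEB).toInstant();
    }

    // CloudTrail's eventTime is ISO 8601 in UTC, e.g. "2023-10-10T13:55:36Z".
    static Instant fromCloudTrail(String eventTime) {
        return Instant.parse(eventTime);
    }

    // Traditional syslog timestamps ("Oct 10 13:55:36") carry no year or zone,
    // so both are supplied by the caller (an assumption, not recovered data).
    static Instant fromSyslog(String ts, int year, ZoneId zone) {
        DateTimeFormatter fmt = new DateTimeFormatterBuilder()
                .appendPattern("MMM d HH:mm:ss")
                .parseDefaulting(ChronoField.YEAR, year)
                .toFormatter(Locale.ENGLISH);
        // Collapse the double space syslog uses for single-digit days ("Oct  5").
        String normalized = ts.trim().replaceAll("\\s+", " ");
        return LocalDateTime.parse(normalized, fmt).atZone(zone).toInstant();
    }
}
```

Once every event carries an `Instant`, events from different sources can be compared and sorted directly.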
3) Explainable case view (AI + rules + timeline)
Each case shows:
- AI Verdict (incident summary)
- AI Red Flags (evidence quotes) pulled directly from ingested text/log lines
- Deterministic Red Flags (rule-based signals)
- Next Steps (actionable investigation/containment guidance)
- A merged Timeline across sources (web/auth/cloud events)
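The merged timeline in the list above can be thought of as a stable sort over normalized events. The sketch below assumes a hypothetical `TimelineEvent` record (the field names are illustrative) and Java 16 or later for record syntax; it is not the project's actual data model.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical normalized event; field names are illustrative assumptions. */
record TimelineEvent(Instant timestamp, String source, String summary, String rawLine) {}

final class TimelineMerger {
    /** Merges per-source event lists into one list ordered by timestamp. */
    static List<TimelineEvent> merge(List<List<TimelineEvent>> perSource) {
        List<TimelineEvent> merged = new ArrayList<>();
        for (List<TimelineEvent> events : perSource) {
            merged.addAll(events);
        }
        // List.sort is stable, so events with equal timestamps keep ingestion order.
        merged.sort(Comparator.comparing(TimelineEvent::timestamp));
        return merged;
    }
}
```

Keeping `rawLine` on each event is what lets later steps quote evidence verbatim instead of paraphrasing it.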
4) File intel enrichment (OPSWAT MetaDefender)
- Optional multi-engine scan summary (verdict, detection ratio, hashes, and engine results)
- Results can be attached to a case as supporting intel (and removed instantly by deleting the case)
How I built it
- Built a Spring Boot backend + lightweight web UI with two workflows: PhishGuard and SOC Copilot
- Implemented ingestion + normalization for hybrid sources (web/auth/CloudTrail)
- Added deterministic detection flags so results stay explainable
- Designed AI outputs to be evidence-grounded (quotes from the artifact, not “AI guessing”)
- Integrated OPSWAT MetaDefender for multi-engine file intel with an “attach to selected case” option
- Added privacy controls: raw log storage is off by default and cases can be deleted in one click
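As an illustration of what a deterministic detection flag can look like, the sketch below counts failed SSH password attempts per source IP and flags any IP at or above a threshold. The rule, the class name `SshBruteForceRule`, and the threshold are assumptions chosen for the example, not a description of TriageTap's actual rule set; the regular expression targets the usual OpenSSH "Failed password" message.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical rule: flags source IPs with repeated failed SSH password attempts. */
final class SshBruteForceRule {
    // Matches OpenSSH lines such as:
    // "Failed password for invalid user admin from 203.0.113.5 port 51234 ssh2"
    private static final Pattern FAILED = Pattern.compile(
            "Failed password for (?:invalid user )?\\S+ from (\\S+) port \\d+");

    /** Returns flagged IPs mapped to the exact log lines that triggered the flag. */
    static Map<String, List<String>> evaluate(List<String> authLines, int threshold) {
        Map<String, List<String>> byIp = new LinkedHashMap<>();
        for (String line : authLines) {
            Matcher m = FAILED.matcher(line);
            if (m.find()) {
                byIp.computeIfAbsent(m.group(1), ip -> new ArrayList<>()).add(line);
            }
        }
        // Drop sources below the threshold; the evidence lines stay attached to each flag.
        byIp.values().removeIf(lines -> lines.size() < threshold);
        return byIp;
    }
}
```

Returning the matching lines alongside each flag keeps the rule explainable: an analyst can see exactly which log entries caused it to fire.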
Challenges
- Normalization + timestamps: different sources don’t align cleanly; building a readable merged timeline took iteration.
- Keeping AI honest: I prioritized provability over fluency, so outputs had to cite exact evidence rather than speculate.
- Correlation: connecting web recon → SSH activity → CloudTrail actions into one coherent narrative without over-claiming.
- Third-party intel constraints: multi-engine scanning has rate limits and privacy tradeoffs, so I built it as an optional, attachable intel step.
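One simple way to enforce the "cite exact evidence" requirement is to discard any AI-supplied quote that does not appear verbatim in the ingested lines. The sketch below assumes exact, case-sensitive substring matching and a hypothetical `QuoteVerifier` class; the real project may validate quotes differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical guard: keeps only quotes that appear verbatim in the evidence. */
final class QuoteVerifier {
    static List<String> verified(List<String> aiQuotes, List<String> evidenceLines) {
        List<String> kept = new ArrayList<>();
        for (String quote : aiQuotes) {
            String q = quote.strip();
            if (q.isEmpty()) {
                continue; // An empty quote would trivially match every line.
            }
            for (String line : evidenceLines) {
                if (line.contains(q)) { // Exact, case-sensitive match only.
                    kept.add(q);
                    break;
                }
            }
        }
        return kept;
    }
}
```

Anything the model paraphrases or invents fails the substring check and never reaches the case view.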
What I learned
I learned how SOC investigation flows work end-to-end: ingesting messy telemetry, extracting signals, building timelines, and communicating findings in a way that can be verified. The biggest lesson: AI is only useful in security when it stays grounded in evidence.
Next steps (presentation polish)
These are incremental upgrades on top of what’s already working:
1) Clickable evidence IDs (e.g., [E12]) that jump to the exact timeline entry
2) Investigation Mode (Next Best Evidence): AI asks 2–4 follow-up questions; answers re-triage the case
3) Privacy Share Pack export: redacted report + redacted evidence JSONL + hashed IOCs
4) Tamper-evident proof: compute SHA-256 of the share pack and optionally anchor {case_id, hash, timestamp}
5) Voice briefing: generate a short audio incident briefing from the triage summary
6) Rule Forge preview: mark TP/FP → AI suggests rule tweaks → preview “alerts before vs after” (no auto-apply)
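For the tamper-evident proof in item 4, the hash itself needs only the JDK's `MessageDigest`. The sketch below is a minimal illustration with a hypothetical `SharePackHasher` class; how the share pack bytes are produced and where the hash is anchored are left open, as in the plan above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: SHA-256 fingerprint of an exported share pack. */
final class SharePackHasher {
    /** Returns the SHA-256 digest of the data as lowercase hex. */
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform implementation is required to support SHA-256.
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    /** Convenience overload that hashes the UTF-8 bytes of a string. */
    static String sha256Hex(String text) {
        return sha256Hex(text.getBytes(StandardCharsets.UTF_8));
    }
}
```

Anyone holding the exported pack can recompute the hash and compare it with the recorded value; a single changed byte produces a different digest.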
What’s next
My next step is to turn TriageTap into a real-time system checker with an offline AI mode:
- Add a lightweight local agent that streams system logs/events into the timeline continuously (instead of only file uploads).
- Run detections in near real-time as events arrive.
- Replace cloud LLM calls with a local/offline model (on-prem) so sensitive logs never leave the environment.
- Keep the same evidence-citation rule: offline AI can only cite existing event_ids [E##] from the case.
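The citation rule can be checked mechanically, whichever model produces the text. The sketch below extracts every `[E##]` marker from a model response and reports any ID that is not part of the case. The class name `CitationChecker` and the assumption that event IDs look like `E` followed by digits are illustrative, based on the `[E12]` example earlier.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical check: every cited event ID must exist in the case. */
final class CitationChecker {
    // Assumed citation format: "[E" + digits + "]", e.g. [E12].
    private static final Pattern CITATION = Pattern.compile("\\[(E\\d+)\\]");

    /** Returns cited IDs missing from the case; empty means all citations resolve. */
    static Set<String> unknownCitations(String modelOutput, Set<String> caseEventIds) {
        Set<String> unknown = new LinkedHashSet<>();
        Matcher m = CITATION.matcher(modelOutput);
        while (m.find()) {
            String id = m.group(1);
            if (!caseEventIds.contains(id)) {
                unknown.add(id);
            }
        }
        return unknown;
    }
}
```

A non-empty result is a signal to reject or regenerate the summary rather than show an unverifiable claim.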