Piratech

Inspiration

Vibe coding changed how fast people ship. It also changed what they ship — apps built in hours by developers who've never written a security review in their lives, deployed to production with no idea what's lurking inside.

The problem isn't that security tools don't exist. Semgrep, Snyk, and GitHub Advanced Security are everywhere. The problem is that their output is completely unreadable to the people who need it most. A vibe coder staring at 47 Semgrep findings — javascript.express.security.audit.xss-no-direct-expression-object-injection-in-res-render — has no idea which ones matter, what they mean, or how an attacker would actually use them. So they close the tab and ship anyway.

Piratech is built for that developer. Paste your GitHub URL, and instead of a wall of cryptic rule IDs, you get a plain-English breakdown of exactly how your app can get hacked — taint path traced across your actual files, the specific line where auth fails, and the exact request an attacker would send to exploit it. Powered by the K2-Think-V2 reasoning model that reads your code the way a senior security engineer would, not a pattern matcher that fires on everything.

You built your app fast. Now, find out if someone can break it.

What It Does

Piratech is a real-time security analysis tool that finds genuine vulnerabilities in real codebases. You paste a GitHub URL (public or private with a PAT), and within seconds, Piratech runs industry-standard static analysis, then uses the K2-Think-V2 reasoning model to read the actual code context — tracing data flows across files, filtering false positives, and explaining exactly how each vulnerability is exploitable in your specific codebase.

The key output: instead of 10-25 raw Semgrep findings, you get 6 confirmed exploitable vulnerabilities, each with a taint path, an auth gap assessment, a step-by-step attacker exploit, and a concrete fix. It also registers a GitHub webhook so every future push reruns the pipeline automatically — no developer action required. If a new file increases the severity of an existing finding, Piratech flags the escalation with a before/after badge.

How We Built It

Piratech is a two-layer pipeline. Layer one is Semgrep, running a curated high-precision ruleset (OWASP Top 10, SQL injection, secrets) to produce a deterministic list of candidate findings as structured JSON. This layer never hallucinates — if Semgrep doesn't flag it, Piratech doesn't flag it.

Layer two is the K2-Think-V2 reasoning model that receives each finding alongside its full code context — the flagged file, the route file, auth middleware, the database layer — and performs four tasks: exploitability verdict, cross-file taint tracing, auth gap detection, and exploit path construction. The model is required to cite specific file and line evidence for every claim. Uncited speculation is suppressed.

The backend is FastAPI, orchestrating the pipeline, handling repo cloning via git clone, and streaming results to the frontend over WebSocket as findings are confirmed one by one. The frontend is React + Vite. Per-repo scan memory is handled by Backboard — a hosted memory API that maintains a persistent thread per repository, so re-scans have context on what was seen before without requiring a database. GitHub webhooks are registered at repo submission time and trigger the full pipeline automatically on every commit.

Challenges

Prompt engineering for the reasoning layer was the hardest part of the build. The deeper we pushed the model, the clearer it became that reasoning takes time — and that time explodes unless the prompt is ruthlessly structured. We had to force the model to stay grounded in real code with file-and-line citations, strict evidence requirements, and a hard context cap. Without that discipline, it generated confident but imaginary exploit paths; with it, we kept responses fast while still getting senior-engineer-level analysis.

Context management was the next bottleneck. Large repos instantly overflow the model if you include everything, and irrelevant files slow reasoning even further. We built a targeted file-selection strategy — flagged file, direct imports, auth middleware, database layer — capped at ten files per finding.

Webhooks added their own constraint. Getting end-to-end delivery working required a public URL early, so we deployed to Vultr within the first two hours instead of building locally. That choice paid off later, since webhook latency directly affects how quickly findings stream to the frontend.

The more complex the repo, the more time the model needs — and the only way to keep that time predictable was to engineer the prompt, context, and infrastructure with precision.

Accomplishments We're Proud Of

The false positive filtering actually works. On our primary demo repo, Semgrep surfaced 17 candidate findings. Piratech confirmed 6 as genuinely exploitable — and the other 9 were correctly suppressed because sanitization, ORM parameterization, or route-level access controls covered them. That's the precision problem solved.

The webhook escalation moment is something we're particularly proud of. Pushing a new database connection file to the demo repo triggered an automatic re-scan, the model correctly identified that the previously flagged input now reached a sensitive sink through the new file, and the severity updated from MEDIUM to CRITICAL in real time — with a full explanation of why the new file changed the blast radius. That's a reasoning call, not a pattern match.

What We Learned

The division of labor between deterministic tools and reasoning models is the real insight. Semgrep is excellent at what it does — fast, reliable, zero hallucination. K2-Think-V2 is excellent at what it does — reading context, tracing logic, making judgment calls. Trying to get a language model to do detection from scratch produces hallucinations. Trying to get Semgrep to reason about exploitability is impossible. The two-layer architecture lets each tool do only what it's best at, and the combination produces something neither could do alone.

We also learned that output verifiability is a forcing function for quality. Because every finding includes specific file and line citations, it was immediately obvious when the model was guessing versus reasoning. That constraint made the prompts much better.

What's Next for Piratech

The immediate next step is a VS Code extension and GitHub PR integration — so findings surface in the developer's existing workflow rather than requiring a context switch to a dashboard. After that, team dashboards with shared finding history and Slack alerts for escalations.

The larger roadmap is on-premise reasoning model deployment. The biggest enterprise objection to any AI security tool is that source code leaves the organization. On-prem removes that blocker entirely. We're also planning to expand language support beyond JavaScript and Python, and to add CI/CD pipeline gates that can block deploys when a CRITICAL finding is detected.

Built With

FastAPI, React, Vite, Semgrep, K2-Think-V2, Backboard, Vultr, GitHub webhooks
