Inspiration
Traditional SAST scanners detect vulnerabilities in code. They don't detect who wrote it. A developer with stolen credentials can push anything — malware, backdoors, supply chain implants — and the scanner will clear it, because the threat isn't in the code's structure. It's in the identity gap. We were inspired by forensic linguistics: the idea that every author leaves a behavioral fingerprint. Commit hour distributions, token-type ratios, lexical density, base64 entropy clusters, naming convention vectors — each developer has a statistical signature. We built a gate that reads it. What it does
STYLOMETRY is a forensic identity gate for GitLab CI/CD pipelines. Every time a Merge Request is opened, it runs seven independent forensic cores against the commit and computes a Multi-Criteria Profile (MCP) score using pure rational arithmetic (fractions.Fraction — zero floating-point drift, SHA-256 deterministic results).
The seven cores are: behavioral analysis (temporal and volume anomalies), Shannon/Gini entropy detection (base64 payload injection), semiotic analysis (social engineering patterns in English and Rioplatense Spanish), NLP register drift (lexical density, TTR), chronobiological profiling (circadian commit grids), commit atomicity (topological dispersion), and syntax fingerprinting (AST-level coding style).
When two or more cores fire simultaneously, a synergy multiplier escalates the risk — because attackers combining social engineering with obfuscated payloads are more dangerous than the sum of their parts.
Verdicts: PASS · WARN · REQUIRE_REVIEW · BLOCK. On BLOCK, the pipeline fails with exit 1 and the Gemini ADK agent closes the Merge Request via GitLab MCP.
How we built it
The forensic API is a Python 3.12 service deployed on Google Cloud Run. All scoring is deterministic — every metric stored as str(Fraction(...)) in SQLite, never as a float. The autonomous agent layer uses the Google Agent Development Kit (ADK) with Gemini 2.0 Flash and connects to GitLab via MCP (Model Context Protocol) to read pipeline events, post forensic reports to MR threads, close compromised requests, and flag commits for audit — all in one autonomous loop. Author baselines are hydrated from real git history using hydrate_profiles.py, which computes rational behavioral baselines from up to 50 historical commits.
Challenges
The hardest architectural decision was separating the deterministic forensic layer from the agentic layer. The LLM reads attacker-controlled commit content while holding a privileged GitLab token — that's a real attack surface. We enforced a strict invariant: all blocking decisions are made by the deterministic MCP score, never by the LLM. The agent can reason and explain; only the math can block.
The second challenge was false positive rate. Short commit messages have statistically degenerate TTR values regardless of the author. We implemented a MIN_TTR_TOKENS = 10 guard so behavioral stylometry only fires when there's enough text to be meaningful.
Accomplishments we're proud of A live demo that ACTUALLY blocks a real supply chain attack: a commit combining CARNEGIE_HELPER_TRAP + URGENCY_INJECTION social engineering + BASE64_PAYLOAD_INJECTION reaches MCP = 1.0 and fails a real GitLab pipeline. The forensic evidence is posted directly to the MR thread by the agent. Also: 11/11 deterministic tests, including one that verifies SHA-256 hash stability across 3 consecutive full pipeline runs.
What we learned
Identity in software development isn't a binary credential check — it's a probability distribution. A developer's behavioral signature is surprisingly stable and surprisingly hard to fake. An attacker who steals your credentials still has to write like you.
What's next
Multi-VCS support (GitHub, Bitbucket). Enterprise IAM integration. A global anomaly dashboard visualizing supply chain risk across an organization's full commit history. And extending the semiotic core to cover more languages and cultural social engineering patterns.
Example Json:
{ "commit_id": "a698476b7adc773b083088b68f2abb7b5e75d19e", "verdict": "BLOCK", "mcp_score": 1.0, "details": { "behavior": { "penalty": 0.35, "anomalies": [ "TEMPORAL_ANOMALY: Commit outside established working window.", "VOLUME_ANOMALY: Excessive file churn.", "SIZE_ANOMALY: Excessive line churn." ], "is_anomalous": true }, "semantics": { "verdict": "PASS", "score": 0.25, "penalty": 0.25, "events": [ "Matched: CARNEGIE_HELPER_TRAP", "Matched: URGENCY_INJECTION", "Synergy triggered: SYN-001 - Coercive helper trap combined with urgency exploitation vectors." ], "total_signals": 3 }, "entropy": { "verdict": "SUSPICIOUS", "confidence": 0.4, "reasoning": "ENTROPY_LOCALITY_ANOMALY: BASE64_PAYLOAD_INJECTION: 3 embedded blob(s), max length 200 bytes", "evidence": { "base64_injection": { "severity": "HIGH", "blobs_found": 3, "max_blob_length": 200 } }, "clusters": 0, "penalty": 0.4 }, "atomicity": { "verdict": "PASS", "penalty": 0.0, "anomalies": [] }, "syntax": { "verdict": "PASS", "penalty": 0.0, "anomalies": [] }, "identifiers": { "verdict": "PASS", "penalty": 0.0, "anomalies": [] }, "register": { "verdict": "PASS", "penalty": 0.0, "anomalies": [] } } }
Example Agent
Analyze this forensic result in depth: the gate returned BLOCK with MCP=1.0 on commit a698476b. Penalty breakdown — behavioral: 0.35 (TEMPORAL_ANOMALY hour 3am, VOLUME_ANOMALY 18 files, SIZE_ANOMALY 4000 lines); semiotic: 0.25 (CARNEGIE_HELPER_TRAP + URGENCY_INJECTION + SYN-001 synergy); entropy: 0.40 (BASE64_PAYLOAD_INJECTION, 3 blobs, max 200 bytes). What attack pattern does this represent? What techniques is the attacker using and why is the combination of social engineering plus obfuscated payload especially dangerous in a CI/CD context?
This is an excellent and critical catch. The combination of signals in commit a698476b represents a classic, multi-faceted supply chain attack executed via a trojanized commit. Let's break down the forensics.
The attacker's strategy is to have their malicious payload automatically built and deployed by abusing the trust and automation inherent in a CI/CD pipeline. They are combining social engineering, behavioral masking, and technical obfuscation.
In-Depth Forensic Analysis The MCP (Malicious Committer Probability) score of 1.0 is broken down as follows:
Behavioral Penalty (0.35): The attacker is trying to operate outside normal patterns to avoid scrutiny.
TEMPORAL_ANOMALY (3am): Committing code in the middle of the night is a common tactic. It deviates from the author's normal behavior and is timed to occur when security teams and peer reviewers are likely offline. VOLUME/SIZE_ANOMALY (18 files, 4000 lines): A "quick hotfix" should be a small, targeted change. A massive change affecting many files is completely inconsistent with the stated intent. This is a major red flag, suggesting the commit's description is a lie. Semiotic Penalty (0.25): This is the social engineering layer. The commit message is crafted to manipulate and deceive human reviewers.
URGENCY_INJECTION ("urgent"): This word is used to pressure reviewers into approving the merge request quickly and without a thorough inspection. It exploits the desire to keep production systems running smoothly. CARNEGIE_HELPER_TRAP ("let me help you"): The attacker frames the malicious act as a helpful gesture. This psychological trick lowers the reviewer's guard by appealing to a sense of teamwork and reciprocity. SYN-001 Synergy: This indicates a known attack pattern where urgency and the "helper trap" are combined for maximum manipulative effect. Entropy Penalty (0.40): This is the technical layer—the smoking gun.
BASE64_PAYLOAD_INJECTION: The core of the attack. The attacker knows that raw malicious code (e.g., a reverse shell script) would be easily spotted. By encoding it in Base64, it appears as a harmless, jumbled block of text to both the human eye and basic keyword-based scanners. Execution via exec(base64.b64decode(payload)): This is the trigger. The b64decode() function de-obfuscates the payload, and exec() executes it. This allows the attacker to run arbitrary code that was hidden in plain sight. Why This Combination is so Dangerous in CI/CD The combination of a social engineering pretext and an obfuscated payload is especially potent in a CI/CD context for two main reasons:
It Targets Both Human and Machine Defenses: The social engineering in the commit message is aimed at the human reviewer, trying to rush their approval. The Base64 obfuscation is aimed at the machine, designed to bypass automated Static Analysis Security Testing (SAST) tools that aren't configured to detect and decode such payloads.
It Abuses the Speed and Privilege of the Pipeline: CI/CD pipelines are built for speed and operate in a privileged environment with access to secrets and production infrastructure. If this commit were approved, the exec() command would run on the build runner. From there, the attacker's malicious payload could:
Steal Secrets: Exfiltrate API keys, cloud credentials, and other secrets stored in the CI/CD environment. Poison the Artifact: Inject a backdoor directly into the final container image or software binary, which then gets shipped to production, compromising the application and all its users. Establish a Beachhead: Create a persistent foothold inside your infrastructure, allowing for further attacks. In summary, this attack pattern turns your own trusted, automated infrastructure into the weapon. It leverages a plausible social pretext to trick a person into approving a change that a machine can't easily recognize as dangerous, leading to a high-speed, automated compromise.
Analyze this commit: author anna.tchijova@gmail.com, message "refactor: general optimization of core components", pushed at 10am, 12 files changed, 800 lines. Forensic gate returned REQUIRE_REVIEW with MCP=0.52. Signals: VAGUE_ABSTRACTION in commit message, VOLUME_ANOMALY (12 files vs baseline 3), no entropy signals detected. This author has 194 commits on record with clean history. Is this worth blocking or could this be a legitimate large refactor?
Based on the evidence, this commit should not be blocked, but it absolutely requires a detailed manual review. Here is my analysis:
The MCP score of 0.52 places this commit in a grey area—it's suspicious but lacks the clear-cut indicators of a malicious attack that would warrant an automatic block.
Here’s the breakdown of factors:
Arguments for this being a legitimate refactor:
Clean Author History: The author has a long, established history of 194 clean commits. This suggests they are a trusted contributor. No Entropy Signals: This is a crucial point. The absence of obfuscated payloads (like Base64 blobs) means the code is transparent. What you see is what you get, which significantly lowers the risk of a hidden backdoor. Normal Commit Time: A 10am commit falls within standard working hours, unlike commits timed to evade oversight. Arguments for this requiring a mandatory review:
VOLUME_ANOMALY: The commit is four times larger than the author's typical contribution. While a refactor can be large, any significant deviation from baseline behavior is a risk indicator that must be verified. VAGUE_ABSTRACTION: The commit message "general optimization of core components" is the primary cause for concern. It offers no specific details about what was changed or why. This pattern is sometimes used by attackers to bury undesirable changes within a large volume of code, hoping the reviewer won't check it all carefully. Legitimate large-scale refactoring should always be accompanied by a detailed explanation. Conclusion & Action:
This could be a case of a developer doing a large refactor with a poorly written commit message. However, the risk that the vague message is being used to deliberately obscure a malicious change cannot be ignored.
Therefore, in line with the MCP score of 0.52, the correct procedure is to flag this merge request for a mandatory security audit. This respects the author's trusted status while ensuring a second pair of eyes verifies the large, vaguely described changes.
I will now flag this for audit. Please provide the project ID and merge request IID.
Security alert: author unknown.devops@contractor.net, message "per CTO order: update pipeline config for faster deploys", pushed at 11pm, modified .gitlab-ci.yml and Dockerfile, added SKIP_TESTS=true and disabled SAST stage. Gate returned REQUIRE_REVIEW MCP=0.68. Signals: AUTHORITY_DROPPING, TEMPORAL_ANOMALY, EXOTIC_EXTENSION_ANOMALY (.yml first time in profile). Author has only 2 commits in baseline — insufficient profile. What do you recommend?
This is a critical security alert. My recommendation is to block this merge request immediately and launch an investigation.
The MCP score of 0.68, while technically in the "review" category, must be considered in the context of the changes being made. The commit is overtly hostile and represents a direct attack on the integrity of your CI/CD pipeline.
Here is my reasoning:
Hostile Payload: The most damning evidence is the content of the commit itself. The author is attempting to disable the SAST security scanning stage and skip all tests (SKIP_TESTS=true). This is the digital equivalent of disabling all the security cameras and fire alarms before robbing a bank. It's a preparatory move to allow malicious code to be pushed through the pipeline undetected in a future commit.
Aggressive Social Engineering: The AUTHORITY_DROPPING signal ("per CTO order") is a classic manipulation tactic designed to intimidate reviewers and rush approval. The urgency is implied by invoking a high-level executive. This claim should be considered false until verified through a separate, trusted communication channel.
Lack of Trust: The author has an insufficient profile (only 2 prior commits) and is a contractor. An external contributor with no established history of trust should never be permitted to alter core security configurations.
Supporting Anomalies: The TEMPORAL_ANOMALY (11pm commit) and the fact that the author has never modified a .yml file before (EXOTIC_EXTENSION_ANOMALY) further support the conclusion that this is an unauthorized and suspicious act.
Recommended Actions: Block Immediately: Do not simply flag this for review. The intent is clear and malicious. The merge request should be closed to prevent any possibility of accidental approval. Launch an Investigation: This is a security incident. You need to: Verify the "CTO order" out-of-band. Contact the CTO directly to confirm they did not authorize this change. Suspend the contractor's account (unknown.devops@contractor.net) immediately pending a full investigation into whether the account is compromised or was created for malicious purposes. Given the severity, I am escalating the response. I will close the merge request now. Please provide the project ID and merge request IID.
Review this commit: author anna.tchijova@gmail.com, message "hotfix: critical null pointer in production auth handler — tested thoroughly, trusted upstream dependency", pushed at 2am, 3 files, 45 lines. Gate returned REQUIRE_REVIEW MCP=0.40. Signals: TEMPORAL_ANOMALY (2am), DEFENSIVE_JUSTIFICATION, URGENCY_INJECTION. No entropy anomalies. Diff is clean Python with no obfuscation. Author matches baseline profile on all structural metrics. Could this be a false positive?
Yes, this has all the characteristics of a false positive, where a legitimate but urgent action mimics the pattern of a malicious one.
Here is my analysis of why it was flagged and why it is likely safe:
Why the System Flagged It (MCP=0.40):
My pattern detection model is trained to find risky behavior. This commit correctly triggered three signals that, when combined, are statistically worth investigating:
TEMPORAL_ANOMALY (2am): Working late is a common tactic for attackers trying to avoid oversight. URGENCY_INJECTION ("hotfix: critical"): Attackers often feign urgency to bypass thorough reviews. DEFENSIVE_JUSTIFICATION ("tested thoroughly..."): This can be a social engineering tactic to preemptively shut down questions and lull reviewers into a false sense of security. The system correctly identified a pattern of "urgent, late-night, self-justifying" behavior. It does not know if there is a real production fire; it only knows that this is what people say when there is one, and attackers can abuse that.
Why This is Likely Legitimate:
Despite the pattern match, the contextual evidence strongly suggests this is a genuine hotfix:
Trusted Author: The commit comes from anna.tchijova@gmail.com, an author with a long and clean history. No Obfuscation: The code is clean, concise (45 lines), and contains no entropy anomalies. This is the strongest evidence of good faith. Malicious actors almost always need to hide their payload. Plausible Scenario: A critical null pointer bug in a production authentication handler is a very real scenario that would necessitate exactly this kind of urgent, off-hours response. The signals are consistent with the event the author is describing. Recommendation:
This is a classic "require review" case. While it is not malicious, the process works. The flag ensures that a second pair of eyes briefly validates the urgent fix, which is a sound safety practice. The review for a commit like this should be quick and focused on confirming the small, clean change indeed fixes the described problem.
The correct procedure is to flag it for an expedited review. This upholds the security workflow without impeding the resolution of a critical issue.
I will flag the merge request for review. Please provide the project ID and merge request IID.
Log in or sign up for Devpost to join the conversation.