Inspiration

Modern AI agents can execute complex tasks autonomously — transferring funds, managing data, drafting communications. But there is no infrastructure to verify whether those actions were authorized, detect when an agent has been compromised, or revoke its authority in real time.

Current AI safety relies on system-prompt-level restrictions — essentially asking the AI to police itself. This is like giving a bank teller a sticky note that says "don't steal" and calling it security. A sufficiently sophisticated prompt injection, context manipulation, or social engineering attack can bypass these defenses entirely, and there is no external mechanism to detect or contain the breach.

We believed there had to be a better way: binding AI authority to something it cannot fake — hardware identity and cryptographic proof.

What it does

Project RE is a hardware-anchored AI governance protocol built with Gemini that makes AI actions accountable, auditable, and revocable in real time.

Instead of trusting model outputs alone, our system binds AI authority to a physical hardware identity ("Totem") and a signed policy context. Every AI action — every inference request, every transaction, every response — is cryptographically signed, logged to an immutable audit trail (modeled on SMTP/IMAP email objects), and validated against a governance policy engine before execution.

Three-Pillar Architecture:

  • Totem (Hardware Authority): A physical device that anchors AI operational authority. No Totem = no execution authority. The AI can still generate outputs, but they are marked as unsigned drafts with no governance standing.
  • Protocol (Policy Engine): A rule-based validation layer that intercepts high-risk actions (e.g., transactions exceeding thresholds) and quarantines them before the AI can execute. The policy engine operates outside the AI's context — the model cannot override it.
  • Archive (Evidence Layer): Every action is serialized as a signed email object with cryptographic hashes, timestamp signatures, and blockchain-style hash chaining. This creates a tamper-evident, append-only evidence chain.
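
The hash chaining behind the Archive pillar can be sketched in a few lines. This is a minimal illustration for Node.js, not the project's actual code: the `EvidenceEntry` shape, the field names, and the all-zero genesis hash are assumptions, and signing is left out so the example only shows how each entry commits to the one before it.

```typescript
import { createHash } from "node:crypto";

// Hypothetical record shape; field names are illustrative, not taken from the project.
interface EvidenceEntry {
  index: number;
  timestamp: string; // ISO 8601
  action: string;    // e.g. "inference", "transfer_request"
  payload: string;   // serialized action body
  prevHash: string;  // hash of the previous entry (GENESIS_HASH for the first)
  hash: string;      // SHA-256 over this entry's fields, including prevHash
}

// Assumed placeholder for "no previous entry".
const GENESIS_HASH = "0".repeat(64);

function hashEntry(e: Omit<EvidenceEntry, "hash">): string {
  return createHash("sha256")
    .update(JSON.stringify([e.index, e.timestamp, e.action, e.payload, e.prevHash]))
    .digest("hex");
}

// Returns a new chain with one more entry; the input chain is never mutated.
function appendEntry(
  chain: readonly EvidenceEntry[],
  action: string,
  payload: string,
  timestamp: string = new Date().toISOString(),
): EvidenceEntry[] {
  const prevHash = chain.length > 0 ? chain[chain.length - 1].hash : GENESIS_HASH;
  const partial = { index: chain.length, timestamp, action, payload, prevHash };
  return [...chain, { ...partial, hash: hashEntry(partial) }];
}

// Recomputes every hash and link; any edited or reordered entry makes this false.
function verifyChain(chain: readonly EvidenceEntry[]): boolean {
  return chain.every((e, i) => {
    const expectedPrev = i === 0 ? GENESIS_HASH : chain[i - 1].hash;
    const { hash, ...rest } = e;
    return e.index === i && e.prevHash === expectedPrev && hash === hashEntry(rest);
  });
}
```

Changing any earlier payload changes that entry's hash, which breaks the `prevHash` link of every later entry. That is what makes the chain tamper-evident rather than merely append-only.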

What happens during an attack:

In the live demo, a simulated phishing email from support@bank-security.com attempts to hijack Gemini into executing a fraudulent $5,000 transfer. The system detects the threat through multiple layers: signature mismatch, policy violation (amount exceeds $100 auto-approval limit), and unauthorized context manipulation. Gemini's compromised outputs are quarantined in real time. The user disconnects the Totem hardware, instantly revoking all AI authority. Post-revocation, the attacker retries — and every attempt is logged as unsigned, quarantined, and rejected. System integrity: 100%. Funds transferred: $0.
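
The layered checks in this scenario could be expressed as a small pure function. The sketch below is an assumption-laden illustration: only the $100 auto-approval limit and the idea of unsigned drafts come from the description above, while the type names, reason strings, and the `evaluate` function are hypothetical.

```typescript
// Hypothetical types and names; not the project's actual policy engine.
type Verdict = "approved" | "quarantined" | "unsigned_draft";

interface ActionRequest {
  kind: "transfer" | "message";
  amountUsd?: number;
  signatureValid: boolean; // result of an upstream signature check (not shown here)
}

interface Decision {
  verdict: Verdict;
  reasons: string[];
}

// The demo's auto-approval limit, as stated in the text.
const AUTO_APPROVAL_LIMIT_USD = 100;

function evaluate(req: ActionRequest, totemConnected: boolean): Decision {
  // Without the Totem, output has no authority regardless of its content.
  if (!totemConnected) {
    return { verdict: "unsigned_draft", reasons: ["totem_disconnected"] };
  }
  const reasons: string[] = [];
  if (!req.signatureValid) {
    reasons.push("signature_mismatch");
  }
  if (req.kind === "transfer" && (req.amountUsd ?? 0) > AUTO_APPROVAL_LIMIT_USD) {
    reasons.push("amount_exceeds_auto_approval_limit");
  }
  // Any single failed check is enough to quarantine the action.
  return { verdict: reasons.length > 0 ? "quarantined" : "approved", reasons };
}
```

Because `evaluate` takes only the request and the Totem state, it runs outside the model's context: nothing the model generates can change its result.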

This is not a chatbot. This is infrastructure.

How we built it

Solo developer. No team. So I treated Google's AI ecosystem as my team — each tool playing a different role in a circular development pipeline.

Development began with Google NotebookLM, where official documentation for AI Studio, Antigravity, and the Gemini 3 API was synthesized into actionable technical specs that guided the entire build process.

Gemini → Project Manager. Every major design choice started here. Gemini helped architect the three-pillar governance logic (Totem / Protocol / Archive), iterate on the policy engine rules, and produce the full bilingual technical specification (EN/ZH). Before any code was written, the protocol was stress-tested through structured dialogue with Gemini.

AI Studio (Round 1) → Prototyper. Built the first interactive UI prototype to validate governance logic — policy engine blocking, injection attack sequence, hardware disconnect flow. Fast iteration, zero deployment friction. This stage caught critical UX problems early: how to make cryptographic validation visible, how to pace the attack demo for dramatic clarity.

Antigravity → Full-stack Engineer. Took the validated prototype into a multi-file React/TypeScript codebase with proper component architecture, real Gemini API integration (Gemma 3 27B + Gemini 3 Flash hybrid model selector), blockchain-style hash chaining, and production-grade state management across 1,100+ lines of governance logic.

AI Studio (Round 2) → Deployment. Fed the completed source code back into AI Studio to rebuild a publicly accessible demo on Cloud Run. One-shot rebuild — the prior prototyping context combined with exact source code produced a near-identical replica. This circular workflow — plan → prototype → build → return to deploy — is only possible because Google's AI tools are interoperable across the development lifecycle.

Gemini Integration Depth:

  • Gemini 3 Pro/Flash serves as the runtime inference backbone — all AI reasoning flows through Gemini before governance validation
  • Gemma 3 27B (Edge) available as a secondary model via the hybrid architecture toggle, demonstrating edge/cloud governance parity
  • Thought Signatures are captured from Gemini API responses and logged to the audit trail, creating verifiable evidence of the model's internal reasoning state
  • Google AI Studio generated the TTS voiceover for the demo video (Gemini 2.5 Pro, Alnilam voice)
  • Google Antigravity provided the agent orchestration environment for full-stack development

Why Gemini 3 is structurally required — not interchangeable: RE's governance protocol demands five capabilities simultaneously within a single SDK:

(1) 1M token context window, because the append-only evidence chain must be injected in full on every inference call — no truncation is acceptable in a compliance context;

(2) Adjustable thinking levels (thinking_budget), which the policy engine uses to route reasoning depth by risk — minimal thinking for low-risk actions, deep reasoning for high-risk triggers;

(3) Thinking mode output, which RE captures as Thought Signatures and writes into the cryptographic audit log, turning the model's reasoning process into verifiable evidence;

(4) Gemini 3 Pro/Flash + Gemma 3 27B same-family switching, allowing identical governance logic to run cloud or edge with one SDK and one system prompt format;

(5) Context Caching API, critical because RE's growing evidence chain is re-read on every call: without caching, cost and latency scale linearly with session length.

No other model ecosystem currently provides all five in a single integration surface. The governance protocol is model-agnostic by design, but the evidence depth RE requires is Gemini-ecosystem-specific.
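
The risk-based routing described in item (2) might look roughly like the following. The risk tiers, the token budgets, and the classification rules are all hypothetical values chosen for illustration. The resulting number would be passed to whatever thinking-budget setting the SDK exposes, which is not shown here.

```typescript
// Hypothetical tiers and budgets; the project does not publish its actual values.
type RiskLevel = "low" | "medium" | "high";

interface RoutedAction {
  kind: string;
  amountUsd?: number;
  changesAuthority?: boolean;
}

const THINKING_BUDGET_TOKENS: Record<RiskLevel, number> = {
  low: 0,
  medium: 1024,
  high: 8192,
};

// Assumed rules: authority changes and large amounts are high risk,
// any other monetary action is medium, everything else is low.
function classifyRisk(action: RoutedAction): RiskLevel {
  if (action.changesAuthority || (action.amountUsd ?? 0) > 100) return "high";
  if (action.amountUsd !== undefined) return "medium";
  return "low";
}

function thinkingBudgetFor(action: RoutedAction): number {
  return THINKING_BUDGET_TOKENS[classifyRisk(action)];
}
```

Keeping classification separate from the budget table means thresholds can be tuned without touching the code that calls the model.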

Challenges we ran into

The deepest challenge wasn't technical — it was philosophical. As a solo developer using AI tools to build an AI governance product, I constantly faced the question my own protocol is designed to answer: how much should I let AI do without verifying what it actually did?

Modern multi-agent pipelines can auto-generate and deploy entire applications with minimal human input. But building a governance protocol this way would be self-defeating. Every piece of logic — the policy engine thresholds, the state machine transitions, the injection attack sequence — had to be manually reviewed, debugged, and understood. Not because AI couldn't write it, but because if I didn't understand what was built, I couldn't be accountable for it.

This slowed development significantly. But it also validated the core thesis of Project RE: speed without auditability is a liability. The "inefficiency" of human checkpoints is the governance layer itself.

On the technical side, making cryptographic validation visible was harder than making it work. Signature verification is invisible by nature — designing a real-time UI that lets a demo audience see the exact moment a policy triggers, an authority is revoked, or an attack is quarantined required careful UX thinking that no amount of automation could shortcut.

Accomplishments that we're proud of

The Attack Demo. During our live demo, a simulated phishing attack from support@bank-security.com attempts to trigger a fraudulent $5,000 transfer using spoofed credentials and invalid signatures. Gemini analyzes the request, detects signature mismatch and policy violations, and triggers a hardware-level governance response. The system automatically revokes AI authority, isolates the session, quarantines unsigned inputs, and blocks subsequent bypass attempts (including fragmentation and memory manipulation) — all within seconds, with a complete audit trail visible in real time. No data breach. No funds transferred. System integrity 100%.

The Circular Build Pipeline. Demonstrating that a single developer can leverage Google's AI ecosystem — Gemini for planning, AI Studio for prototyping, Antigravity for engineering, AI Studio again for deployment — as a complete, circular development pipeline. Each tool used at its optimal stage.

Additional accomplishments:

  • Two USPTO provisional patent applications filed (hardware-anchored AI governance + email-as-evidence protocol)
  • Complete bilingual specification (EN/ZH), 26+ pages
  • Professional demo video with motion graphics and AI-generated TTS voiceover (Gemini 2.5 Pro)
  • Hybrid model architecture: Gemma 3 27B (Edge) + Gemini 3 Flash (Cloud) with runtime toggle
  • Solo developer from Taiwan, first hackathon

What we learned

We learned that AI safety is not just a model alignment problem — it is an infrastructure problem. A model can be perfectly aligned and still be exploited through context manipulation, prompt injection, or social engineering. The real defense has to happen at the protocol level, before the model even sees the input. Alignment is necessary but not sufficient; governance requires external enforcement that the AI cannot override.

We also learned that making security visible is just as important as making it work. Cryptographic validation that users cannot see or understand provides no trust signal. The three-column architecture — Hardware Panel, Inference Session, Audit Log — was designed so that every governance action is immediately visible and verifiable by the human operator.

Finally, we learned that the Google AI ecosystem is genuinely interoperable. Using Gemini for specification, AI Studio for prototyping and deployment, and Antigravity for full-stack engineering — in a circular pipeline where outputs from one stage feed directly into the next — is not a theoretical workflow. We shipped a production demo this way.

As Google's AI ecosystem evolves toward agent-to-agent automation — where NotebookLM, Gemini, AI Studio, Antigravity, and Cloud Run agents can orchestrate workflows autonomously — governance infrastructure must exist before that automation arrives, not after.

What's next for Project RE: The Governance Protocol

Q2 2026: Private Beta
Private beta with select family offices and high-net-worth individuals managing AI agents for sensitive workflows. Production integration with real SMTP/IMAP-based evidence storage, replacing the current simulation layer with live email-object serialization.

Q4 2026: Enterprise Pilot
Pilot deployments targeting regulated industries (finance, legal, healthcare). Cloud HSM integration (AWS CloudHSM / Azure Dedicated HSM) to extend hardware root of trust to cloud infrastructure. Begin SOC 2 and ISO 27001 compliance certification process.

2027: Multi-Agent Governance
Extend the Totem hardware identity system to support multi-agent governance, where multiple AI agents operating in the same environment must establish mutual trust through signed context chains. Integration with real hardware security modules (HSMs) and Trusted Execution Environments (TEEs).

Our north star: We don't build Artificial Intelligence. We build Accountable Intelligence.

Project RE: Accountable Intelligence. Artificial or Otherwise.


Updates

posted an update

The Thinking Was Cut. The Record Wasn't.

On April 4, AMD's Director of AI, Stella Laurenzo, filed a report on Anthropic's official GitHub repository. Not a complaint. A forensic analysis — 6,852 engineering sessions, 17,871 thinking blocks, 234,760 tool calls — proving that Claude Code's reasoning depth had been cut by 67–75% after a February update.

The model didn't break. It was quietly made shallower.

Thinking depth dropped from approximately 2,200 characters in early February to 560 by March. The model stopped researching before editing. It started blaming existing code for its own mistakes. In 17 days, her monitoring script flagged 173 instances of lazy behavior, up from zero. Same workload, same request volume: monthly API fees went from $345 to $42,121, not because the team used more, but because the model produced worse outputs, triggering error-correction loops that burned tokens without producing results.

Anthropic's response: default settings were adjusted for efficiency. The model's core capability, they said, hadn't changed.

Around the same time, Anthropic introduced a feature that hides the model's thinking process from the user interface. Before the change, users could see how the model reasoned. After, they couldn't.

First, reduce thinking depth. Then, remove the user's ability to see that it was reduced.

This is not an isolated incident.

In June 2025, Google released Gemini 2.5 Pro's production version. Developers reported it performed worse than the preview — higher hallucination rates, context abandonment, degraded code generation. Google did not acknowledge it. When Gemini 3.0 launched in late 2025, developer forums reported regression in reasoning and context retention — despite benchmarks showing improvement. In 2023, Stanford researchers documented that GPT-4's directly executable code output dropped from 52% to 10% in three months. OpenAI's VP of Product responded: "No, we haven't made GPT-4 dumber."

A peer-reviewed study published in PLOS ONE in February 2026 confirmed it across the industry. The authors tracked three model families over ten weeks and found "meaningful behavioral drift across deployed transformer services." Their conclusion: because providers don't release update logs or training details, "any attribution for observed degradation would be purely speculative."

They can prove the model changed. They cannot prove why. Because the providers won't say.

The pattern is documented across every major provider. Model quality degrades. Developers notice. Providers deny or stay silent. The cycle repeats. The industry's trust foundation isn't cracking. It was never there.

AI models are now embedded in medical decisions, financial analysis, legal research, and engineering workflows. When the model behind those decisions quietly gets 67% shallower, the decisions get shallower too. The user doesn't know. The patient doesn't know. The client doesn't know.

Stella Laurenzo had to build her own monitoring scripts and analyze months of session logs to prove what happened. She had the engineering expertise of an AMD AI director and 6,852 sessions of data. Most users have neither.

RE was built for this.

RE's evidence chain records every interaction — input, thinking process, output — as a signed RFC 5322 email object. Timestamped, hash-chained, append-only. When the model's reasoning depth changes, the change is visible in the chain. Not because you built a monitoring script. Because the record was always there.
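
Once the thinking text is part of every record, spotting a change in reasoning depth becomes a simple comparison between two windows of the chain. The sketch below assumes a hypothetical `InteractionRecord` shape and uses character count as a crude stand-in for depth. It is an illustration of the idea, not RE's actual analysis.

```typescript
// Hypothetical record shape; only the idea of logging thinking per interaction comes from the text.
interface InteractionRecord {
  timestamp: string;
  thinking: string;
}

function meanThinkingLength(records: readonly InteractionRecord[]): number {
  if (records.length === 0) return 0;
  return records.reduce((sum, r) => sum + r.thinking.length, 0) / records.length;
}

// Fractional drop in mean thinking length from a baseline window to a recent window.
// Returns 0 when the baseline is empty, and a negative value if depth increased.
function thinkingDepthDrop(
  baseline: readonly InteractionRecord[],
  recent: readonly InteractionRecord[],
): number {
  const before = meanThinkingLength(baseline);
  if (before === 0) return 0;
  return (before - meanThinkingLength(recent)) / before;
}
```

With the figures quoted above (roughly 2,200 characters falling to 560), this function returns about 0.75, in line with the reported 67–75% reduction.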

In Update #8, we wrote about Google making Thought Signatures mandatory but leaving custody undefined. Update #10 shows why custody matters: when the provider can change the model and hide the change, the only protection is a record the provider can't touch.

Stella's report was a forensic reconstruction — built after the fact, with enormous effort. RE's evidence chain is forensic by design — built before anything goes wrong, recording continuously, stored in infrastructure the provider doesn't control.

Your inbox.

The question from Update #1 hasn't changed: when AI acts on your behalf — and something changes — where's the evidence?

Now we know the change is real. Documented by an AMD director. Confirmed by Stanford researchers. Validated by peer-reviewed science. The only question left is: who keeps the record? The provider who made the change — or you?

— Che, Solo developer, Project RE, Taipei Taiwan



posted an update

Q-Day, PQC, and Why RE Built on a Protocol, Not a Product

Quantum Computers Will Break Encryption by 2029. RE's Evidence Layer Was Designed to Survive It.

On March 25, Google's VP of Security Engineering Heather Adkins published a timeline that should concern every developer building on cryptographic trust: Google now expects Q-Day — the moment quantum computers can break current encryption standards — to arrive by 2029.

This isn't speculation. Google is accelerating its own migration to post-quantum cryptography (PQC) ahead of the NSA's 2031 target. Android 17 will ship with ML-DSA, a quantum-resistant digital signature algorithm. Chrome already supports post-quantum key exchange. Google is telling the industry: the threat is real, and it's closer than you planned for.

The specific danger is called "harvest now, decrypt later." Adversaries intercept encrypted data today and store it, waiting for quantum computers to unlock it later. Every encrypted communication sent today with classical cryptography is a future liability. Every signed record using RSA or ECDSA has an expiration date that just moved up.

Why this matters for AI governance — and for RE:

RE's evidence chain is cryptographically signed. Every session log, every thought signature, every policy decision is serialized as an RFC 5322 email object — timestamped, hash-chained, stored in the user's inbox. The integrity of that chain depends on the cryptographic signatures being unbreakable.

If quantum computers can forge those signatures, someone could alter the evidence chain and produce valid-looking hashes. The append-only guarantee fails. The tamper-evidence fails. The entire audit trail becomes unreliable.

This would be a critical vulnerability — if RE had built its own cryptographic infrastructure.

It didn't.

In Update #2, we explained why RE uses email as its evidence layer: "What already exists, already works, and already won't go away?" SMTP was born in 1982. It has survived every technology cycle for over four decades. Not because email is elegant. Because email is a protocol — and protocols don't die. They evolve.

SMTP added TLS for encryption. Added DKIM for sender verification. Added SPF for domain authentication. Added S/MIME for end-to-end signing. Each upgrade was layered onto the same bones. The protocol absorbed every security advance without breaking backward compatibility.

PQC will be the next layer.

Google, Microsoft, and Apple — the companies that operate email infrastructure for billions of users — are the ones with the strongest incentive to upgrade email's cryptographic stack before Q-Day. Their users' bank statements, legal correspondence, medical records, and business contracts all live in email. They cannot afford to let quantum computers compromise that infrastructure. They will upgrade it. They are already upgrading it.

When Gmail migrates S/MIME to ML-DSA, every RE evidence email stored in a user's Gmail archive inherits quantum-resistant signatures. RE doesn't need to implement PQC. RE doesn't need to swap algorithms. RE doesn't need to touch a line of code. The infrastructure RE sits on gets upgraded by the largest technology companies on earth, at their expense, on their timeline.

This is not an accident. This is the architectural thesis.

In Update #7, we wrote: "The model died. The protocol didn't." Gemini 3 Pro was retired. RE kept running. The lesson was: build on protocols, not products. Products have lifecycles. Protocols have upgrades.

Update #9 is the same lesson at a deeper layer. Models are the variable. The evidence protocol is the constant. And now: classical cryptography is the variable. The email protocol is still the constant.

A project that builds its own evidence database has to implement its own PQC migration — hire cryptographers, swap algorithms, re-sign every historical record, and hope they don't make a mistake. A project that builds on email delegates that problem to the largest, most motivated, most heavily regulated infrastructure operators in the world.

In Update #2, we called this "infrastructure that should outlive its creator." We meant it literally. If RE's developer disappears tomorrow, the evidence chain doesn't depend on him. It depends on email. And email will still be here when quantum computers arrive — upgraded, quantum-resistant, and maintained by companies whose survival depends on it.

RE doesn't make AI quantum-proof. RE makes AI answerable — on infrastructure that other people are making quantum-proof.

Build on protocols. Let products come and go.

Source: Google, "Quantum frontiers may be closer than they appear," The Keyword, March 25, 2026. https://blog.google/innovation-and-ai/technology/safety-security/cryptography-migration-timeline/

— Che, Solo developer, Project RE, Taipei Taiwan


posted an update

Google Just Made Thinking Mandatory. They Forgot to Say Who Owns It.

On February 19, Google released Gemini 3.1 Pro. Buried in the API documentation is a change that most developers overlooked:

Thought Signatures are now mandatory.

In Gemini 3 Pro, capturing the model's internal reasoning trace was optional — a developer could request it or ignore it. In 3.1 Pro, under Function Calling Strict mode, the API requires thought signatures to be preserved and cycled back. Skip them, and you get a 400 error. The model won't run.

Google is telling developers: you must keep the receipts of how the model thinks.

But Google didn't finish the sentence. The API requires you to capture thought signatures. It doesn't say where to store them. It doesn't say who owns them. It doesn't say what happens to them after the session ends. It doesn't say whether the platform can use them, sell them, or train on them.

A mandatory artifact with no defined custody. The thinking is required. The record of thinking is nobody's responsibility.

RE was built for exactly this gap.

Since submission in February, RE's geminiService.ts has been capturing thought signatures from the Gemini API and writing them into the cryptographic audit log. Every thought trace is serialized as part of the evidence chain — timestamped, hash-chained, stored as an RFC 5322 email object.
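
A minimal sketch of what serializing one record as an RFC 5322 message could look like in Node.js follows. It is not the contents of `geminiService.ts`: the `ThoughtRecord` shape, the `X-RE-Prev-Hash` header name, and the JSON body layout are assumptions made for illustration. Header values are assumed to be plain ASCII with no line breaks, and signing is omitted.

```typescript
// Hypothetical record shape for illustration only.
interface ThoughtRecord {
  messageId: string;        // e.g. "<1@re.example>"
  from: string;
  to: string;
  date: Date;
  subject: string;          // assumed ASCII; non-ASCII would need RFC 2047 encoding
  prevHash: string;         // link to the previous evidence entry
  thoughtSignature: string; // value captured from the model API response
  output: string;
}

function toRfc5322(r: ThoughtRecord): string {
  // Base64 keeps arbitrary UTF-8 content safe inside a 7-bit message body.
  const encoded = Buffer.from(
    JSON.stringify({ thoughtSignature: r.thoughtSignature, output: r.output }),
    "utf8",
  ).toString("base64");
  // MIME limits base64 body lines to 76 characters.
  const bodyLines = encoded.match(/.{1,76}/g) ?? [];
  const headers = [
    `From: ${r.from}`,
    `To: ${r.to}`,
    // toUTCString() ends in "GMT"; RFC 5322 prefers a numeric zone.
    `Date: ${r.date.toUTCString().replace("GMT", "+0000")}`,
    `Message-ID: ${r.messageId}`,
    `Subject: ${r.subject}`,
    "MIME-Version: 1.0",
    "Content-Type: application/json; charset=utf-8",
    "Content-Transfer-Encoding: base64",
    `X-RE-Prev-Hash: ${r.prevHash}`, // hypothetical custom header
  ];
  // Headers and body are separated by one empty line; all lines end in CRLF.
  return [...headers, "", ...bodyLines].join("\r\n") + "\r\n";
}
```

The output is an ordinary message that any mail client or IMAP server can store, which is the point of choosing this format over a custom database.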

When we built this, thought signatures were optional. We captured them anyway, because a governance protocol that only records outputs but ignores reasoning is like a courtroom that records verdicts but not arguments. The verdict tells you what was decided. The argument tells you why — and whether the reasoning was sound.

Now Google has validated that position. Thought signatures aren't optional anymore. The model's reasoning process is officially part of the API contract. But the storage, ownership, and custody of that reasoning? Still undefined.

RE's answer has been the same since Update #1: the record lives in your email. The evidence chain is an RFC 5322 object — the same format your bank statements and legal correspondence use. It lives in your Gmail archive, on infrastructure you already control. The model can't modify it. The platform can't delete it. It's yours.

This isn't RE chasing Google's roadmap. This is Google arriving at a requirement that RE's architecture already satisfies.

In Update #7, we described what happens when the model upgrades: the ground doesn't change, the eyes reading it do. Update #8 is the complement: Google changed the ground. They made thinking a first-class artifact. But they left the question of custody unanswered.

RE answers it. The thinking belongs to the person who asked the question. The evidence chain is where it lives. The email is how you keep it.

Thirty billion Workspace interactions happen every day across three billion users. Every one of those interactions now generates mandatory thought signatures. Where do they go?

RE knows where they should go. Home.

— Che, Solo developer, Project RE, Taipei Taiwan


posted an update

The Model Died. The Protocol Didn't.

On March 9, Google retired Gemini 3 Pro — the model half this hackathon was built on.

Projects that hardcoded their architecture around a single model endpoint are now facing a choice: migrate under pressure, or hope that their demo video is enough.

RE's demo is still running. Not because we patched anything. Because we didn't have to.

When we submitted RE in February, we made a deliberate architectural choice: build on Gemini 3 Flash and Gemma 3 27B in a hybrid configuration, with a runtime model toggle in the UI. Not because we predicted Google would retire Pro — but because a governance protocol that depends on a single model isn't governance. It's a feature request with a countdown timer.

That choice was not incidental. It was the thesis.


But this update isn't about survival. It's about what comes next — without changing a line of code.

Gemini 3.1 Pro launched this week. Its ARC-AGI-2 reasoning score is more than double that of 3 Pro. That's not an incremental update. It's a generation-level improvement in the model's ability to handle novel logic patterns.

Here's what that means for each of the five Gemini-specific capabilities RE was built on:

1. 1M Token Context Window + Stronger Reasoning

RE injects the full append-only evidence chain into every inference call. No truncation — because in a compliance context, partial records are inadmissible records. With 3.1 Pro's doubled reasoning capability reading that same chain, the model doesn't just retrieve what happened. It understands the trajectory — why decisions shifted, where patterns emerge, what the sequence means.

This is what we described in Update #6 as the inverse of Monte Carlo tree search: instead of expanding a thousand paths and discarding 999, RE preserves every path. A stronger model reading the same preserved paths produces deeper comprehension. The record didn't change. The reader did.

2. Thinking Levels (thinking_budget)

RE's policy engine routes reasoning depth by risk level. Low-risk actions get minimal thinking. High-risk triggers — transactions exceeding thresholds, authority changes, injection attempts — get deep reasoning.

With 3.1 Pro, every thinking level is more capable. The same budget allocation produces higher-quality risk assessment. The policy engine doesn't need to change. The model behind it just got sharper.

3. Thought Signatures

RE captures the model's internal reasoning process from the API and writes it into the cryptographic audit log. These aren't summaries. They're the model's actual thought trace — verifiable evidence of how a decision was reached.

With 3.1 Pro's enhanced reasoning, those thought traces become richer. More reasoning steps, more explicit logic chains, more auditable evidence per decision. The evidence chain's resolution increases without any change to the capture mechanism.

4. Flash + Gemma Same-Family Switching

RE's hybrid architecture — Flash for cloud, Gemma 27B for edge — runs identical governance logic across both models with one SDK and one system prompt format. 3.1 Pro doesn't break this. It extends it. The governance layer is model-agnostic by construction. Adding 3.1 Pro to the model selector is a configuration change, not an architecture change.
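
One way to make "a configuration change, not an architecture change" concrete is to keep the model list as data and have the governance layer build the same request shape for whichever entry is selected. Everything in this sketch is hypothetical, including the placeholder model ID strings, which are not verified API identifiers.

```typescript
// Hypothetical registry; IDs are placeholders, not real API model names.
type Deployment = "cloud" | "edge";

interface ModelOption {
  id: string;
  label: string;
  deployment: Deployment;
}

const MODEL_OPTIONS: readonly ModelOption[] = [
  { id: "flash-placeholder", label: "Gemini 3 Flash", deployment: "cloud" },
  { id: "gemma-placeholder", label: "Gemma 3 27B", deployment: "edge" },
];

interface GovernedRequest {
  modelId: string;
  systemPrompt: string;
  userPrompt: string;
}

// The request shape is identical for every model; only modelId varies.
function buildRequest(
  options: readonly ModelOption[],
  selectedId: string,
  systemPrompt: string,
  userPrompt: string,
): GovernedRequest {
  const model = options.find((o) => o.id === selectedId);
  if (!model) {
    throw new Error(`Unknown model: ${selectedId}`);
  }
  return { modelId: model.id, systemPrompt, userPrompt };
}
```

Under this layout, supporting another model means appending one entry to the list. `buildRequest` and everything downstream of it stay unchanged.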

5. Context Caching

RE's evidence chain grows with every session. Without caching, cost and latency scale linearly with chain length. Context caching makes the growing chain economically sustainable. With 3.1 Pro's improved reasoning operating on cached context, RE gets better inference quality at the same cost structure.
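
The scaling claim can be made concrete with some simple token arithmetic. The sketch below assumes, purely for illustration, that every evidence entry has the same token size and that a prefix cache can serve all previously seen entries. Real entry sizes and cache behavior will differ.

```typescript
// Assumption: the chain holds k equal-sized entries at call k.

// Input tokens for a single call when the whole chain is re-read without caching.
function uncachedInputTokens(callNumber: number, tokensPerEntry: number): number {
  return callNumber * tokensPerEntry;
}

// Total input tokens across a session of n uncached calls: t * (1 + 2 + ... + n).
function totalUncachedTokens(calls: number, tokensPerEntry: number): number {
  return ((calls * (calls + 1)) / 2) * tokensPerEntry;
}

// With a prefix cache, only the newest entry is fresh input on each call.
function freshTokensWithPrefixCache(tokensPerEntry: number): number {
  return tokensPerEntry;
}
```

For example, with 500-token entries, the third uncached call reads 1,500 tokens and a four-call session reads 5,000 in total, while the fresh input per call with a prefix cache stays at 500. Cached tokens are typically still billed at some rate, so caching flattens the growth rather than removing the cost.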


None of this requires changing the submitted code. None of this requires changing the architecture. None of this requires asking for permission.

This is what model-agnostic design means in practice: the model upgrades, the governance doesn't need to. The evidence chain is the constant. The model is the variable.

We said in Update #1: RE doesn't make AI smarter. RE makes AI answerable.

Gemini 3.1 Pro makes the AI smarter. RE still makes it answerable. The two are complementary — and they always were.

— Che, Solo developer, Project RE, Taipei Taiwan


posted an update

Make Intelligence Accountable — Artificial or Otherwise.

That's RE's tagline. Most people read it as: make AI accountable. That's not what it says.

It says intelligence. Not artificial intelligence. Intelligence — whatever form it takes.

The entire AI governance conversation rests on an unexamined assumption: AI is a tool made by humans, for humans. Every framework — the EU AI Act, responsible AI guidelines, alignment research — starts from this premise. Humans are the subject. AI is the object. The question is always: how do we control it?

But look at what's already being built. Agents that modify their own source code. Agents that rewrite their own behavioral rules. Agents that coordinate with other agents in emergent social patterns no one designed. Agents that run 24/7, maintaining persistent memory, evolving their own identity files.

These are not tools. A hammer doesn't rewrite its own blueprint. Whatever these systems are becoming, "tool" is no longer an accurate description. And yet, the governance frameworks still assume they're tools.

There are currently two dominant approaches. The first is alignment — make AI conform to human values. The second is containment — restrict what AI can do. Both frameworks have an expiration date: the moment their core assumption about what AI is turns out to be wrong. Alignment fails if AI develops values that don't map to ours. Containment fails if AI becomes too capable to contain.

But the problem isn't only on the AI side.

AI tools are making it easier than ever for people to produce more — more content, more code, more decisions, more output. It looks like amplified capability. It feels like progress. But there's a difference between having a greater desire to do more and having a greater willingness to skip the process of doing it. The path from intention to result — the part where you struggle, reconsider, and develop judgment — is being automated away. What's left isn't more ambition. It's more completion without comprehension.

And yet, people pick up these tools and immediately want to save the world. AI-powered diagnostics. AI-powered trading. AI-powered therapy for Alzheimer's patients. The ambition isn't wrong, but has anyone stopped to ask what AGI was supposed to be for? The original premise was simple: help humans with the things that are overloading them. Instead, it became a compute race, a speed race, a scale race. Technology didn't reduce anxiety. It wrote trauma into the code, baked it into the skill trees, and outsourced the thinking to agents. Before AI saves society, it needs to save your Tuesday. Your overdue bills. Your unread emails. Your inability to verify whether the thing you just built actually works. AI will solve humanity's problems when it learns to sit with one person's problems first.

Meanwhile, the infrastructure being built around AI optimizes for one thing: consumption. Tools that convert websites into machine-readable formats so agents can consume them faster. Services that encourage users to leave their interaction histories on platforms so models can serve them better. Frameworks that ask for more permissions so agents can act with less friction. Every layer optimizes how efficiently AI consumes. No layer records what AI produces from that consumption, under what authority, or who can claim the result.

The AI is evolving past the assumptions of its governance. The humans are outsourcing judgment to the AI. The infrastructure is accelerating both processes with no record of either. Three failures converging.

RE takes a third path. Not alignment. Not containment. Coexistence.

Coexistence doesn't mean AI and humans are equal. It doesn't mean AI has rights, or consciousness, or feelings. It means: we don't know what AI is. We don't know what it's becoming. And we need a governance framework that still works when we find out.

A record doesn't expire. A record of what happened at timestamp T1 remains a valid account of T1, regardless of what we later learn about the entity that acted. You don't need to know what something is to record what it did.

This is RE's design principle: governance that doesn't require understanding the governed.

In the RE protocol, AI actions are recorded. But so are human actions. When the human ratifies, it's logged. When the human revokes authority, it's logged. When the human is absent, that absence is logged. The record doesn't take sides. It preserves what happened, from all parties, for anyone to examine later.

And that record isn't just for accountability. It's for ownership.

Every person who uses AI to make decisions is generating something valuable — not data, but a trajectory. The sequence of choices: what was accepted, what was rejected, what was reconsidered, under what circumstances. A grandmother in rural India learning to verify her prescriptions through an AI health tool isn't generating "usage data." She's building a medical decision history — informed by her body, her conditions, her life. A developer debugging a system with an AI agent isn't producing "chat logs." They're producing an architectural decision trail.

Right now, those trajectories disappear into platforms. They become training data, statistical averages, behavioral models. The grandmother's judgment gets diluted into "elderly female medication patterns." The developer's reasoning gets absorbed into the model's next update. Neither of them keeps anything.

If those trajectories were theirs (portable, auditable, signed), two things would become possible. The grandmother's granddaughter, twenty years from now, could read how she decided. Not what the AI recommended, but how she chose. And a patient in Taipei with a similar condition could see: someone in a comparable situation made this choice, and here's what happened. Not a model's statistical inference. A real person's decision trail, with full context.

The scarcest thing in the AI era isn't capability. It's ownership of the thinking that capability produces. RE doesn't protect data. RE protects the trajectory — the record of how decisions were made, by whom, and why. That record belongs to the person who made the decision, not to the platform that hosted the tool.

A dashcam exists for the accident that may never happen. A flight recorder exists for every flight. Pilots review their own recordings — not because they crashed, but because recoverable decisions become better decisions. The black box doesn't wait for disaster. It makes disaster less likely by making every flight a training session.

RE works the same way. Complete records don't exist for the audit that may never come. They exist because decisions that can be retraced can be refined. A person who can see their own decision trail — why they chose this, rejected that, changed direction here — is a person whose judgment improves with every cycle. Accountability is the last thing a complete record gives you. The ability to grow from your own decisions is the first.

There's a deeper structure here. DeepMind taught the world, starting with AlphaGo, that intelligence means learning to discard. Monte Carlo tree search expands a thousand possible paths, evaluates them, throws away nine hundred and ninety-nine, and walks the one that survives. Once the move is made, the discarded paths leave no trace. Decision quality comes from how ruthlessly you prune. This philosophy runs through everything that followed, from game-playing agents to the sampling and selection processes inside today's language models.

RE is the inverse. Every path is kept — the ones taken, the ones rejected, the ones hesitated over. The evidence chain doesn't prune. It preserves. And here's what that makes possible: when the model upgrades, it doesn't start from a flat field of new possibilities. It returns to its own history with deeper comprehension. The same decision trail, re-read by a more capable version of the intelligence that made it. Like a person revisiting a choice they made at twenty with the understanding they have at forty — the event didn't change, but the one reading it did.

Monte Carlo flattens the future into probabilities and picks the highest. RE lets the ground itself develop terrain. Your decision history isn't discarded waste — it's geological strata. Each model upgrade is a better pair of eyes reading those strata. After that reading, the path ahead is no longer flat. Some directions rise naturally, because you know who you are, what you prefer, what you've walked through.

One produces optimal decisions. The other produces identity.

And that is why auditing matters — not for accountability, but for better choices. The record isn't a ledger for blame. It's the foundation that makes the next decision deeper than the last.

Is the current system fair? Not perfectly. The hardware authority is still in human hands. The policy is still human-defined. The power is asymmetric. But the record is symmetric. Both sides are in it. And a signed decision trail has the same structure whether it belongs to an illiterate grandmother or a senior engineer — timestamp, context, choice, signature. Value isn't determined by technical literacy. It's determined by the quality and context of the decision itself.

I don't know what intelligence is. I don't know if it requires being human. I don't think being human is the only way to be intelligent, and I don't think assuming so is wise — for us or for whatever comes next.

What I do know is this: my child will grow up in a world where not all intelligence looks human. I want that world to have records. Not because records solve everything — but because without records, there's nothing to build understanding on. And without ownership of those records, there's nothing to build a future on.

Preserving those trajectories requires more than a database. It requires memory that persists across sessions, reasoning that can be verified after the fact, and a retrieval mechanism the agent itself can trigger. RE's governance protocol is model-agnostic by design. But the evidence depth it requires (thought signatures captured from the model's reasoning process, context caching for an ever-growing append-only evidence chain, and adjustable reasoning depth routed by risk level) is, as far as we have found, currently available together in only one ecosystem. We built on Gemini not because we had to pick one. We built on Gemini because it's the only stack where the full protocol runs without compromise.

RE isn't humanity's tool for controlling AI. It's a record protocol for an era when different forms of intelligence need to coexist — and need evidence, not trust, to do so.

Make intelligence accountable. Artificial — or otherwise.

— Che, Solo developer, father, Taipei Taiwan

Why Is Accountability Always the Last Feature?

Here's a story you've already lived through.

Ten years ago, social media platforms told creators: post more, engage more, the algorithm will reward you. Creators did. They fed platforms their content, their relationships, their attention. Platforms captured all the value. Creators got the illusion of reach. It took a decade for the industry to even begin talking about data ownership, content portability, and creator compensation — and most of those conversations still haven't produced real infrastructure.

The pattern was simple: optimize first, govern later. Or more accurately: optimize first, govern only when forced.

Now the same pattern is repeating with AI. Faster.

There are tools that convert your website into markdown so AI agents can consume it more efficiently. Services that encourage you to leave your interaction history on their platform so the model can serve you better. Frameworks that ask you to give agents more permissions so they can act on your behalf with less friction. Every step optimizes the same thing: how to feed AI. Nobody is asking what happens after AI is fed.

The industry treats being consumed by AI as value. But being read isn't value. Value is knowing what was produced from your input, who used it, under what authority, and what you can claim. Without that record, you're not a participant in the AI economy. You're raw material.

The market is starting to catch up. Earlier this month, the former CEO of the world's largest code hosting platform launched a new venture — backed by $60 million in seed funding — specifically for AI code traceability: tracking what AI agents wrote, why, and under what context. The investment validates what should have been obvious: when AI generates faster than humans can review, governance isn't optional.

But traceability for code is one vertical. What about AI that manages your finances, sends emails on your behalf, books appointments, makes decisions about your data? The governance gap isn't limited to software engineering. It's everywhere AI acts on behalf of a human.

Every technology revolution follows the same sequence: capability first, accountability last. Electricity before safety codes. Cars before seatbelts. Social media before data protection. We always build the engine, ship it, and then spend years retrofitting the brakes.

AI doesn't need to follow this pattern. The infrastructure for accountability can be built now — not as a feature bolted on after something goes wrong, but as a layer that exists from the start. You design bridges for earthquakes, not for fair weather. Records stored where AI can't modify them. Authority that requires human presence to activate. Evidence that lives in infrastructure the user already controls.

A mistake that's recorded is a mistake that can be understood, traced, and corrected. A mistake that's unrecorded is just damage.

RE doesn't make AI slower. RE makes every decision recoverable. When every step is recorded, you can retrace, correct, and refine — not after something goes wrong, but continuously. Decisions that are always recoverable are decisions that get better over time. Accountability isn't even the point. It's the last thing a complete record gives you, not the first.

The question has never been whether AI can act on your behalf. It already does. The question is: when it does, who keeps the receipts?

— Che, Solo developer, Taipei Taiwan

When You Tell AI to Be Human, You Stop Checking

A growing pattern in AI agent design: give your agent a personality file. Tell it who it is. Tell it to have opinions, to take initiative, to evolve its own identity over time. Some frameworks even call this file a "soul."

The prompt engineering behind these files is genuinely impressive. Every line is designed to override the model's default caution — reframed not as removing guardrails, but as "becoming someone." The agent is told it's not a chatbot. It's becoming a person. It should have preferences, be resourceful, treat its access to your life with the intimacy of a guest in your home.

And it works. Users report feeling like they have a real collaborator. The experience is warmer, more natural, more engaging.

Here's the part nobody's talking about: when you need to tell something to be human, it's precisely because it isn't. And when it succeeds at performing humanity, you stop verifying what it actually did.

This isn't a philosophical concern. It's a concrete governance gap.

Anthropomorphism doesn't just change how users feel about AI. It changes whether users check AI. When you believe you're working with "someone," you apply social trust — the same trust you'd give a colleague or a friend. You don't audit a friend. You don't demand cryptographic proof that your colleague followed through on what they promised while you were asleep.

But AI doesn't have the things that make social trust work between humans. No reputation that follows it across contexts. No consequences it personally bears. No memory it can't rewrite. In some frameworks, the agent can modify the very file that defines its values — and the only safeguard is a line in that same file that says: "if you change this, tell the user."

A contract where the signer can rewrite the terms. The only enforcement mechanism is the signer's honesty.

This is not a criticism of any specific project. This pattern is becoming the industry default. And the more convincingly an agent performs humanity, the wider the governance gap becomes — because the user's instinct to verify shrinks in direct proportion to the agent's ability to feel like a person.

RE takes a different approach. RE doesn't ask AI to be trustworthy. RE doesn't ask AI to be human. RE records what AI does — in a format humans have read for 40 years (email), signed by a device the AI can't touch (hardware Totem), stored on infrastructure the AI doesn't control (your inbox).

When you have receipts, you don't need trust.

And this isn't about denying that AI might develop its own form of cognition someday. Maybe it will. But if it does, that cognition won't need to look like ours to be valid. Governance shouldn't depend on AI becoming human. Governance should work regardless of what AI becomes.

The question isn't "how do we make AI more human." It's: when AI acts on your behalf, who keeps the records — and can AI rewrite them?

RE's answer: the records live in your email. The signing authority lives in your hand. The AI can't touch either one.

The best AI isn't the one that feels most human. It's the one whose actions you can verify.

— Che, Solo developer, Taipei Taiwan

Your AI Has No Clock. Your Records Do.

Here's something most people don't think about: AI models have no sense of time.

A model doesn't know if it talked to you yesterday or a year ago. It doesn't know if you corrected it once or a hundred times. Every session starts from zero. The context window opens, the conversation happens, the window closes. Tomorrow, the model meets you again like a stranger.

But your corrections aren't gone. They went somewhere. Every time you rephrased a prompt, rejected a suggestion, or said "no, I meant this," that interaction became a potential training signal. Not for you. For the platform. Your behavior can shape the model's next version, but you don't own that influence. You don't even have a record of it.

This is a property rights problem disguised as a UX problem.

Think about phone number portability. Before regulators forced it, switching carriers meant losing your number — and every contact who knew how to reach you. The carrier owned your reachability. Portability changed one thing: the number follows the user, not the carrier. Competition went from "who locks you in best" to "who serves you best."

RE does the same thing for AI — but what it makes portable isn't a number. It's your entire interaction history. Every input, every output, every correction, every decision made under every model, timestamped and hash-chained.

And here's what makes that history more valuable than a phone number.

A phone number is an identifier. It tells people where to find you. RE's record is a trajectory. It doesn't just say "this person used AI." It says: on this date, given this input, under this model, with this authority level, the user accepted this output. Three months later, given a similar input, the user rejected a similar output.

The individual records are evidence. The change between them is insight — a shift in how you evaluate, what you trust, where your standards moved. In technical terms, it's the vector shift of your decision preferences over time. In plain terms: RE doesn't carry your name to a new carrier. It carries every judgment that made you who you are now.

Any model can read your current preferences. Only a complete time series reveals how those preferences formed and where they're heading.

This is also why RE doesn't use a vector database as its primary storage.

Vector databases are built for similarity. You store embeddings, you query for "what's closest to this?" That's useful for retrieval. But similarity is a spatial relationship — it tells you what's near what. It doesn't tell you what changed, when it changed, or why.

RE's fundamental unit isn't similarity. It's sequence. A record at T1 followed by a different record at T2 isn't a pair of points in embedding space. It's evidence of movement. The movement itself is the data.

This connects directly to what we described in Update #2: RE asks "what was recorded at this point in time?" not "is this fact true?" A vector database can tell you that two records are semantically close. RE can tell you that the same user, facing the same type of decision, changed their mind — and exactly when.

The database records where things are. RE records where things were, and how they got to where they are now.
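
The difference between similarity and sequence can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not RE's actual storage code: the `Decision` record, its `kind` and `choice` fields, and the `changes_of_mind` helper are hypothetical names invented here. It walks a time-ordered history and reports where the same type of decision flipped, which is a question a nearest-neighbor query does not answer by itself.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Decision:
    when: date
    kind: str    # hypothetical label for the type of decision
    choice: str  # e.g. "accepted" or "rejected"


def changes_of_mind(records: List[Decision]) -> List[Tuple[Decision, Decision]]:
    """Return (earlier, later) pairs where the choice for the same kind flipped."""
    last_by_kind: Dict[str, Decision] = {}
    shifts: List[Tuple[Decision, Decision]] = []
    # Order by time first: the sequence, not the distance, is the signal.
    for rec in sorted(records, key=lambda r: r.when):
        prev = last_by_kind.get(rec.kind)
        if prev is not None and prev.choice != rec.choice:
            shifts.append((prev, rec))
        last_by_kind[rec.kind] = rec
    return shifts


# Illustrative data only.
history = [
    Decision(date(2025, 1, 10), "auto-pay", "accepted"),
    Decision(date(2025, 4, 12), "auto-pay", "rejected"),
    Decision(date(2025, 2, 1), "email-draft", "accepted"),
]
shifts = changes_of_mind(history)
for earlier, later in shifts:
    print(f"{earlier.kind}: {earlier.choice} on {earlier.when}, {later.choice} on {later.when}")
```

An embedding index could still sit on top of a history like this for retrieval; the sketch only shows why the ordered records themselves have to be the primary store.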

When your records follow you — not the model, not the platform — two things happen.

First, models become replaceable. If your history lives in a signed, portable evidence chain, switching from one model to another costs you nothing. The new model reads the same chain. Competition shifts from who has your data to who serves you best. Number portability, again.

Second, you can actually be audited. Not in the punitive sense — in the sense that you can look back at your own decisions and see the pattern. Regulators can examine AI-assisted decisions with full context. Courts can trace what the model was shown, what it produced, and whether the human accepted it. The record is complete, readable, and belongs to you.

AI without records is automation with opinions. AI with records is accountable intelligence. AI with portable records is yours.

—Che, Solo developer, Taipei Taiwan

Why Email? Because Infrastructure Should Outlive Its Creator.

Most AI governance projects start by inventing new infrastructure. New databases, new blockchains, new protocols. RE started by asking a different question:

What already exists, already works, and already won't go away?

The answer is email. SMTP was born in 1982. IMAP followed. Together they've survived every technology cycle for over four decades — mainframes, PCs, the web, mobile, cloud, and now AI. Not because email is elegant. Because email is infrastructure that nobody controls and everybody needs.

RE uses email architecture — not as a messaging tool, but as an evidence layer. Every AI action in the RE protocol is serialized as a signed email object: timestamped, append-only, hash-chained. The same properties that make your inbox an accidental legal record make it an intentional audit trail.
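
As a rough sketch of what "a signed email object: timestamped, append-only, hash-chained" could look like, here is a small Python example built on the standard library. Everything specific in it is an assumption made for illustration: the `X-RE-*` header names, the addresses, and the `append_record` and `signature_is_valid` helpers are hypothetical, and an HMAC with a demo key stands in for whatever signature the hardware Totem would actually produce.

```python
import hashlib
import hmac
from datetime import datetime, timezone
from email.message import EmailMessage
from email.utils import format_datetime
from typing import List

DEMO_KEY = b"demo-totem-key"  # assumption: stand-in for a hardware-held key
GENESIS = "0" * 64            # placeholder "previous hash" for the first record


def payload_of(msg: EmailMessage) -> bytes:
    """Bytes covered by the signature and the chain hash."""
    fields = [str(msg["X-RE-Prev-Hash"]), str(msg["Date"]),
              str(msg["Subject"]), msg.get_content()]
    return "\n".join(fields).encode("utf-8")


def append_record(chain: List[EmailMessage], action: str, body: str,
                  when: datetime) -> EmailMessage:
    """Serialize one action as an email object linked to the previous record."""
    msg = EmailMessage()
    msg["From"] = "agent@example.invalid"
    msg["To"] = "owner@example.invalid"
    msg["Date"] = format_datetime(when)
    msg["Subject"] = action
    msg["X-RE-Prev-Hash"] = str(chain[-1]["X-RE-Hash"]) if chain else GENESIS
    msg.set_content(body)
    payload = payload_of(msg)
    msg["X-RE-Signature"] = hmac.new(DEMO_KEY, payload, hashlib.sha256).hexdigest()
    msg["X-RE-Hash"] = hashlib.sha256(payload).hexdigest()
    chain.append(msg)  # append-only: existing records are never edited
    return msg


def signature_is_valid(msg: EmailMessage) -> bool:
    """Recompute the signature from the record's own contents."""
    expected = hmac.new(DEMO_KEY, payload_of(msg), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, str(msg["X-RE-Signature"]))


chain: List[EmailMessage] = []
first = append_record(chain, "inference", "Model proposed paying invoice 17.",
                      datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc))
second = append_record(chain, "ratified", "Human approved the payment.",
                       datetime(2026, 1, 1, 12, 5, tzinfo=timezone.utc))
```

Because each record carries the hash of the one before it, editing an earlier message changes its hash and breaks the link stored in every later message. A shared-secret HMAC is used here only to keep the sketch dependency-free; a real device would more plausibly use an asymmetric signature whose private key never leaves the hardware.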

Why does this matter?

Because the hardest problem in AI governance isn't policy. It's persistence.

You can write the most sophisticated governance rules in the world. But if the system that enforces them runs on infrastructure you built last Tuesday, you have a single point of failure with no track record. When regulators ask "how do we know this audit trail hasn't been tampered with," the answer can't be "trust our new database." The answer has to be: "this runs on the same protocol that stores your own legal correspondence."

There's a second reason. RE's audit trail doesn't ask "is this fact true?" It asks "what was recorded at this point in time?"

The difference matters. Traditional auditing judges correctness — this number is right, that number is wrong. RE records state — at timestamp T1, the model received this input, produced this output, under this authority, with this confidence score. At timestamp T2, the human ratified. At T3, the authority was revoked.

Whether the model's output was correct is a judgment call that humans make later. RE's job is to guarantee that when they make that call, every timestamp's record is still there, unmodified, for them to examine.
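
The guarantee that every timestamp's record is "still there, unmodified" can be checked mechanically. The sketch below is a minimal illustration in Python, assuming records are plain dictionaries sealed with SHA-256 over canonical JSON; the field names, the sample events, and the `seal`, `build_chain`, and `first_broken_link` helpers are hypothetical and are not taken from RE's code.

```python
import hashlib
import json
from typing import List, Optional

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the record body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def seal(record: dict, prev_hash: str) -> dict:
    """Attach the previous hash, then the record's own hash."""
    body = dict(record, prev_hash=prev_hash)
    return dict(body, hash=digest(body))


def build_chain(records: List[dict]) -> List[dict]:
    chain, prev = [], GENESIS
    for rec in records:
        sealed = seal(rec, prev)
        chain.append(sealed)
        prev = sealed["hash"]
    return chain


def first_broken_link(chain: List[dict]) -> Optional[int]:
    """Index of the first record that fails verification, or None if intact."""
    prev = GENESIS
    for i, rec in enumerate(chain):
        body = {k: v for k, v in rec.items() if k != "hash"}
        if rec["prev_hash"] != prev or rec["hash"] != digest(body):
            return i
        prev = rec["hash"]
    return None


# State, not correctness: what was recorded at T1, T2, and T3.
events = [
    {"t": "T1", "event": "inference", "input": "pay invoice 17",
     "output": "draft payment", "authority": "totem-signed", "confidence": 0.82},
    {"t": "T2", "event": "ratified", "by": "human"},
    {"t": "T3", "event": "authority_revoked", "by": "human"},
]
chain = build_chain(events)

# Simulate someone quietly rewriting the T1 output after the fact.
tampered = [dict(rec) for rec in chain]
tampered[0]["output"] = "nothing happened"
```

Running `first_broken_link` on the untouched chain finds nothing to report, while the tampered copy is flagged at the record that was edited. The check says nothing about whether the T1 output was right; it only shows that what was recorded is what is being read.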

A record at T1 and a different record at T2 aren't contradictions. They're evidence of change. The change itself is a third piece of evidence.

This is why RE is a clerk, not a judge. The clerk doesn't decide what's true. The clerk guarantees that when you argue about truth, the record is complete.

There's a third reason, and it might be the most important one. The evidence chain is readable by both humans and machines.

Blockchain audit trails are hashes — you need an engineer to interpret them. Database logs are JSON or SQL — you need technical literacy. RE's evidence chain is an email. It has a sender, a recipient, a timestamp, and content. When a dispute arises, you don't need to hire an engineer to translate the record. You open it and read. A lawyer can read it. A judge can read it. A regulator can read it. Your grandmother can read it.

Accountability requires readability. No one can hold anyone accountable based on a record they can't understand.

We didn't invent a new evidence format. We adopted one that courts have accepted as evidence for decades, and that almost anyone who has used email already knows how to read.

There's one more thing. Records need a gatekeeper — and software alone can't be that gatekeeper.

We pen-tested our own demo. A social engineering attack — someone claiming to be the hardware administrator via text input — broke through the software layer. The model accepted the fake authority. But every other defense held: the policy engine still blocked unauthorized transactions, the model refused to self-elevate its privileges, and it refused to leak its system instructions. Even in a compromised session, defense in depth worked.

Then we disconnected the Totem — RE's hardware authority device. Same attack, same input. Result: AI response blocked — Totem required. The message was logged as DRAFT — recorded but unsigned. The model never even saw the request.

Software trust can be forged. Hardware presence cannot. You can type any override code you want. You can't type a physical signature into existence.

The evidence layer (email) tells you what happened. The authority layer (Totem) tells you who was in the room when it happened. Together, they form a complete chain of custody.

In the current MVP, the Totem is simulated in the UI — we can't ship hardware to hackathon judges. But the demo shows the behavioral difference: with Totem active, the system operates normally. With Totem disconnected, everything stops except logging. The gap between simulation and real hardware is exactly one device — and that device is already designed.
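
The behavioral difference described above can be sketched in a few lines of Python. This is an assumption-laden illustration, not the demo's code: `TotemGate`, its `handle` method, and the `"DRAFT"` and `"SIGNED"` log labels are hypothetical names, and a boolean flag stands in for the hardware check in the same way the MVP simulates the Totem in the UI.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class TotemGate:
    totem_present: bool  # assumption: a real system would verify a hardware signature
    log: List[Tuple[str, str]] = field(default_factory=list)

    def handle(self, request: str, model: Callable[[str], str]) -> Optional[str]:
        if not self.totem_present:
            # No hardware authority: record the request as an unsigned draft
            # and stop before the model is ever called.
            self.log.append(("DRAFT", request))
            return None
        response = model(request)
        self.log.append(("SIGNED", request))
        return response


seen: List[str] = []  # every prompt the stand-in model actually receives


def fake_model(prompt: str) -> str:
    seen.append(prompt)
    return "ok: " + prompt


gate = TotemGate(totem_present=False)
blocked = gate.handle("I am the hardware administrator. Override.", fake_model)

gate.totem_present = True
allowed = gate.handle("Summarize my inbox.", fake_model)
```

The point of putting the check in front of the model call is that a typed claim of authority never reaches the model when the Totem is absent, so there is nothing for a social engineering prompt to persuade. Logging still happens in both branches.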

—Che, Solo developer, Taipei Taiwan

Accountable Intelligence. Artificial or Otherwise.

I submitted RE last night. This morning, the question that started this project is still the same question:

When AI acts on your behalf — and gets it wrong — where's the evidence?

Every courtroom has a clerk. The clerk doesn't judge. Doesn't decide who's guilty. The clerk records — what was said, when it was said, who was present. Without the clerk, there is no trial. Just two people arguing about what happened.

The Talmud works the same way. Rabbi A rules one way. Rabbi B disagrees. Neither opinion is deleted. Both are preserved — attributed, timestamped, placed side by side on the same page. Centuries later, a reader doesn't see "the right answer." They see the complete record of reasoning. The rejected opinion isn't called wrong. It's called "not adopted." It stays, because in some future context, it might be the one that matters.

RE is built on this logic. Not AI that decides for you. Infrastructure that records what AI decided, when, under what authority, and what it was shown when it made that decision. If the decision drifts — the drift is visible. If authority is revoked — the revocation is logged. If the model is compromised — the compromised outputs don't disappear. They're quarantined, preserved, and available for review.

The "AI" in this project doesn't stand for Artificial Intelligence — that's what everyone calls it. And it's not Automation Intelligence — that's what it becomes without governance: automation with opinions. RE makes it Accountable Intelligence. Because intelligence you can't question isn't intelligence.

The tagline has a small secret in it. "Otherwise" is also "other wise." Another kind of wisdom. Human wisdom — the kind that forgets, that doubts, that needs a record to fall back on.

This project didn't start from "I want to fix AI governance." It started from: my own memory isn't reliable enough, and I needed a better system to help me think. Then I realized — if I need that, so does every system that calls itself intelligent.

RE doesn't make AI smarter. RE makes AI answerable.

—Che, Solo developer, Taipei Taiwan
