Confidential Inference With TEE?

Inspiration

Almost everyone has asked an LLM a question they would not want anyone else to see. Finding a way to ensure that prompts are actually kept secret is imperative, both to keep personal information private and to keep company secrets from being leaked by developers looking to vibecode.

What it does

An LLM runs inside a Trusted Execution Environment (TEE). A TEE is an isolated component of the server's hardware. Information inside the TEE cannot be accessed by the rest of the server, so the server the TEE runs on can never read the prompts sent to the model. The guarantee that a TEE was actually used to process the requests comes in the form of an "attestation". Queries to and responses from the TEE go through a secure TLS channel.
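To make the role of the attestation more concrete, here is a minimal sketch of the kind of check a client could run before sending a prompt. The writeup does not describe the report format, so everything here is an assumption: the `AttestationReport` fields, the idea that the report carries a hash of the TEE's TLS certificate, and the function name `binds_channel` are all hypothetical. The sketch also assumes the hardware vendor's signature on the report has already been verified elsewhere.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class AttestationReport:
    """Hypothetical report whose vendor signature was already checked."""
    measurement: bytes  # hash of the code loaded into the TEE
    report_data: bytes  # assumed here to be a hash of the TEE's TLS certificate


def binds_channel(report: AttestationReport,
                  expected_measurement: bytes,
                  tls_cert_der: bytes) -> bool:
    """Return True if the report matches the expected code and this TLS certificate."""
    cert_hash = hashlib.sha256(tls_cert_der).digest()
    # compare_digest avoids leaking how many leading bytes matched
    same_code = hmac.compare_digest(report.measurement, expected_measurement)
    same_channel = hmac.compare_digest(report.report_data, cert_hash)
    return same_code and same_channel


# Made-up values for illustration only
cert = b"example certificate bytes"
expected = hashlib.sha256(b"inference server image").digest()
report = AttestationReport(measurement=expected,
                           report_data=hashlib.sha256(cert).digest())
print(binds_channel(report, expected, cert))
```

Tying the report to the certificate is one common way to make sure the TLS channel actually ends inside the attested TEE and not at the host; whether this project does it that way is not stated.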

How we built it

We ran TinyLlama on the TEE. The server and client are implemented in Python, and the attestation is verified in Rust.
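The writeup does not say how the Python code hands the attestation to the Rust verifier. One simple option, shown here purely as an assumption, is to run the verifier as a separate process, pass the report on stdin, and treat exit code 0 as "accepted". The function name `verify_with_external` and the exit-code convention are hypothetical, and a small Python stand-in replaces the Rust binary so the sketch runs on its own.

```python
import subprocess
import sys
from typing import Sequence


def verify_with_external(cmd: Sequence[str], report: bytes,
                         timeout: float = 10.0) -> bool:
    """Send the report to a verifier process on stdin; exit code 0 means accepted."""
    result = subprocess.run(cmd, input=report, capture_output=True, timeout=timeout)
    return result.returncode == 0


# Stand-in for the Rust verifier (assumption): accepts any non-empty input
stand_in = [
    sys.executable, "-c",
    "import sys; data = sys.stdin.buffer.read(); sys.exit(0 if data else 1)",
]
print(verify_with_external(stand_in, b"report bytes"))
```

A process boundary keeps the two languages loosely coupled; a native extension module would be an alternative with lower overhead but more build complexity.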

Challenges we ran into

Coming up with a protocol and building the Python-Rust interface.

Accomplishments that we're proud of

Coming up with a complete system that ensures secure inference.

What we learned

We dug much deeper into which security questions need to be considered with regard to AI. For most of our team, it was the first time working with TEEs.