Inspiration
LLM apps are flying blind in production.
We kept seeing the same pattern: developers ship something that works, then immediately lose all visibility once traffic, latency, and costs show up. Traditional observability tools either drown you in config or completely ignore LLM-specific realities like prompts, tokens, and model latency.
Tracey was inspired by a simple question: “Why does tracing LLM apps feel 10x harder than building them?”
We wanted tracing that feels like writing JavaScript, not configuring infrastructure.
What it does
Tracey is a lightweight tracing SDK purpose-built for LLM applications in Node.js.
It gives you:
- Automatic HTTP tracing (incoming + outgoing)
- First-class LLM spans (model, prompt, response, token usage, latency)
- End-to-end request context across HTTP → LLM → DB
- Zero-config setup with production-ready exporters
All of that in under 200 lines of core code, without pulling in the OpenTelemetry universe.
If your app calls an LLM, Tracey tells you what happened, where it happened, and how much it cost.
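To make the idea of a first-class LLM span concrete, here is a minimal sketch of what a helper like `withLLMSpan()` could record. The signature, the `recordedSpans` sink, and the `{ text, usage }` response shape are assumptions made for illustration, not Tracey's actual API.

```javascript
const { randomBytes } = require("node:crypto");
const { performance } = require("node:perf_hooks");

// Hypothetical in-memory sink; a real SDK would hand spans to an exporter.
const recordedSpans = [];

// Assumed signature: meta = { model, prompt }, call = function performing the LLM request.
// The call may return a value or a promise of { text, usage } (assumed shape).
function withLLMSpan(meta, call) {
  const span = {
    spanId: randomBytes(8).toString("hex"),
    kind: "llm",
    model: meta.model,
    prompt: meta.prompt,
  };
  const start = performance.now();

  // Close the span exactly once, on success or failure.
  const finish = (response, error) => {
    span.latencyMs = performance.now() - start;
    if (error) span.error = String(error);
    if (response) {
      span.response = response.text;
      span.usage = response.usage;
    }
    recordedSpans.push(span);
  };

  let result;
  try {
    result = call();
  } catch (err) {
    finish(null, err);
    throw err;
  }
  if (result && typeof result.then === "function") {
    return result.then(
      (res) => { finish(res); return res; },
      (err) => { finish(null, err); throw err; }
    );
  }
  finish(result);
  return result;
}

// Usage with a stand-in for a real provider client.
const fakeLLM = async () => ({ text: "hello", usage: { inputTokens: 3, outputTokens: 1 } });
withLLMSpan({ model: "demo-model", prompt: "Say hello" }, fakeLLM)
  .then(() => console.log(JSON.stringify(recordedSpans, null, 2)));
```

The span is plain JSON, so it can be read in a log file without any UI.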
How we built it
Tracey is simple by design.
Core building blocks
Instrumentation Layer
- Monkey-patches `http`/`https`
- Express middleware for server spans
- Explicit `withLLMSpan()` helper for LLM calls
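The monkey-patching step can be sketched roughly as follows. This is an illustration under assumptions, not Tracey's source: `installHttpTracing`, `outgoingSpans`, and the marker symbol are hypothetical names. The wrapper only listens for the request's `close` event, so it does not add `response` or `error` listeners that could change how the application's own handlers behave.

```javascript
const http = require("node:http");
const https = require("node:https");

// Hypothetical marker so a second install (or a second copy of the SDK) is a no-op.
const PATCHED = Symbol.for("tracey.sketch.patched");

// Hypothetical in-memory sink for finished client spans.
const outgoingSpans = [];

function patchRequest(mod, protocol) {
  const original = mod.request;
  if (original[PATCHED]) return false; // already wrapped, leave it alone

  function tracedRequest(...args) {
    const startedAt = Date.now();
    const req = original.apply(this, args);
    // 'close' fires once the request has completed or its connection ended early.
    req.once("close", () => {
      outgoingSpans.push({
        kind: "http.client",
        protocol,
        method: req.method,
        host: req.host,
        path: req.path,
        durationMs: Date.now() - startedAt,
      });
    });
    return req;
  }

  tracedRequest[PATCHED] = true;
  mod.request = tracedRequest;
  return true;
}

// Returns [httpPatched, httpsPatched]; both are false on repeat calls.
function installHttpTracing() {
  return [patchRequest(http, "http"), patchRequest(https, "https")];
}
```

Calling `installHttpTracing()` once at startup is enough; repeat calls do nothing. One caveat of this sketch: in current Node versions `http.get` does not go through the patched `request` export, so it would need the same wrapper.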
Tracer Core
- Span lifecycle management
- Context propagation using `AsyncLocalStorage`
- Cryptographically secure trace/span IDs
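A tracer core along these lines can be sketched in a few dozen lines. The names `withSpan`, `activeSpan`, and `finishedSpans` are hypothetical, and the ID sizes (16 random bytes for a trace, 8 for a span) are an assumption borrowed from the W3C Trace Context format rather than something the project states.

```javascript
const { AsyncLocalStorage } = require("node:async_hooks");
const { randomBytes } = require("node:crypto");

const activeSpan = new AsyncLocalStorage(); // holds the span for the current async context
const finishedSpans = []; // hypothetical sink; a real tracer would call an exporter

const newTraceId = () => randomBytes(16).toString("hex"); // 32 hex chars
const newSpanId = () => randomBytes(8).toString("hex"); // 16 hex chars

// Runs fn inside a new span. The parent is whatever span is active when
// withSpan is called, even across awaits and timers.
function withSpan(name, fn) {
  const parent = activeSpan.getStore();
  const span = {
    name,
    traceId: parent ? parent.traceId : newTraceId(),
    spanId: newSpanId(),
    parentId: parent ? parent.spanId : null,
    startTime: Date.now(),
  };
  const end = (error) => {
    span.durationMs = Date.now() - span.startTime;
    if (error) span.error = String(error);
    finishedSpans.push(span);
  };

  // run() scopes the store to this callback, so the span cannot leak to callers.
  return activeSpan.run(span, () => {
    let result;
    try {
      result = fn(span);
    } catch (err) {
      end(err);
      throw err;
    }
    if (result && typeof result.then === "function") {
      return result.then(
        (value) => { end(); return value; },
        (err) => { end(err); throw err; }
      );
    }
    end();
    return result;
  });
}

// Usage: the inner span finds its parent after an await.
withSpan("http.request", async () => {
  await new Promise((resolve) => setTimeout(resolve, 10));
  await withSpan("llm.call", async () => "ok");
}).then(() => console.log(JSON.stringify(finishedSpans, null, 2)));
```

Using `run()` instead of `enterWith()` is a deliberate choice in this sketch: the context ends when the callback returns, which is one way to avoid the context leaks mentioned later.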
Exporter Layer
- Console (dev)
- File (staging)
- Datadog Logs API (production)
Everything runs synchronously to guarantee spans are flushed before process exit. No collectors, no agents, no sidecars.
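As a rough sketch of the exporter idea, the console and file exporters can share one tiny interface and write synchronously. The names `consoleExporter`, `fileExporter`, and `exportSpan`, and the JSON Lines file format, are assumptions for illustration. The Datadog exporter is left out because its request details are not described here.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Dev: one JSON object per line on stdout.
function consoleExporter() {
  return { export: (span) => console.log(JSON.stringify(span)) };
}

// Staging: append one JSON object per line. appendFileSync blocks until the
// write is handed to the OS, so nothing is pending when the process exits.
function fileExporter(filePath) {
  return {
    export: (span) => fs.appendFileSync(filePath, JSON.stringify(span) + "\n"),
  };
}

// Fan a finished span out to every configured exporter.
function exportSpan(exporters, span) {
  for (const exporter of exporters) exporter.export(span);
}

// Usage: the file path is an arbitrary example location.
const exporters = [
  consoleExporter(),
  fileExporter(path.join(os.tmpdir(), "tracey-sketch.jsonl")),
];
exportSpan(exporters, { name: "llm.call", durationMs: 120 });
```

The trade-off of synchronous writes is that each span blocks the event loop briefly, which is why span batching appears on the roadmap.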
The result: tracing you can read, debug, and extend in one sitting.
Challenges we ran into
- Async context propagation: Node’s async model is brutal if you don’t get it right. `AsyncLocalStorage` saved us, but required careful handling to avoid context leaks.
- LLM response normalization: Different providers expose usage data differently. We designed flexible parsing so token and cost tracking still works.
- Monkey-patching safely: Patching HTTP without breaking apps is harder than it sounds. We ensured patches run once and add near-zero overhead.
- Keeping it small: Every feature had to justify its existence. If it bloated the SDK, it was cut.
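The response-normalization challenge can be illustrated with a small sketch. It assumes two common response shapes: OpenAI-style `usage.prompt_tokens` / `usage.completion_tokens` and Anthropic-style `usage.input_tokens` / `usage.output_tokens`. The function names and the pricing object are hypothetical, and prices are supplied by the caller rather than hard-coded.

```javascript
// Map provider-specific usage fields onto one shape.
// Missing fields fall back to 0 so tracing never throws on an unknown response.
function normalizeUsage(response) {
  const usage = (response && response.usage) || {};
  const inputTokens = usage.prompt_tokens ?? usage.input_tokens ?? 0;
  const outputTokens = usage.completion_tokens ?? usage.output_tokens ?? 0;
  const totalTokens = usage.total_tokens ?? inputTokens + outputTokens;
  return { inputTokens, outputTokens, totalTokens };
}

// pricing = { inputPerMillion, outputPerMillion } in USD, provided by the caller.
function estimateCostUsd(usage, pricing) {
  return (
    (usage.inputTokens * pricing.inputPerMillion +
      usage.outputTokens * pricing.outputPerMillion) /
    1_000_000
  );
}

// Usage with placeholder prices (not real provider rates).
const openAiStyle = { usage: { prompt_tokens: 10, completion_tokens: 5, total_tokens: 15 } };
const anthropicStyle = { usage: { input_tokens: 7, output_tokens: 3 } };
console.log(normalizeUsage(openAiStyle));
console.log(normalizeUsage(anthropicStyle));
console.log(estimateCostUsd(normalizeUsage(openAiStyle), { inputPerMillion: 1, outputPerMillion: 2 }));
```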
Accomplishments that we're proud of
- 🚀 <200 lines of core tracing logic
- 🧠 LLM-aware spans (not retrofitted)
- ⚡ Zero-config auto-instrumentation
- 🔗 Correct parent-child spans across async boundaries
- 📦 <50KB package size
- 🧑‍💻 Readable JSON traces you can debug without a UI
- 🏁 Hackathon-ready but production-safe
Most tracing tools are black boxes. Tracey is something you can actually understand.
What we learned
- Observability tooling doesn’t need to be complex to be powerful
- LLMs need first-class tracing, not generic spans
- Async context propagation is the hardest part — everything else is details
- Developers trust tools they can read
🗺️ What’s next for Tracey
Core Stability (Current)
- HTTP client/server instrumentation
- LLM helpers with token tracking
- Console, file, and Datadog exporters
- Async context propagation
- Comprehensive test suite
- TypeScript definitions
Advanced Features
- Undici / fetch instrumentation (modern HTTP clients)
- Distributed tracing with W3C `traceparent` headers
- Sampling and rate limiting
- Span batching for exporters
- Metrics integration (counters, histograms)
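For the `traceparent` item, the header format itself is small. The sketch below handles only the version `00` layout from the W3C Trace Context specification (`00-<32 hex trace id>-<16 hex parent id>-<2 hex flags>`); the function names are hypothetical and this is not a planned Tracey API.

```javascript
// version 00: "00-" + 32 hex trace id + "-" + 16 hex parent id + "-" + 2 hex flags
const TRACEPARENT_V00 = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;
const ALL_ZEROS = /^0+$/;

// Returns null for anything that is not a valid version-00 header.
function parseTraceparent(header) {
  const match = TRACEPARENT_V00.exec(String(header).trim());
  if (!match) return null;
  const [, traceId, parentId, flags] = match;
  // The spec treats all-zero IDs as invalid.
  if (ALL_ZEROS.test(traceId) || ALL_ZEROS.test(parentId)) return null;
  return {
    traceId,
    parentId,
    sampled: (parseInt(flags, 16) & 0x01) === 0x01, // lowest bit is the sampled flag
  };
}

// Build the header to send downstream; spanId becomes the receiver's parent id.
function formatTraceparent({ traceId, spanId, sampled }) {
  return `00-${traceId}-${spanId}-${sampled ? "01" : "00"}`;
}

// Usage with the example value from the specification.
const incoming = parseTraceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01");
console.log(incoming);
console.log(formatTraceparent({ traceId: incoming.traceId, spanId: "b7ad6b7169203331", sampled: incoming.sampled }));
```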
Ecosystem
- More exporters (Jaeger, Zipkin, Honeycomb)
- Pre-built instrumentations (MongoDB, Redis, Postgres)
- Native LLM provider SDKs (OpenAI, Anthropic, Cohere)
- Dashboard UI for span visualization
- Performance benchmarks and optimization
🔍 How Tracey Compares
| Feature | Tracey | OpenTelemetry | Custom Logging |
|---|---|---|---|
| Setup time | 5 minutes | 2–4 hours | 1–2 days |
| Bundle size | <50 KB | 10+ MB | Varies |
| LLM-specific features | ✅ Built-in | ❌ Manual | ❌ Manual |
| Auto HTTP tracing | ✅ Yes | ✅ Yes | ❌ No |
| Context propagation | ✅ Async-aware | ✅ Yes | ❌ Manual |
| Cost | Free | Free | Free |
| Learning curve | 📚 Minimal | 📚📚📚 Steep | 📚📚 Medium |
The long-term direction is simple:
Make LLM observability boring, predictable, and cheap.
Tracey will stay:
- Small instead of bloated
- Explicit instead of magical
- LLM-first instead of retrofitted
If OpenTelemetry is Kubernetes-level observability, Tracey is "npm install and ship" observability for LLM apps.