GPT‑OSSynapse

A glass‑skull, GPIO‑driven light sculpture that makes large‑language‑model activity visible in the physical world.


TL;DR

GPT‑OSSynapse takes activation signals from an open‑source GPT‑class model (gpt‑oss‑20b | gpt-oss-120b) and maps them in real time to clusters of LEDs embedded inside a glass skull. Each “neural‑LED” group represents a summarized slice of the model’s internal state (token pace, reasoning/completion tokens), producing a living, interpretable light display for prompts and responses.


Inspiration

Long before I’d heard of Geoffrey Hinton, Alec Radford, or OpenAI’s GPT‑3 private beta, I was obsessed with the junction of philosophy, technology, and the mind. Neuromancer, Her, Westworld, HBO’s Silicon Valley, Vanilla Sky, Total Recall, The Matrix - all versions of the same (gl)itch: to manifest the mind. According to Carl Sagan, the universe might be doing something similar:

“We are a way for the universe to know itself.”

In 2018, while taking data science courses at U of T, I came across early NLP models like word2vec and GloVe and first learned about OpenAI. The math stretched my brain, but the core idea stuck: Hinton had shown a path to building general-purpose thinking machines, and folks were well on their way to bringing the technology to the masses.

GPT‑OSSynapse is my literal take on thinking machines: turning neural network activations into photons.


What it does

GPT‑OSSynapse is an electronic‑brain sculpture. A text prompt flows through a GPT‑class model; a small “tap” process extracts lightweight telemetry that streams to a Raspberry Pi which drives clusters of addressable LEDs inside a glass skull. The goal isn’t strict mechanistic interpretability; it’s embodied intuition - a visceral sense for pacing, difficulty, confidence, and surprise as the model “thinks.”

  • Reasoning tokens/sec → LED A (Red)
    • Source: streaming delta.reasoning from openai/gpt-oss-20b | openai/gpt-oss-120b via OpenRouter
    • Pattern: slow sinusoidal cross‑fade; intensity scales with EWMA of tps
  • Completion tokens/sec → LED B (White)
    • Source: streaming delta.content (if present)
    • Pattern: Lissajous sweep; intensity scales with EWMA of tps
  • Bursts (≥30 tps, either channel) → dual‑channel heartbeat accent
  • Idle (no tokens for >100 ms) → LEDs fully off (PWM stopped, pins LOW)
  • Smoothing: EWMA (per‑channel) + gamma correction, no smoothing at zero
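The EWMA + gamma pipeline above can be sketched as follows (a minimal sketch; the alpha, gamma, and 200 tps full-scale values are illustrative assumptions, not the project's actual constants):

```python
def ewma(prev: float, sample: float, alpha: float = 0.3) -> float:
    """Exponentially weighted moving average; alpha=1 would track instantly."""
    return alpha * sample + (1.0 - alpha) * prev

def gamma_correct(level: float, gamma: float = 2.2) -> float:
    """Map perceptual brightness (0..1) to PWM duty (0..1)."""
    return max(0.0, min(1.0, level)) ** gamma

# Per-channel state; a zero sample bypasses smoothing so idle snaps to off.
smoothed = 0.0

def next_duty(tps_sample: float) -> float:
    global smoothed
    if tps_sample <= 0.0:
        smoothed = 0.0          # "no smoothing at zero"
        return 0.0
    smoothed = ewma(smoothed, tps_sample)
    level = min(smoothed / 200.0, 1.0)  # 200 tps assumed full-scale
    return gamma_correct(level)
```

The zero-bypass is what keeps the idle rule crisp: a decaying EWMA would otherwise leave the LEDs glowing faintly after the stream stops.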

How we built it

Hardware

  • Glass skull (hollow; bottom‑drilled 6.5 mm access port). Drilling: pilot hole with a small glass & tile bit, then stepped up to a 1/4” bit for final size; slow RPM, light pressure, constant water for cooling, and deburred the edge.
  • Raspberry Pi 4 (2 GB)
  • Flexible enamelled‑copper‑wire LED strips (~120–200 nodes)
  • 5.1 V / 5.0 A regulated DC PSU

Software

  • Inference (cloud): OpenRouter API with openai/gpt-oss-20b | openai/gpt-oss-120b. We consume the live SSE stream and classify each chunk:
    • If delta.reasoning has text → channel="reasoning"
    • Else if delta.content has text → channel="completion"
  • Transport (to Pi): HTTP POST to the Pi’s local LED server at /update with { channel, delta, duration } where delta/duration = tokens/sec estimate for the last batch. UDS is supported for localhost piping.
  • Controller (Pi): led_chat_tokens.py
    • Dual PWM driver (GPIO BOARD 40=A, 33=B) with on‑demand PWM start, immediate stop + pins LOW at idle
    • Per‑channel EWMA smoothing + pattern mapper
  • Profiles: Cross‑fade (reasoning), Lissajous (completion), Heartbeat (bursts), Off (idle)
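The tap-to-Pi transport might look like this on the sender side (a sketch using only the standard library; the host argument and the exact delta/duration semantics are assumptions based on the description above):

```python
import json
import urllib.request

def build_payload(channel: str, tokens: int, elapsed: float) -> dict:
    # delta/duration together give the tokens/sec estimate for the last batch
    return {"channel": channel, "delta": tokens, "duration": elapsed}

def post_update(host: str, channel: str, tokens: int, elapsed: float) -> None:
    """POST the latest batch telemetry to the Pi's /update endpoint."""
    req = urllib.request.Request(
        f"http://{host}/update",
        data=json.dumps(build_payload(channel, tokens, elapsed)).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Short timeout: lighting must never stall inference
    urllib.request.urlopen(req, timeout=0.5)
```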

Physical layout: Dual channels (A/B) wired to red/white LED groups: Reasoning → A, Completion → B. Idle state keeps both channels LOW.


Challenges

  • Power integrity: Early brown‑outs caused LED resets and Pi USB dropouts. Fixed with the official Raspberry Pi 27 W USB‑C PSU.
  • GPIO jitter: WS2812 timing was noisy under Linux pre‑emption. Moved LED writes to a DMA‑driven library and kept heavy compute off the Pi.
  • Mapping the firehose: Raw activations are massive. Summarization (bucketed layer norms + entropy) keeps it <2 kB/s.
  • Idle flicker on PWM: Even 0% duty could flicker with constant updates. Fixed by gating updates, stopping PWM at idle, and driving pins LOW.
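The idle-flicker fix generalizes to a small gate around the PWM driver. This sketch accepts anything with the RPi.GPIO-style start/ChangeDutyCycle/stop interface, so the 100 ms gating logic can be shown (and tested) without hardware:

```python
import time

IDLE_AFTER_S = 0.1  # no tokens for >100 ms → kill PWM entirely

class PwmGate:
    """Starts PWM on demand and stops it when the token stream goes quiet."""

    def __init__(self, pwm, now=time.monotonic):
        self.pwm = pwm          # object with start/ChangeDutyCycle/stop
        self.now = now
        self.running = False
        self.last_token = None

    def on_tokens(self, duty: float) -> None:
        """Called whenever fresh telemetry arrives for this channel."""
        self.last_token = self.now()
        if not self.running:
            self.pwm.start(0)
            self.running = True
        self.pwm.ChangeDutyCycle(duty)

    def tick(self) -> None:
        """Called periodically from the render loop."""
        if self.running and self.last_token is not None:
            if self.now() - self.last_token > IDLE_AFTER_S:
                self.pwm.stop()  # in led_chat_tokens.py the pin is also driven LOW
                self.running = False
```

Gating updates this way means a 0% duty cycle never reaches the pin at idle; the PWM carrier is simply gone.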

Learnings

  • Skulls are hard—but crack easily. Drill slow, tape surfaces, step up bit sizes, and support the interior.
  • Smoothing is sanity. A moving average on metrics keeps motion organic without hiding spikes of insight.
  • Interpretability ≠ proof. Lights are intuition, not theorems.
  • Decouple everything. Inference, transport, and lighting each fail differently; loose coupling = resilience.

For Future Iteration

  1. LEDs grouped into “lobes”: Frontal, temporal L/R, parietal, occipital, brainstem ring. Group indices map to metric channels in software.
  2. Next.js web app: live video feed and xterm integration for browser‑based interaction
  3. RP2040 variant: Mini-skulls with Feather-style board
  4. 3D‑printed lattice: Internal LED scaffold that aligns “lobes” consistently across builds.
  5. Open source kit: BOM, STLs, wiring, and a Docker compose for the full stack.

Code Samples

Token Tap

# Inside the SSE streaming loop: classify each chunk by channel and
# forward telemetry to the LED controller.
if 'choices' in data and data['choices']:
    choice = data['choices'][0]

    if 'delta' in choice:
        delta = choice['delta']
        content = ""
        channel = "completion"

        # Completion tokens arrive in delta.content
        if 'content' in delta and delta['content']:
            content = delta['content']
        # Reasoning tokens arrive in delta.reasoning (GPT-OSS models)
        elif 'reasoning' in delta and delta['reasoning']:
            content = delta['reasoning']
            channel = "reasoning"

        if content:
            # tokens and elapsed are tracked by the surrounding stream loop
            self._update_led_channel(channel, tokens, elapsed)
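The tokens and elapsed values above come from the surrounding stream loop; one rough way to produce them is to batch chunks and time each batch (a sketch that assumes one streamed chunk ≈ one token, which is approximately true for SSE deltas but not guaranteed):

```python
import time

class RateEstimator:
    """Emits (tokens, elapsed) once per batch of streamed chunks."""

    def __init__(self, batch: int = 5, now=time.monotonic):
        self.batch = batch
        self.now = now
        self.count = 0
        self.start = None

    def add_chunk(self):
        """Count one chunk; returns (tokens, elapsed) at batch end, else None."""
        if self.start is None:
            self.start = self.now()
        self.count += 1
        if self.count >= self.batch:
            elapsed = max(self.now() - self.start, 1e-6)
            out = (self.count, elapsed)
            self.count, self.start = 0, None
            return out
        return None
```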
LED Map

import math  # log10 compresses the wide tps range into a 0..1 intensity

def map_rate_to_pattern(t: float, tokens_per_second: float) -> tuple[float, float]:
    """Pick a pattern by rate; clamp and pattern_* helpers live elsewhere."""
    rate = max(0.0, tokens_per_second)
    if rate < 0.5:
        return pattern_idle_breathe(t)
    intensity = clamp(math.log10(1.0 + rate) / math.log10(1.0 + 200.0), 0.0, 1.0)
    if rate < 15:
        return pattern_active_crossfade(t, intensity)
    if rate < 80:
        return pattern_stream_lissajous(t, intensity)
    return pattern_burst_heartbeat(t, intensity)
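The pattern_* helpers aren't shown above; minimal versions of the crossfade and heartbeat might look like this (assumed waveform shapes — a 0.5 Hz sine crossfade and twin Gaussian pulses — not the project's exact curves):

```python
import math

def pattern_active_crossfade(t: float, intensity: float) -> tuple[float, float]:
    """Channels A/B fade in opposite phase on a slow sine."""
    phase = (math.sin(2 * math.pi * 0.5 * t) + 1.0) / 2.0  # 0..1 at 0.5 Hz
    return (intensity * phase, intensity * (1.0 - phase))

def pattern_burst_heartbeat(t: float, intensity: float) -> tuple[float, float]:
    """Double pulse per beat: two narrow bumps early in each 1 s cycle."""
    cycle = t % 1.0
    beat = (math.exp(-((cycle - 0.10) ** 2) / 0.002)
            + 0.6 * math.exp(-((cycle - 0.35) ** 2) / 0.002))
    level = intensity * min(beat, 1.0)
    return (level, level)  # the burst accent drives both channels
```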

Four patterns:

  • Idle Breathe (< 0.5 TPS) - gentle breathing
  • Active Crossfade (0.5-15 TPS) - LEDs fade in opposite phases
  • Stream Lissajous (15-80 TPS) - mathematical curves
  • Burst Heartbeat (> 80 TPS) - double-pulse heartbeat

Channel mapping:

  • Reasoning tokens → LED A (slower, subtle)
  • Completion tokens → LED B (direct intensity)

  • Collaborations welcome (hardware refinement, museum installs, or live performance).
