PHYSICIAN

Physician UI
Use case: uploading a robotic arm to predict future movement outcome
Physician predicting outcome of robotic arm movement
safe move: Physician predicted a safe move by the robotic arm
Physician blocked a dangerous movement by the robotic arm

Inspiration

I’m a Mechanical Engineering student who loves building things that move. I read the equations before I read the headlines: forces, friction, balance. I also study AI because it will shape how machines move and decide in our world. I’ve seen demos of “smart” systems confidently issuing commands that any engineer would recognize as dangerous: robots told to “move quickly” on slippery floors, drones instructed to fly in gusty wind with no prediction of torque changes, arms asked to lift loads with a dangerously high center of gravity.

That gap between digital intelligence and mechanical reality frightened me. Machines are increasingly entrusted with human safety, in factories, on roads, and in public spaces, yet many of them operate without a basic understanding of physics. I built PHYSICIAN because humanity deserves systems that treat physical reality as a first-class constraint, not an afterthought. This project is my small attempt to give AI the mechanical intuition I learned in class: the instinct to respect gravity, friction, and balance, and the humility to refuse an action when physics says “no.”

What it does

PHYSICIAN is a Physical OS / Verification Gate that sits between high-level AI intent and physical actuation. Its job is simple and critical:

Observe — take a camera frame or telemetry and a proposed action.
Infer — use vision + reasoning to estimate physical parameters (mass, friction μ, center of mass offset, slope).
Verify — instantiate those estimates in a deterministic Digital Twin (PyBullet) and simulate the action over a short horizon (T+2s).
Decide — return a binary, auditable verdict: GO (safe) or BLOCKED (unsafe), plus an explanation and a Trace ID.

PHYSICIAN prevents unsafe motion, explains the failure (“Traction loss at t=1.2s — slip > 2 m”), and records the proof so humans can audit what happened.

How we built it

User / AI Intent
↓ (Verify)
Perception (image / telemetry)
↓
Reasoning Layer (Gemini 3 Flash) → produces Physics JSON (mass, μ, CoM, action vector)
↓
Bridge API (FastAPI + Pydantic validation)
↓
Physics Kernel (PyBullet Digital Twin)
↓
Safety Gate (threshold checks) → GO / BLOCK + Trace ID
↓
Operator Dashboard / Controller

Key Components

Perception & Reasoning (The Brain)

- Gemini 3 Flash (Vision-Language-Action) is prompted with a structured "Chain of Causation" template that returns strict JSON: { mass: float, friction: float, com_offset: [x,y,z], action: {...} }.
- Pydantic schemas validate and type-check every value to prevent hallucinated formats.
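As a rough illustration of the validation step, here is a minimal sketch that checks the Physics JSON shape using only the Python standard library. It stands in for the Pydantic schema described above and is not the project's actual code. The class name `PhysicsEstimate`, the function `parse_physics_json`, the units, and the range rules (positive mass, non-negative friction) are all assumptions.

```python
import json
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicsEstimate:
    """Hypothetical container mirroring the Physics JSON fields."""
    mass: float                              # assumed to be in kg
    friction: float                          # friction coefficient (mu)
    com_offset: tuple[float, float, float]   # centre-of-mass offset [x, y, z]
    action: dict                             # left opaque; its fields are not specified


def _finite_number(value, name: str) -> float:
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError(f"{name} must be a number, got {value!r}")
    value = float(value)
    if not math.isfinite(value):
        raise ValueError(f"{name} must be finite, got {value!r}")
    return value


def parse_physics_json(raw: str) -> PhysicsEstimate:
    """Parse model output and raise ValueError on any malformed field."""
    data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    missing = {"mass", "friction", "com_offset", "action"} - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")

    mass = _finite_number(data["mass"], "mass")
    if mass <= 0:
        raise ValueError("mass must be positive")
    friction = _finite_number(data["friction"], "friction")
    if friction < 0:
        raise ValueError("friction must be non-negative")

    com = data["com_offset"]
    if not isinstance(com, list) or len(com) != 3:
        raise ValueError("com_offset must be a list of three numbers")
    com_offset = tuple(_finite_number(c, f"com_offset[{i}]") for i, c in enumerate(com))

    if not isinstance(data["action"], dict):
        raise ValueError("action must be an object")
    return PhysicsEstimate(mass, friction, com_offset, data["action"])
```

Rejecting the whole response on the first bad field keeps a malformed estimate from ever reaching the simulator; a real deployment would presumably get the same behaviour from Pydantic's own validation errors.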

Bridge & API

- FastAPI receives intent and telemetry, validates with Pydantic, logs requests, queues verification jobs, and returns synchronous or async verdicts.

Physics Kernel (The Body / Ring-0)

- PyBullet runs headless deterministic simulations of the digital twin over a T+2s horizon.
- The simulation applies the AI-inferred constants and the intended actuation, then records metrics: slip distance, tilt angle, contact forces, time-to-instability.
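To make the idea of a short-horizon rollout concrete, the sketch below integrates a rigid block resting on an incline with a fixed time step and reports how far it slips within the horizon. It is a deliberately simplified, standard-library stand-in for the PyBullet digital twin, not the kernel itself. The function name `simulate_slide`, the incline scenario, and the returned keys are assumptions; the 1/240 s step matches PyBullet's default time step.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def simulate_slide(friction: float, slope_rad: float,
                   horizon_s: float = 2.0, dt: float = 1.0 / 240.0,
                   slip_limit_m: float = 2.0) -> dict:
    """Fixed-step rollout of a block that starts at rest on an incline.

    Mass cancels out of the sliding equation, so it is not a parameter here.
    """
    # Net downhill acceleration; if friction can hold the block, it stays put.
    accel = max(G * (math.sin(slope_rad) - friction * math.cos(slope_rad)), 0.0)

    steps = round(horizon_s / dt)
    velocity = 0.0
    slip = 0.0
    time_over_limit = None
    for step in range(steps):
        # Semi-implicit Euler: update velocity first, then position.
        velocity += accel * dt
        slip += velocity * dt
        if time_over_limit is None and slip > slip_limit_m:
            time_over_limit = (step + 1) * dt

    return {
        "slip_distance": slip,               # metres travelled downhill
        "final_speed": velocity,             # m/s at the end of the horizon
        "time_over_limit": time_over_limit,  # seconds, or None if never exceeded
    }
```

Because the step size and step count are fixed and there is no randomness, the same inputs always produce the same metrics, which is the property a replayable audit log depends on.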

Safety Governor

- Rule-based checks (configurable thresholds) evaluate simulation outputs. Example thresholds used in the prototype: slip_distance > 2.0 m or tip_angle > 0.2 rad → BLOCK.
- Each BLOCK produces a cryptographically signed Verification Trace ID and a human-readable forensic explanation.
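The threshold rule can be sketched in a few lines. The two limits below are the prototype values quoted above; the `Verdict` class, the `evaluate` function, the metric dictionary keys, and the wording of the reasons are hypothetical.

```python
from dataclasses import dataclass

SLIP_LIMIT_M = 2.0    # prototype threshold from the write-up
TIP_LIMIT_RAD = 0.2   # prototype threshold from the write-up


@dataclass(frozen=True)
class Verdict:
    decision: str              # "GO" or "BLOCK"
    reasons: tuple[str, ...]   # empty when the action is allowed


def evaluate(metrics: dict,
             slip_limit_m: float = SLIP_LIMIT_M,
             tip_limit_rad: float = TIP_LIMIT_RAD) -> Verdict:
    """Apply rule-based checks to simulation metrics and collect every violation."""
    slip = metrics["slip_distance"]
    tip = abs(metrics["tip_angle"])  # tipping either way counts

    reasons = []
    if slip > slip_limit_m:
        reasons.append(f"slip {slip:.2f} m exceeds {slip_limit_m:.2f} m")
    if tip > tip_limit_rad:
        reasons.append(f"tip {tip:.2f} rad exceeds {tip_limit_rad:.2f} rad")

    return Verdict("BLOCK" if reasons else "GO", tuple(reasons))
```

Collecting all violated rules, rather than stopping at the first, lets the forensic explanation tell a slide apart from a tip or report both at once.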

Operator Interface

- React + Vite dashboard shows the AI’s physics JSON, a simplified simulation replay, the verdict, and a forensic panel for auditors.

Challenges we ran into

Translation Gap (LLM → Float)

LLMs speak prose. Physics engines take floats. We enforced Physics JSON via strict Pydantic schemas and prompt templates that force the model to return numeric estimates plus confidence scores. Where vision cues are weak, PHYSICIAN adopts a pessimistic bias (assume lower μ) to prefer safety.
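One way to express the pessimistic bias is to discount the estimated friction coefficient as the model's confidence drops. The write-up does not give the actual rule, so the linear discount below, the function name `pessimistic_friction`, and the `max_discount` parameter are purely illustrative assumptions.

```python
def pessimistic_friction(mu_estimate: float, confidence: float,
                         max_discount: float = 0.5) -> float:
    """Scale the friction estimate down when the vision cues are weak.

    confidence = 1.0 leaves the estimate unchanged; confidence = 0.0
    reduces it by `max_discount` (50 percent by default). Both the linear
    form and the default value are assumptions.
    """
    if mu_estimate < 0:
        raise ValueError("friction estimate must be non-negative")
    confidence = min(max(confidence, 0.0), 1.0)  # clamp to [0, 1]
    return mu_estimate * (1.0 - max_discount * (1.0 - confidence))
```

The discount only ever lowers μ, so an uncertain estimate can make the simulated surface more slippery than it really is, but never less.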

Latency vs Safety

Simulation + reasoning must be fast. We implemented a two-path strategy:

- Fast Path: simplified checks (coarse friction categories + linear stability) for low-risk actions.
- Deep Audit: full PyBullet simulation for high-energy or high-risk intents.

We use async queues and exponential backoff for model rate limits.
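The sketch below shows one plausible shape for the two-path split and for the retry schedule. Everything specific in it is an assumption: routing on kinetic energy, the 50 J cut-off, the friction value attached to each surface category, and the reading of "linear stability" as a static no-slide and no-tip check on an incline.

```python
import math

# Assumed coarse friction categories; the values are illustrative only.
FRICTION_BY_CATEGORY = {"slick": 0.1, "smooth": 0.3, "rough": 0.6}
ENERGY_LIMIT_J = 50.0  # assumed cut-off between low-risk and high-energy actions


def choose_path(mass_kg: float, speed_m_s: float,
                energy_limit_j: float = ENERGY_LIMIT_J) -> str:
    """Route high-energy intents to the full simulation."""
    kinetic_j = 0.5 * mass_kg * speed_m_s ** 2
    return "deep_audit" if kinetic_j > energy_limit_j else "fast_path"


def fast_path_ok(surface: str, slope_rad: float,
                 com_height_m: float, half_base_m: float) -> bool:
    """Static checks: the body neither slides nor tips on the incline."""
    if com_height_m <= 0:
        raise ValueError("com_height_m must be positive")
    mu = FRICTION_BY_CATEGORY[surface]
    grade = math.tan(slope_rad)
    holds = grade <= mu                           # friction can resist gravity
    stable = grade <= half_base_m / com_height_m  # CoM stays over the base
    return holds and stable


def backoff_delays(retries: int, base_s: float = 0.5, cap_s: float = 8.0) -> list:
    """Exponential backoff schedule: base, 2x base, 4x base, ... up to the cap."""
    return [min(cap_s, base_s * 2 ** attempt) for attempt in range(retries)]
```

For example, `backoff_delays(5)` gives waits of 0.5, 1, 2, 4, and 8 seconds; production retry loops often add random jitter on top, which is omitted here to keep the schedule deterministic.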

Non-deterministic surfaces & noisy vision

Visual textures are ambiguous. We fused secondary cues (specular highlights, shadow gradients) and applied conservative uncertainty margins to simulation inputs.

Deployment complexities

PyBullet is compute-heavy. We built the verification service to run in headless, containerized environments (or WSL / Linux / cloud instances) and designed the UI to run verification only when the user triggers it.

Accomplishments that we're proud of

Zero-to-Physics Pipeline: From a single image to a functioning digital twin and verified safety verdict in seconds (prototype).

Forensic Clarity: PHYSICIAN can distinguish failure modes (Tip vs Slide) and produce human-readable explanations for audits.

Extensible OS Design: The verification layer is modular: replace Gemini with any reasoning model, swap PyBullet for a higher-fidelity solver, or integrate with ROS2 / automotive controllers.

What we learned

LLMs can estimate hidden physical variables when constrained by the right prompts and schema, but their estimates must still be verified.

Constraints create capability — by forcing physics checks you reduce catastrophic failures and improve system trust.

Human-in-the-loop verification is pragmatic: only verify on explicit demand to balance cost and safety.

What's next for PHYSICIAN

Real-time video verification: move from single-frame checks to a sliding-window forecast (simulate <5s into the future).
Digital Twin Sync & HIL: integrate the kernel with real robot arms (UR series) to interrupt motors before physical failure.
Hardware Drivers: create “drivers” for different hardware stacks so PHYSICIAN can run as an embedded safety module (eventual C++ SDK for sub-ms checks).
Fleet & Dashboard: centralized Kinetic Health dashboard for entire fleets reporting trace IDs, near-misses, and trends.
Multi-object & Fluid Support: extend to stacking, multi-agent interactions, and soft-body approximations.

Auditing & Responsible Deployment

Every blocked action is accompanied by:

A deterministic simulation log (snapshot of inputs & results)
A cryptographically-signed Verification Trace ID for forensic replay
A human-readable explanation generated by Gemini that connects cause → effect

This design enables legal, regulatory, and safety teams to review incidents with full reproducibility.
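A signed, replayable trace could look roughly like the sketch below. The write-up does not say which signature scheme PHYSICIAN uses, so HMAC-SHA256 with a shared secret is an assumption, as are the function names `sign_trace` and `verify_trace` and the record layout.

```python
import hashlib
import hmac
import json


def sign_trace(inputs: dict, results: dict, key: bytes) -> dict:
    """Snapshot inputs and results, derive a Trace ID, and sign the snapshot."""
    # Canonical JSON (sorted keys, fixed separators) so the same snapshot
    # always serializes to the same bytes.
    payload = json.dumps({"inputs": inputs, "results": results},
                         sort_keys=True, separators=(",", ":"))
    payload_bytes = payload.encode("utf-8")
    return {
        "trace_id": hashlib.sha256(payload_bytes).hexdigest()[:16],
        "signature": hmac.new(key, payload_bytes, hashlib.sha256).hexdigest(),
        "payload": payload,
    }


def verify_trace(record: dict, key: bytes) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = hmac.new(key, record["payload"].encode("utf-8"),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["signature"])
```

An auditor holding the key can confirm that a log has not been altered and then re-run the deterministic simulation from the stored inputs. If auditors should not hold the signing key, an asymmetric signature would be the more natural choice.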

Built With

gemini
google-ai-studio
google-generative-ai-sdk
react
tailwind-css
typescript

Updates

Ayatullah Hanif Showunmi started this project — Feb 07, 2026 09:07 PM EST
