Scott Bernard: Offline Real-Time Linear Algebra Tutor 🤖👁️
Inspiration
We wanted to create a tutor that could work completely offline, protecting student privacy while still delivering powerful, real-time instruction. Linear algebra was the natural starting point because it’s foundational to so many STEM fields, yet often challenging for students to grasp. Inspired by X-Men’s Cyclops (with a single “eye”) and a touch of humor in the name “Bernard,” we envisioned a cyclopean tutor that blends AI, vision, and voice into one accessible learning companion.
What it does
Scott Bernard is an offline, multimodal AI tutor that helps students learn linear algebra in real time. It can:
- Analyze problems via computer vision (camera or uploaded images).
- Hold natural voice conversations through speech recognition and text-to-speech.
- Provide step-by-step explanations powered by GPT-OSS.
- Run locally or on a remote GPU for cost efficiency while maintaining a responsive WebSocket interface.
- Live inside a custom 3D-printed housing with a webcam “eye,” making the hardware part of the learning experience.
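One tutoring turn (hear a question, reason about it, answer aloud) can be sketched as a three-stage pipeline. This is a hedged illustration only: `transcribe`, `solve`, and `speak` are placeholder names standing in for the project's actual speech-recognition, GPT-OSS, and TTS components.

```python
# Sketch of one tutoring turn: audio in -> text -> answer -> audio out.
# All three stage functions are placeholders, not the project's real API;
# the real system wires in speech recognition, GPT-OSS via vLLM, and TTS.

def transcribe(audio: bytes) -> str:
    """Placeholder for speech recognition."""
    return audio.decode("utf-8")  # stand-in: treat the audio as ready-made text

def solve(question: str) -> str:
    """Placeholder for the GPT-OSS tutoring model."""
    return f"Step-by-step answer to: {question}"

def speak(text: str) -> bytes:
    """Placeholder for text-to-speech."""
    return text.encode("utf-8")

def tutoring_turn(audio: bytes) -> bytes:
    """Run one full voice interaction."""
    return speak(solve(transcribe(audio)))
```

The point of the structure is that each stage is swappable, which is what lets the same loop run against a local model or a remote GPU without changing the interface.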
How we built it
- AI Backbone: Deployed GPT-OSS with vLLM for efficient offline inference.
- Multimodal Pipeline: Combined computer vision, speech recognition, and TTS into a Dockerized system.
- Interface: Built a real-time WebSocket server for live video, audio, and chat.
- Hardware: Designed and 3D-printed a cyclopean robot head in Fusion 360 to house the camera.
- Deployment: Ran the model locally and also on Vast.ai GPU rentals for scalable, cost-effective compute.
- Collaboration: Used GitHub for code integration, virtual meetings for coordination, and university MakerSpace for hardware printing.
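A WebSocket server multiplexing live video, audio, and chat typically tags each message with a type and dispatches on it. The sketch below shows that routing idea with a simple JSON envelope; the field names (`type`, `payload`) and handler set are illustrative assumptions, not the project's actual protocol.

```python
import json

# Illustrative message envelope for a multiplexed WebSocket channel:
# {"type": "chat" | "frame" | ..., "payload": ...}. Names are assumptions.

def route_message(raw: str, handlers: dict) -> str:
    """Dispatch an incoming JSON message to its handler and wrap the reply."""
    msg = json.loads(raw)
    handler = handlers.get(msg["type"])
    if handler is None:
        return json.dumps({"type": "error",
                           "payload": f"unknown type: {msg['type']}"})
    return json.dumps({"type": msg["type"], "payload": handler(msg["payload"])})

# Toy handlers; the real system would call the LLM and vision pipeline here.
handlers = {
    "chat": lambda text: f"echo: {text}",
    "frame": lambda b64_image: "problem detected",
}
```

In the real server each handler would hand off to the vision model, the GPT-OSS backend, or the TTS engine, with the same envelope flowing back to the browser.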
Challenges we ran into
- GPU allocation issues with Google Compute Engine A100s.
- First 3D print was undersized, requiring re-design and re-print.
- Balancing low-latency, real-time interaction with the constraints of offline deployment.
- Coordinating software integration across multiple moving parts (voice, vision, chat) under tight hackathon timelines.
Accomplishments that we're proud of
- Successfully creating a working multimodal AI tutor that runs offline.
- Completing and testing a functional 3D-printed housing.
- Building a WebSocket interface that supports live video, audio, and text seamlessly.
- Deploying both locally and on Vast.ai with cost-saving auto-shutdown scripts.
- Bringing together AI, hardware, and education into one cohesive project.
What we learned
- The importance of rapid iteration in hardware design (our resized second print fit far better than the undersized first).
- How to integrate multiple modalities (speech, vision, and text) into a single tutoring experience.
- Best practices for GPU deployment across cloud platforms, including cost optimization on Vast.ai.
- That offline AI systems can still be highly interactive and student-friendly when designed carefully.
- The value of balancing technical ambition with practical usability during a hackathon.
What's next for Scott Bernard
- Hardware Improvements: Refine the 3D-printed design to be more durable, portable, and adaptable for classroom or personal use.
- Model Training: Move beyond inference to fine-tuning models with PyTorch, improving tutoring accuracy and personalization.
- Advanced Devices: Experiment with running the system on cutting-edge phone hardware, bringing offline multimodal tutoring directly to students’ pockets.
Built With
- autodesk-fusion-360
- computer-vision
- coqui-tts
- docker
- github
- google-compute-engine
- gpt-oss
- python
- pytorch
- vast.ai
- vllm
- websockets