Inspiration
The AI Lab Assistant was built to solve a simple problem: when working on complex experiments, engineers and scientists often have their hands full with gloves, tools, or physical hardware. Stopping a procedure to search the web or check a 100-page equipment manual is inefficient. We wanted a lightweight, ever-present voice assistant running on cheap hardware (like a Raspberry Pi), capable of answering highly specific technical questions without touching a keyboard.
How we built it
We architected the system entirely in Python so the same codebase runs on both macOS and arm64 Linux. The core pipeline uses PyAudio and SpeechRecognition in an efficient background detection loop that only triggers on the wake word. Transcription is handled by OpenAI Whisper, which stays accurate despite lab background noise. Queries are then funneled into duckduckgo-search for live internet context. For local document retrieval, we bypassed heavy vector databases entirely: we built a custom RAG engine that uses pypdf to extract text from manuals and relies strictly on numpy to compute cosine similarity against OpenAI embeddings. Finally, GPT-4o-mini formats the combined context into an answer, which is played through the OS speakers via pygame.
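The retrieval step described above, plain numpy cosine similarity over precomputed embeddings with no vector database, can be sketched roughly as follows (function and variable names are ours for illustration, not the project's actual code; the embeddings would come from OpenAI's embeddings API):

```python
import numpy as np

def top_k_chunks(query_vec, chunk_vecs, chunks, k=3):
    """Rank manual chunks by cosine similarity to the query embedding.

    query_vec:  1-D array, embedding of the spoken question
    chunk_vecs: 2-D array, one embedding per chunk of pypdf-extracted text
    chunks:     list of chunk strings, aligned with the rows of chunk_vecs
    """
    # Normalize so a dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    m = chunk_vecs / np.linalg.norm(chunk_vecs, axis=1, keepdims=True)
    sims = m @ q                       # one similarity score per chunk
    order = np.argsort(sims)[::-1][:k] # indices of the k best matches
    return [(chunks[i], float(sims[i])) for i in order]
```

The returned chunks are what gets pasted into the LLM prompt as local context; for a few hundred chunks this brute-force scan is effectively instant, which is why no database is needed.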
Challenges we ran into
Deploying across two very different edge architectures broke our assumptions. Native offline neural-network wake-word engines required C++ bindings we could not compile on the Raspberry Pi, and macOS sandboxing blocked global hotkeys (pynput). To solve this, we replaced the offline models with a smart, low-latency STT audio-burst loop and engineered a raw sys.stdin listener on a background thread to bypass the macOS permission prompts entirely.
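The stdin workaround amounts to a daemon thread that blocks on plain terminal input, which needs no accessibility entitlement, unlike pynput's global hooks. A minimal sketch (names are illustrative, not the project's actual code):

```python
import sys
import threading
import queue

def start_stdin_listener(out_queue):
    """Push a message onto out_queue each time the user presses Enter.

    Reading sys.stdin directly sidesteps the macOS accessibility
    permissions that global-hotkey libraries (e.g. pynput) require.
    """
    def _loop():
        for line in sys.stdin:            # blocks until Enter is pressed
            out_queue.put(line.strip())   # wake the main assistant loop
    t = threading.Thread(target=_loop, daemon=True)
    t.start()                             # daemon: dies with the process
    return t
```

The main loop then waits on the queue with a timeout, so the same code path serves both the Pi's audio-burst trigger and the macOS keyboard trigger.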
What we learned
We learned that avoiding bloated frameworks (like LangChain) and heavy databases is critical on resource-constrained edge devices: a pure-Python stack built on raw numpy array math and plain HTTP APIs delivered the same functionality far faster and with far less overhead. We also learned how to tightly constrain LLMs with precise system prompts, turning markdown-formatted model output into natural-sounding conversational English fit for a speaker.
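The prompt-constraint idea looks roughly like this. The wording below is our own illustrative example, not the project's actual prompt; the point is that the system message forbids every markdown habit that would sound wrong when read aloud:

```python
# Illustrative system prompt (our wording, not the project's actual text).
VOICE_SYSTEM_PROMPT = (
    "You are a hands-free lab assistant. Answer in plain conversational "
    "English suitable for text-to-speech: no markdown, no bullet points, "
    "no code fences, no headers. Keep answers short, and name the manual "
    "section when you use retrieved context."
)

def build_messages(question, context):
    """Assemble a chat-completion payload from the query and retrieved text."""
    return [
        {"role": "system", "content": VOICE_SYSTEM_PROMPT},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]
```

The messages list is then sent to the chat model (GPT-4o-mini in this project), and the reply can go straight to audio playback without any post-processing.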
Built With
- openai
- python