Update: Mentus is Alive! Multi-modal Mentoring established.
We just hit a major milestone in the development of Mentus! After an intense battle with real-time streaming quotas, we successfully pivoted to a robust, high-performance REST-based architecture that brings the AI Mentor to life.
What's new in this update:
- Vision Integration: Mentus now captures visual data every 10 seconds to analyze the user's environment and posture.
- Voice Interaction: Implemented local Speech-to-Text (STT), allowing users to ask questions hands-free while performing tasks.
- Cognitive Brain: Powered by Gemini 1.5 Flash, the system provides contextual advice based on both what it sees and what it hears.
- Premium UI: Redesigned the interface from scratch. We moved away from the "sci-fi" look to a clean, minimalist "Modern Tech" aesthetic (think Apple/Tesla), prioritizing focus and usability.
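To make the REST-based flow concrete, here is a minimal sketch of how one "tick" might package a camera frame and an STT transcript into a single Gemini `generateContent` request body. This is illustrative only: `buildMentorRequest`, `CAPTURE_INTERVAL_MS`, and the prompt wording are hypothetical names and not the actual Mentus source.

```typescript
// Assumed cadence from the update: one vision snapshot every 10 seconds.
const CAPTURE_INTERVAL_MS = 10_000;

// Shape of a minimal multimodal request body for the Gemini REST API.
interface MentorRequest {
  contents: {
    role: string;
    parts: (
      | { text: string }
      | { inline_data: { mime_type: string; data: string } }
    )[];
  }[];
}

// Combine the latest webcam frame (base64 JPEG) with the local STT
// transcript so the model can advise on both what it sees and hears.
function buildMentorRequest(
  frameBase64: string,
  transcript: string
): MentorRequest {
  return {
    contents: [
      {
        role: "user",
        parts: [
          // Snapshot of the user's environment and posture.
          { inline_data: { mime_type: "image/jpeg", data: frameBase64 } },
          // Hands-free question captured by local speech-to-text.
          {
            text: `User said: "${transcript}". Give contextual mentoring advice on posture, camera angle, and the current task.`,
          },
        ],
      },
    ],
  };
}
```

On each interval tick, the resulting body would be POSTed to the Gemini REST endpoint (e.g. `models/gemini-1.5-flash:generateContent` on `generativelanguage.googleapis.com`), which avoids the streaming quota limits mentioned above at the cost of per-request latency.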
Mentus can now recognize gestures, correct camera angles, and respond to verbal inquiries—all while maintaining a smooth, stable connection. Next stop: refining the domain-specific knowledge (Cooking/DIY modes)!
#GoogleGeminiHackathon #BuildWithGemini #AI #NextJS #MultimodalAI