Update: Mentus is Alive! Multi-modal Mentoring established.

We just hit a major milestone in the development of Mentus! After an intense battle with real-time streaming quotas, we successfully pivoted to a robust, high-performance REST-based architecture that brings the AI Mentor to life.

What's new in this update:

  • Vision Integration: Mentus now captures visual data every 10 seconds to analyze the user's environment and posture.
  • Voice Interaction: Implemented local Speech-to-Text (STT), allowing users to ask questions hands-free while performing tasks.
  • Cognitive Brain: Powered by Gemini 1.5 Flash, the system provides contextual advice based on both what it sees and what it hears.
  • Premium UI: Completely redesigned the interface from scratch. We moved away from the "sci-fi" look to a clean, minimalist "Modern Tech" aesthetic (think Apple/Tesla), prioritizing focus and usability.
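
To make the vision + voice flow above concrete, here is a minimal sketch of how one captured frame and one speech transcript could be combined into a single REST call to Gemini 1.5 Flash. This is not the actual Mentus code: the function names (buildMentorRequest, extractAdvice, askMentor), the SendJson transport type, and the prompt wording are all hypothetical. It also assumes the public generateContent endpoint of the Gemini API and a JPEG frame that is already base64-encoded.

```typescript
// Illustrative sketch only: every name here is hypothetical, not from the Mentus codebase.

// The post states that a frame is captured every 10 seconds.
const CAPTURE_INTERVAL_MS = 10_000;

// Assumed endpoint: the public Gemini REST API for the model named in the post.
const GEMINI_URL =
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent";

interface TextPart { text: string }
interface ImagePart { inline_data: { mime_type: string; data: string } }

interface GenerateContentRequest {
  contents: { role: "user"; parts: (TextPart | ImagePart)[] }[];
}

interface GenerateContentResponse {
  candidates?: { content?: { parts?: { text?: string }[] } }[];
}

// Combine what the user said (STT transcript) with what the camera saw (one frame).
function buildMentorRequest(frameBase64: string, transcript: string): GenerateContentRequest {
  const question = transcript.trim();
  // Assumed prompt wording; the real prompts are not described in the post.
  const prompt = question.length > 0
    ? `The user asked: "${question}". Answer using what is visible in the image.`
    : "Describe anything in the image the user should correct.";
  return {
    contents: [{
      role: "user",
      parts: [
        { text: prompt },
        { inline_data: { mime_type: "image/jpeg", data: frameBase64 } },
      ],
    }],
  };
}

// Join the text parts of the first candidate; return "" if there are none.
function extractAdvice(response: GenerateContentResponse): string {
  const parts = response.candidates?.[0]?.content?.parts ?? [];
  return parts.map((part) => part.text ?? "").join("").trim();
}

// The HTTP transport is injected so the sketch does not depend on a specific runtime.
type SendJson = (url: string, body: GenerateContentRequest) => Promise<GenerateContentResponse>;

async function askMentor(
  send: SendJson,
  apiKey: string,
  frameBase64: string,
  transcript: string,
): Promise<string> {
  const url = `${GEMINI_URL}?key=${encodeURIComponent(apiKey)}`;
  const response = await send(url, buildMentorRequest(frameBase64, transcript));
  return extractAdvice(response);
}
```

In a Next.js app, send would typically wrap fetch with a POST and a JSON body, and a timer running every CAPTURE_INTERVAL_MS could call askMentor with the latest frame and any pending transcript. Because each call is a plain request/response, a failed or rate-limited call can simply be skipped until the next tick.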

Mentus can now recognize gestures, correct camera angles, and respond to verbal inquiries—all while maintaining a smooth, stable connection. Next stop: refining the domain-specific knowledge (Cooking/DIY modes)!

#GoogleGeminiHackathon #BuildWithGemini #AI #NextJS #MultimodalAI
