- Inspiration
Most AI systems today are limited to text-based interaction. I wanted to build an AI that feels more alive, something closer to a real assistant like Jarvis. CENTURI was created as an exploration assistant that listens to users, analyzes images captured by the camera, and responds intelligently using Gemini AI.
- What it does
CENTURI is an AI assistant that can:
Listen to user voice commands
Understand questions using Gemini AI
Respond with synthesized speech
Analyze images captured from the camera
Provide explanations of objects or scenes
This creates a more natural human-AI interaction where users can talk to the AI and show objects to it.
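One way to picture a single interaction turn is as a small pipeline: transcribe the audio, optionally grab a camera frame, ask the model, then synthesize the reply. The sketch below is illustrative only and is not taken from the CENTURI code. The `Assistant` class, the `needs_camera` keyword heuristic, and the `VISION_HINTS` list are all hypothetical names, and the four steps are passed in as plain callables so the flow can be tried without any AI libraries installed.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical keyword list: a naive stand-in for deciding when to use the camera.
VISION_HINTS = ("look", "see", "camera", "what is this")


def needs_camera(question: str) -> bool:
    """Return True if the question seems to refer to something visible."""
    q = question.lower()
    return any(hint in q for hint in VISION_HINTS)


@dataclass
class Assistant:
    """One hear -> (see) -> think -> speak turn, with each step injected."""
    transcribe: Callable[[bytes], str]                # audio bytes -> question text
    capture_frame: Callable[[], bytes]                # () -> encoded image bytes
    ask_model: Callable[[str, Optional[bytes]], str]  # (question, image) -> answer
    speak: Callable[[str], bytes]                     # answer text -> audio bytes

    def handle_turn(self, audio: bytes) -> bytes:
        question = self.transcribe(audio)
        # Only capture an image when the question appears to need one.
        image = self.capture_frame() if needs_camera(question) else None
        answer = self.ask_model(question, image)
        return self.speak(answer)
```

Keeping each step behind a callable means the speech, vision, and language components can be swapped or stubbed independently, which may help when one of them is unavailable during a demo.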
- How we built it
CENTURI was built using Python and Google's Gemini AI models. The system integrates multiple technologies to create a multimodal AI experience:
Gemini AI for reasoning and vision
Faster-Whisper for speech recognition
gTTS for voice responses
Streamlit for the interactive web interface
OpenCV for camera capture
These components work together to allow the AI to hear, see, and speak.
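As a rough illustration of how the listed libraries could be wired up, here is a minimal sketch with one thin wrapper per component. It makes several assumptions that the writeup does not confirm: the Gemini call uses the `google-generativeai` package, the model name `gemini-1.5-flash` and the `GEMINI_API_KEY` environment variable are placeholders, and all function names are hypothetical. Third-party imports are done inside each function so the module can be loaded even when a library is missing.

```python
import os
from typing import Iterable, Optional


def join_segments(segments: Iterable) -> str:
    """Join transcript segments (objects with a .text attribute) into one string."""
    return " ".join(s.text.strip() for s in segments).strip()


def transcribe_file(path: str, model_size: str = "base") -> str:
    """Speech to text with Faster-Whisper. Reloads the model on every call for simplicity."""
    from faster_whisper import WhisperModel
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(path)
    return join_segments(segments)


def capture_jpeg(camera_index: int = 0) -> bytes:
    """Grab one frame from the camera with OpenCV and return it as JPEG bytes."""
    import cv2
    cap = cv2.VideoCapture(camera_index)
    try:
        ok, frame = cap.read()
        if not ok:
            raise RuntimeError("could not read a frame from the camera")
        ok, buf = cv2.imencode(".jpg", frame)
        if not ok:
            raise RuntimeError("JPEG encoding failed")
        return buf.tobytes()
    finally:
        cap.release()


def ask_gemini(question: str, jpeg: Optional[bytes] = None,
               model_name: str = "gemini-1.5-flash") -> str:
    """Send text, and optionally an image, to Gemini (assumed SDK: google-generativeai)."""
    import google.generativeai as genai
    genai.configure(api_key=os.environ["GEMINI_API_KEY"])  # assumed variable name
    model = genai.GenerativeModel(model_name)
    parts = [question]
    if jpeg is not None:
        parts.append({"mime_type": "image/jpeg", "data": jpeg})
    return model.generate_content(parts).text


def speak_to_mp3(text: str, path: str = "reply.mp3") -> str:
    """Text to speech with gTTS. Writes an MP3 file and returns its path."""
    from gtts import gTTS
    gTTS(text=text, lang="en").save(path)
    return path
```

In a Streamlit page, the MP3 path returned by `speak_to_mp3` could be handed to `st.audio` for playback, though the actual CENTURI interface may handle this differently.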
- Challenges we ran into
The biggest challenge was integrating multiple AI components in real time. Coordinating speech recognition, camera input, and AI responses while staying within API limits required careful debugging and system design. Another challenge was making the system run smoothly in a local environment while preparing it for a demo presentation.
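A common way to cope with API limits is to retry a failed call with exponential backoff. The sketch below shows that general technique under stated assumptions; it is not necessarily how CENTURI handles rate limiting. The helper name `call_with_backoff` and its parameters are hypothetical, and it retries on any exception by default, whereas a real integration would likely narrow `retry_on` to the SDK's rate-limit error.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying on failure with exponentially growing waits."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Waits double each time (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Up to 10% random jitter so parallel callers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

A model request could then be wrapped as `call_with_backoff(lambda: ask("Describe this scene"))`, where `ask` stands in for whatever function sends the request.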
- Accomplishments that we're proud of
We successfully created a multimodal AI system that can interact with users using voice and vision rather than only text. CENTURI demonstrates how AI can move beyond static chat interfaces toward more immersive experiences.
- What we learned
This project taught us how to build multimodal AI systems that combine voice, vision, and large language models. We also learned how to integrate different AI tools and frameworks into a single interactive application.
- What's next for CENTURI
Future improvements include adding real-time voice conversations, improving visual recognition, deploying the system to the cloud, and expanding CENTURI into a fully autonomous AI assistant capable of performing tasks across applications.