Mohan's Mini Chatbot
About the Project
Inspiration
The idea sparked from a simple thought: "What if a single AI assistant could do everything ChatGPT does, but also handle files, images, videos, speech, and multilingual tasks in a unified interface?"
As a student juggling academic queries, document summaries, YouTube research, and communication barriers, I needed a productivity-focused solution. Thus, Mohan's Mini Chatbot was born: a smart AI assistant tailored for students, educators, and knowledge workers.
How I Built It
The project is a full-stack AI-powered chatbot, constructed using:
- Frontend: HTML, CSS, JavaScript
- Backend: Flask (Python)
- Databases:
  - MySQL for secure user authentication
  - MongoDB for chat history and logs
- APIs & Tools:
  - Gemini API for AI interactions
  - Google Cloud APIs (Translation, Weather, TTS)
  - YouTube scraping & transcript APIs
  - Whisper + pyttsx3 for voice input/output
  - langdetect and TextBlob for multilingual processing
The modular architecture makes it easy to plug in new capabilities without affecting the core functionality.
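One way to get this kind of plug-in behaviour is a small handler registry. The sketch below is illustrative only and is not taken from the project's code: the `HANDLERS` dictionary, the `capability` decorator, and the `echo` and `shout` handlers are hypothetical names chosen for the example.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a capability name to its handler function.
HANDLERS: Dict[str, Callable[[str], str]] = {}


def capability(name: str):
    """Register a handler under `name` without editing the dispatcher."""
    def decorator(func: Callable[[str], str]) -> Callable[[str], str]:
        HANDLERS[name] = func
        return func
    return decorator


@capability("echo")
def echo(payload: str) -> str:
    # Placeholder capability: returns the input unchanged.
    return payload


@capability("shout")
def shout(payload: str) -> str:
    # Placeholder capability: returns the input in upper case.
    return payload.upper()


def dispatch(name: str, payload: str) -> str:
    """Route a request to its registered handler, or report that none exists."""
    handler = HANDLERS.get(name)
    if handler is None:
        return f"Unsupported capability: {name}"
    return handler(payload)
```

Adding a new capability then means writing one decorated function; `dispatch` itself never changes. In a Flask application, blueprints registered with `app.register_blueprint` typically play a similar role.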
Features at a Glance
| Feature | Description |
|---|---|
| User Authentication | Sign up, login, and password reset |
| AI Chat | ChatGPT-style conversations via Gemini API |
| Image Analysis | Analyze and extract information from images |
| File Analyzer | Extract content from PDF, DOCX, PPT |
| Voice Input & Output | Talk to the bot and hear responses |
| Weather & News | Live updates from trusted APIs |
| Multilingual Support | Detects language & translates intelligently |
| YouTube Summarizer | Extract and summarize videos using transcripts |
| Chat History | Date-filtered search with copy & highlight |
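The date-filtered chat history search in the last row can be sketched roughly as follows. This is a minimal illustration that assumes messages are plain dictionaries with `timestamp` and `text` keys; the real project stores history in MongoDB, whose schema is not shown here, and the sample messages are made up.

```python
from datetime import date, datetime
from typing import Dict, List


def filter_history(messages: List[Dict], start: date, end: date,
                   keyword: str = "") -> List[Dict]:
    """Return messages dated within [start, end] whose text contains keyword."""
    needle = keyword.lower()
    return [
        m for m in messages
        if start <= m["timestamp"].date() <= end
        and needle in m["text"].lower()
    ]


# Made-up sample data, for illustration only.
history = [
    {"timestamp": datetime(2024, 1, 5, 9, 30), "text": "Summarize this PDF"},
    {"timestamp": datetime(2024, 1, 6, 14, 0), "text": "Weather today"},
    {"timestamp": datetime(2024, 2, 1, 8, 15), "text": "Translate this PDF"},
]

january_pdf = filter_history(history, date(2024, 1, 1), date(2024, 1, 31), "pdf")
```

With the sample data, `january_pdf` holds only the first message, since the other PDF request falls outside January.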
What I Learned
- Full-stack integration: Building scalable Flask APIs and frontend interactions.
- NLP & Speech Processing: Applying Gemini, Whisper, and multilingual translation libraries.
- Data Design: Structuring authentication + user logs with hybrid SQL + NoSQL systems.
- Testing & Debugging: Handling edge cases like missing file formats, unsupported languages, or mic/browser permissions.
Challenges Faced
YouTube Summarization
- Challenge: Many videos had disabled transcripts.
- Solution: Used fallback scraping and adaptive error handling.
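A fallback chain of this kind might look like the sketch below. It is an assumption about the general shape, not the project's actual code: `TranscriptUnavailable`, `from_transcript_api`, and `from_page_scrape` are hypothetical stand-ins that simulate a disabled transcript and a successful scrape.

```python
from typing import Callable, Iterable, Optional


class TranscriptUnavailable(Exception):
    """Raised by a source that cannot provide text for a video."""


def first_available(video_id: str,
                    sources: Iterable[Callable[[str], str]]) -> Optional[str]:
    """Try each source in order and return the first non-empty result."""
    for source in sources:
        try:
            text = source(video_id)
        except TranscriptUnavailable:
            continue  # this source failed; move on to the next one
        if text:
            return text
    return None  # every source failed or returned nothing


def from_transcript_api(video_id: str) -> str:
    # Hypothetical stand-in: simulates a video with transcripts disabled.
    raise TranscriptUnavailable(f"transcripts disabled for {video_id}")


def from_page_scrape(video_id: str) -> str:
    # Hypothetical stand-in: simulates text recovered by scraping the page.
    return f"scraped description for {video_id}"
```

Calling `first_available("abc123", [from_transcript_api, from_page_scrape])` skips the failing transcript source and returns the scraped text. A `None` result lets the caller show a clear "no transcript" message instead of crashing.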
Voice-to-Text Accuracy
- Challenge: Background noise and mic permission issues.
- Solution: Integrated Whisper for robust transcription.
Multilingual Detection & Translation
- Challenge: Detecting edge-case languages like Sinhala or mixed scripts.
- Solution: Combined langdetect + TextBlob for fallback safety.
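The fallback idea can be illustrated with a sketch that tries a primary detector first and then falls back to a Unicode-script heuristic. This is a swapped-in technique for illustration, not the project's langdetect + TextBlob code: the `primary` argument is assumed to be any callable that returns a language code or raises on failure (for example `langdetect.detect`), and the script table covers only three scripts.

```python
from typing import Callable, Dict, Optional, Tuple

# Unicode block ranges (inclusive). Sinhala occupies U+0D80..U+0DFF.
SCRIPT_RANGES: Dict[str, Tuple[int, int]] = {
    "si": (0x0D80, 0x0DFF),  # Sinhala
    "ta": (0x0B80, 0x0BFF),  # Tamil
    "hi": (0x0900, 0x097F),  # Devanagari (assumed to mean Hindi here)
}


def detect_by_script(text: str) -> Optional[str]:
    """Guess a language from the script most characters belong to."""
    counts: Dict[str, int] = {}
    for ch in text:
        code_point = ord(ch)
        for lang, (low, high) in SCRIPT_RANGES.items():
            if low <= code_point <= high:
                counts[lang] = counts.get(lang, 0) + 1
    if not counts:
        return None
    return max(counts, key=counts.get)


def detect_language(text: str, primary: Callable[[str], str],
                    default: str = "en") -> str:
    """Use the primary detector; fall back to script heuristics, then default."""
    try:
        guess = primary(text)
    except Exception:  # the primary detector could not decide
        guess = None
    return guess or detect_by_script(text) or default
```

For mixed-script input, `detect_by_script` picks the script with the most characters. That is a deliberate simplification; it ignores minority-language passages.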
Bonus: Math & AI Concepts
As an example of the math involved, rendered in LaTeX, here is how the area of a circle is calculated from a user-uploaded geometry diagram.
The area is given by:
$$ A = \pi r^2 $$
To obtain the radius $r$ from a shape's diameter $d$, the model uses:
$$ r = \frac{d}{2} $$
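Substituting the radius into the area formula gives the area directly from the diameter. As a worked example, with an assumed measured diameter of $d = 10$ units:
$$ A = \pi \left( \frac{d}{2} \right)^2 = \frac{\pi d^2}{4}, \qquad d = 10 \;\Rightarrow\; A = 25\pi \approx 78.54 $$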
Final Thoughts
"Great things take time, and this was worth every minute."
This project isn't just a chatbot; it's a learning companion, personal assistant, and productivity tool, all rolled into one.