🤖 Mohan's Mini Chatbot

🚀 About the Project

🎯 Inspiration

The idea sparked from a simple thought: "What if a single AI assistant could do everything ChatGPT does, but also handle files, images, videos, speech, and multilingual tasks in a unified interface?"

As a student juggling academic queries, document summaries, YouTube research, and communication barriers, I needed a productivity-focused solution. Thus, Mohan's Mini Chatbot was born: a smart AI assistant tailored for students, educators, and knowledge workers.


🛠️ How I Built It

The project is a full-stack AI-powered chatbot, constructed using:

  • Frontend: HTML, CSS, JavaScript
  • Backend: Flask (Python)
  • Databases:
    • MySQL for secure user authentication
    • MongoDB for chat history and logs
  • APIs & Tools:
    • Gemini API for AI interactions
    • Google Cloud APIs (Translation, Weather, TTS)
    • YouTube scraping & transcript APIs
    • Whisper (speech-to-text) + pyttsx3 (text-to-speech) for voice input/output
    • langdetect and TextBlob for language detection and multilingual processing

The modular architecture makes it easy to plug in new capabilities without affecting the core functionality.
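
One way this kind of plug-in modularity might look is a small handler registry, sketched here in plain Python. The names `HANDLERS`, `capability`, and `dispatch` are hypothetical and are not taken from the project; the two handlers are placeholders standing in for real features such as Gemini chat or translation.

```python
from typing import Callable, Dict

# Hypothetical registry: maps a capability name to the function that handles it.
HANDLERS: Dict[str, Callable[[str], str]] = {}


def capability(name: str):
    """Register a handler under `name` without modifying the dispatcher."""
    def decorator(func: Callable[[str], str]) -> Callable[[str], str]:
        HANDLERS[name] = func
        return func
    return decorator


@capability("echo")
def echo(text: str) -> str:
    # Placeholder handler; a real one might forward the text to an AI API.
    return text


@capability("shout")
def shout(text: str) -> str:
    # A second placeholder, added without changing any existing code.
    return text.upper()


def dispatch(name: str, text: str) -> str:
    """Route a request to its registered handler, or report an unknown capability."""
    handler = HANDLERS.get(name)
    if handler is None:
        return f"Unknown capability: {name}"
    return handler(text)
```

Adding a feature then means writing one decorated function; `dispatch` and the existing handlers stay untouched.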


✨ Features at a Glance

| Feature | Description |
| --- | --- |
| 🔐 User Authentication | Sign up, login, and password reset |
| 💬 AI Chat | ChatGPT-style conversations via Gemini API |
| 🖼️ Image Analysis | Analyze and extract information from images |
| 📄 File Analyzer | Extract content from PDF, DOCX, PPT |
| 🎤 Voice Input & Output | Talk to the bot and hear responses |
| 🗺️ Weather & News | Live updates from trusted APIs |
| 🌐 Multilingual Support | Detects language & translates intelligently |
| 📺 YouTube Summarizer | Extract and summarize videos using transcripts |
| 🕒 Chat History | Date-filtered search with copy & highlight |
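
As a rough illustration of the date-filtered chat history search, here is a minimal in-memory sketch. The field names `sent_at` and `text`, the function `filter_history`, and the sample messages are all assumptions made for the example; the project itself stores history in MongoDB, where the same filter would be expressed as a database query.

```python
from datetime import date, datetime


def filter_history(messages, start: date, end: date, keyword: str = ""):
    """Return messages sent between start and end (inclusive) that contain keyword."""
    needle = keyword.lower()
    return [
        m for m in messages
        if start <= m["sent_at"].date() <= end and needle in m["text"].lower()
    ]


# Made-up sample data, for illustration only.
history = [
    {"sent_at": datetime(2024, 1, 5, 9, 30), "text": "Summarize this PDF"},
    {"sent_at": datetime(2024, 1, 7, 14, 0), "text": "Weather in Chennai?"},
    {"sent_at": datetime(2024, 2, 1, 8, 15), "text": "Translate this PDF summary"},
]

january = filter_history(history, date(2024, 1, 1), date(2024, 1, 31))
january_pdf = filter_history(history, date(2024, 1, 1), date(2024, 1, 31), "pdf")
```

An empty keyword matches every message, so the same function covers both plain date filtering and keyword search within a date range.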

📚 What I Learned

  • 🌐 Full-stack integration: Building scalable Flask APIs and frontend interactions.
  • 🧠 NLP & Speech Processing: Applying Gemini, Whisper, and multilingual translation libraries.
  • 🗃️ Data Design: Structuring authentication + user logs with hybrid SQL + NoSQL systems.
  • 🧪 Testing & Debugging: Handling edge cases like missing file formats, unsupported languages, or mic/browser permissions.

⚠️ Challenges Faced

  1. YouTube Summarization

    • ❗ Challenge: Many videos had disabled transcripts.
    • ✅ Solution: Used fallback scraping and adaptive error handling.
  2. Voice-to-Text Accuracy

    • ❗ Challenge: Background noise and mic permission issues.
    • ✅ Solution: Integrated Whisper for robust transcription.
  3. Multilingual Detection & Translation

    • ❗ Challenge: Detecting edge-case languages like Sinhala or mixed scripts.
    • ✅ Solution: Combined langdetect + TextBlob for fallback safety.
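
Challenges 1 and 3 share a pattern: try a primary method, then fall back to an alternative when it fails. A minimal sketch of that pattern follows. The helper `first_successful` and the two stand-in functions are hypothetical; they do not call the real transcript, scraping, or langdetect APIs, and the project's actual error handling may differ.

```python
from typing import Callable, Iterable, Optional


def first_successful(
    strategies: Iterable[Callable[[str], Optional[str]]], value: str
) -> Optional[str]:
    """Try each strategy in order and return the first non-empty result."""
    for strategy in strategies:
        try:
            result = strategy(value)
        except Exception:
            # A failing strategy (e.g. transcripts disabled) should not abort the chain.
            continue
        if result:
            return result
    return None


# Hypothetical stand-ins for the real transcript and scraping calls.
def official_transcript(video_id: str) -> Optional[str]:
    raise RuntimeError("transcripts disabled")


def scraped_captions(video_id: str) -> Optional[str]:
    return f"captions for {video_id}"


text = first_successful([official_transcript, scraped_captions], "abc123")
```

The same helper could chain two language detectors, with the second consulted only when the first raises or returns nothing.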

🧠 Bonus: Math & AI Concepts

Using LaTeX, here's an example of how we calculate the area of a circle from user-uploaded geometry diagrams:

The area is given by:

$$ A = \pi r^2 $$

To obtain the radius \( r \) from a shape's measured diameter \( d \), the model uses:

$$ r = \frac{d}{2} $$
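
As a worked example with made-up numbers, a circle whose measured diameter is \( d = 10 \) units gives:

$$ r = \frac{10}{2} = 5, \qquad A = \pi \cdot 5^2 = 25\pi \approx 78.54 $$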


🌟 Final Thoughts

“Great things take time, and this was worth every minute.” ⏳

This project isn't just a chatbot; it's a learning companion, personal assistant, and productivity tool, all rolled into one.

