💡 SupportIQ – AI Customer Support Bot

“Smart, context-aware AI support that never sleeps — powered by local intelligence.”


🌟 Inspiration

Every business today struggles with repetitive customer queries — password resets, order tracking, refund requests — that take up valuable human time and delay real issue resolution.
While AI chatbots exist, most are cloud-dependent, expensive, and contextually shallow, unable to maintain conversation memory or escalate intelligently.

We wanted to change that.
The inspiration for SupportIQ came from a simple idea:

“Can we build a customer support bot that thinks like a human, runs offline, and costs nothing?”

Our goal was to design an AI-powered support assistant that could understand natural language, remember context, and provide meaningful help without needing an API key or constant internet connection.

That’s how SupportIQ was born — an offline, intelligent customer support chatbot powered by FastAPI, Streamlit, and Ollama’s Llama 3.2 (3B) model, built to make support faster, smarter, and more human.


⚙️ How We Built It

We approached the project with a modular architecture — dividing it into frontend, backend, and AI engine layers for easier scalability and testing.

🔹 Backend (FastAPI)

  • Designed RESTful APIs for /chat, /sessions, and /escalations endpoints.
  • Implemented SQLite with SQLAlchemy ORM to manage conversation sessions and escalation logs.
  • Built a hybrid response system:
    • The bot first searches a FAQ dataset.
    • If no relevant answer is found, the query is passed to Ollama’s local LLM (Llama 3.2 3B) for contextual response generation.
  • Integrated CORS Middleware to enable communication with the Streamlit frontend.

Mathematically, the FAQ retrieval process uses cosine similarity to find the most relevant response:

\[ \text{Similarity}(Q_i, Q_j) = \frac{Q_i \cdot Q_j}{\|Q_i\| \, \|Q_j\|} \]

We set a confidence threshold \( \tau = 0.75 \), where any similarity score below \( \tau \) triggers an LLM-based response instead of an FAQ match.
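A minimal sketch of this routing rule, using only the standard library. The function names `cosine_similarity` and `route` are hypothetical, and the vectors are assumed to be equal-length numeric lists (for example TF-IDF weights); the project's actual code may be organized differently.

```python
import math

TAU = 0.75  # confidence threshold from the text


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector matches nothing
    return dot / (norm_a * norm_b)


def route(query_vec, faq_vecs, faq_answers, tau=TAU):
    """Return ("faq", answer) for a confident match, else ("llm", None)."""
    scores = [cosine_similarity(query_vec, v) for v in faq_vecs]
    best = max(range(len(scores)), key=scores.__getitem__, default=None)
    if best is not None and scores[best] >= tau:
        return "faq", faq_answers[best]
    return "llm", None  # caller forwards the query to the local LLM
```

A score exactly equal to the threshold is treated as an FAQ match here, since the text only sends scores below \( \tau \) to the LLM.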

🔹 Frontend (Streamlit)

  • Created a clean, chat-style interface with message history and typing simulation.
  • Connected to the backend via REST endpoints using requests.post() calls.
  • Managed session tracking through Streamlit’s st.session_state to preserve conversation flow.

🔹 LLM Integration (Ollama)

  • Ran Llama 3.2 (3B) locally through Ollama to generate replies when no FAQ entry matches.
  • Configured response summarization and next-action suggestions directly from LLM output.
  • Runs fully offline, so there are no API fees and conversation data never leaves the machine.
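For illustration, a local Ollama server can be queried over HTTP with nothing but the standard library. This is a sketch under assumptions: it targets Ollama's `/api/generate` endpoint on the default port 11434, uses the model tag `llama3.2:3b`, and the helper names `build_payload` and `ask_ollama` are hypothetical rather than taken from the project.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address


def build_payload(prompt, model="llama3.2:3b"):
    """Request body for a single, non-streamed completion."""
    return {"model": model, "prompt": prompt, "stream": False}


def ask_ollama(prompt, url=OLLAMA_URL, timeout=120):
    """POST the prompt to a running Ollama server and return the reply text."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `ask_ollama("How do I reset my password?")` requires the Ollama server to be running with the model already pulled. The generous timeout allows for the slow first load.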

🔹 Dataset

  • Built a local dataset (faq.sample.csv) of 50+ common support queries and answers.
  • Designed the retrieval logic with TF-IDF Vectorization and cosine similarity.
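The retrieval step could look roughly like the sketch below. It is a standard-library stand-in, not the project's code (which may rely on a vectorizer library). The CSV column names `question` and `answer`, the smoothed IDF formula, and all function names are assumptions made for the example.

```python
import csv
import io
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on anything that is not a letter or digit."""
    return "".join(c.lower() if c.isalnum() else " " for c in text).split()


def load_faq(fileobj):
    """Read rows with assumed 'question' and 'answer' columns."""
    return [(row["question"], row["answer"]) for row in csv.DictReader(fileobj)]


def fit_idf(docs):
    """Smoothed inverse document frequency for each term in the corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}


def tfidf(text, idf):
    """Sparse TF-IDF vector as a dict; terms unseen in the corpus are dropped."""
    counts = Counter(t for t in tokenize(text) if t in idf)
    return {t: c * idf[t] for t, c in counts.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def best_match(query, faq, idf):
    """Return (score, answer) for the FAQ question closest to the query."""
    q = tfidf(query, idf)
    return max((cosine(q, tfidf(question, idf)), answer) for question, answer in faq)


# Tiny in-memory stand-in for faq.sample.csv
sample = io.StringIO(
    "question,answer\n"
    "How do I reset my password?,Use the Forgot Password link.\n"
    "Where is my order?,Check the tracking page.\n"
)
faq = load_faq(sample)
idf = fit_idf([question for question, _ in faq])
score, answer = best_match("reset password", faq, idf)
```

The returned `score` is what gets compared against the confidence threshold. Short queries against longer FAQ questions tend to score well below 1, so the threshold is worth tuning on real data.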

🧠 What We Learned

Building SupportIQ was a crash course in applied AI engineering — merging backend APIs, frontend design, and local AI modeling.

We learned:

  • How to integrate local LLMs (Ollama) for offline, cost-free AI solutions.
  • How to use FastAPI to build scalable, asynchronous backends.
  • How to handle session memory and database persistence for real-world chat applications.
  • Why hybrid intelligence matters: combining deterministic FAQ systems with generative AI adds flexibility.
  • Most importantly, how to design a user experience that feels natural, responsive, and human.

This project strengthened our understanding of both LLM lifecycle management and full-stack AI system design.


🚧 Challenges We Faced

Like every good hackathon project, SupportIQ came with its fair share of bugs, errors, and sleepless nights.

🔸 1. Ollama Setup

Initially, Ollama wasn’t recognized as a command ('ollama' not found).
We fixed this by adding Ollama to the system PATH manually and confirming that the server was listening on its default port, 11434.

🔸 2. Database Initialization

An SQL error (You can only execute one statement at a time) appeared when creating tables.
We solved it by executing schema statements separately using conn.execute(text()) in SQLAlchemy.
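The same idea can be shown with Python's built-in `sqlite3` module, which also rejects multi-statement strings passed to `execute()`. This is a sketch only: the project used SQLAlchemy, and the table and column names below are hypothetical because the real schema is not given.

```python
import sqlite3

# Hypothetical schema; the real table and column names are not given in the write-up.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sessions (id TEXT PRIMARY KEY, created_at TEXT);
CREATE TABLE IF NOT EXISTS escalations (id INTEGER PRIMARY KEY, session_id TEXT, reason TEXT);
"""


def init_db(conn, schema=SCHEMA):
    """Run each statement on its own, since execute() accepts only one at a time."""
    # Naive split: fine for simple DDL, but it breaks if a statement
    # contains ';' inside a string literal.
    for statement in schema.split(";"):
        if statement.strip():
            conn.execute(statement)
    conn.commit()


conn = sqlite3.connect(":memory:")
init_db(conn)
tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
)
```

With plain `sqlite3`, `conn.executescript(SCHEMA)` is an alternative that accepts several statements in one call.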

🔸 3. Frontend–Backend Integration

Our first Streamlit build couldn’t connect to FastAPI due to CORS restrictions.
Adding FastAPI’s CORS middleware resolved cross-origin request issues.

🔸 4. Port Conflicts

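Ollama and Uvicorn each bind a local port, and a leftover process still holding that port caused “Only one usage of each socket address permitted” errors on restart.
On Windows, we found and stopped the process holding the port with: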

netstat -ano | findstr :11434  
taskkill /PID <PID> /F
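A port can also be checked from Python before starting a server. This is a small sketch with a hypothetical helper name, `port_in_use`; it only tests whether something accepts TCP connections on the port and does not identify or stop the owning process.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """True if something is already accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception
        return s.connect_ex((host, port)) == 0
```

For example, `port_in_use(11434)` indicates whether an Ollama server (or anything else) is already listening on Ollama's default port.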

🔸 5. Latency in First LLM Call

The first model load took around **25–30 seconds**.  
We reduced this by **loading the model at app startup** so it stays cached in memory, which makes subsequent responses faster and smoother.
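One way to do such a warm-up is to send Ollama a request that names the model but carries no prompt, which should load the model without generating text. The sketch below rests on that assumption about Ollama's `/api/generate` endpoint. The `keep_alive` value of 30 minutes and the helper names are illustrative choices, not the project's settings.

```python
import json
import urllib.error
import urllib.request


def warmup_payload(model="llama3.2:3b", keep_alive="30m"):
    """A request with no prompt asks Ollama to load the model without generating text."""
    return {"model": model, "keep_alive": keep_alive}


def warm_up(url="http://localhost:11434/api/generate", timeout=120):
    """Return True if the server answered with HTTP 200, False if it is unreachable."""
    data = json.dumps(warmup_payload()).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

Calling `warm_up()` once during backend startup moves the slow load out of the first user request.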

---

## 🔍 Key Takeaways

- 🧠 **Offline AI is viable:** We successfully integrated a local LLM to deliver AI-powered responses **without internet dependency**.  
- ⚡ **Hybrid systems are powerful:** Combining FAQ retrieval with generative AI gave us a practical balance between **accuracy and flexibility**.  
- 🧱 **Architecture matters:** Clean separation between backend, frontend, and data storage simplified **debugging, scaling, and testing**.  
- 💬 **Empathy in AI:** SupportIQ doesn’t just answer — it **converses, learns, and assists like a human**.

---

## 🧩 Future Improvements

- 🌍 **Multilingual Support:** Integrate translation models like **MarianMT** for regional and multilingual user bases.  
- 🗣️ **Voice Interface:** Add **speech-to-text** and **text-to-speech** capabilities for improved accessibility.  
- 📊 **Analytics Dashboard:** Visualize **query frequency**, **sentiment analysis**, and **escalation data** for insights.  
- ☁️ **Deployment Options:** Containerize the entire system using **Docker** for private, secure on-premise deployment.

Built With

  • python
  • cosine-similarity
  • csv
  • fastapi
  • ollama-(llama-3.2:3b)
  • sqlalchemy
  • sqlite
  • streamlit
  • tf-idf-vectorization
  • uvicorn