💡 SupportIQ – AI Customer Support Bot

“Smart, context-aware AI support that never sleeps — powered by local intelligence.”

🌟 Inspiration

Every business today struggles with repetitive customer queries — password resets, order tracking, refund requests — that take up valuable human time and delay real issue resolution.
While AI chatbots exist, most are cloud-dependent, expensive, and contextually shallow, unable to maintain conversation memory or escalate intelligently.

We wanted to change that.
The inspiration for SupportIQ came from a simple idea:

“Can we build a customer support bot that thinks like a human, runs offline, and costs nothing?”

Our goal was to design an AI-powered support assistant that could understand natural language, remember context, and provide meaningful help without needing an API key or constant internet connection.

That’s how SupportIQ was born — an offline, intelligent customer support chatbot powered by FastAPI, Streamlit, and Ollama’s Llama 3.2 (3B) model, built to make support faster, smarter, and more human.

⚙️ How We Built It

We approached the project with a modular architecture — dividing it into frontend, backend, and AI engine layers for easier scalability and testing.

🔹 Backend (FastAPI)

Designed RESTful APIs for /chat, /sessions, and /escalations endpoints.
Implemented SQLite with SQLAlchemy ORM to manage conversation sessions and escalation logs.
Built a hybrid response system:
- The bot first searches a FAQ dataset.
- If no relevant answer is found, the query is passed to Ollama’s local LLM (Llama 3.2 3B) for contextual response generation.
Integrated CORS Middleware to enable communication with the Streamlit frontend.

Mathematically, the FAQ retrieval process uses cosine similarity to find the most relevant response:

[ \text{Similarity}(Q_i, Q_j) = \frac{Q_i \cdot Q_j}{|Q_i| |Q_j|} ]

We set a confidence threshold ( \tau = 0.75 ), where any similarity score below ( \tau ) triggers an LLM-based response instead of an FAQ match.

🔹 Frontend (Streamlit)

Created a clean, chat-style interface with message history and typing simulation.
Connected to the backend via REST endpoints using requests.post() calls.
Managed session tracking through Streamlit’s st.session_state to preserve conversation flow.

🔹 LLM Integration (Ollama)

Deployed Llama 3.2:3B locally using Ollama to generate replies when FAQs fail.
Configured response summarization and next-action suggestions directly from LLM output.
Runs fully offline, ensuring zero cost and full data privacy.

🔹 Dataset

Built a local dataset (faq.sample.csv) of 50+ common support queries and answers.
Designed the retrieval logic with TF-IDF Vectorization and cosine similarity.

🧠 What We Learned

Building SupportIQ was a crash course in applied AI engineering — merging backend APIs, frontend design, and local AI modeling.

We learned:

How to integrate local LLMs (Ollama) for offline, cost-free AI solutions.
To use FastAPI for building scalable and asynchronous backends.
To handle session memory and database persistence for real-world chat applications.
The importance of hybrid intelligence — combining deterministic FAQ systems with generative AI for flexibility.
And most importantly, how to design a user experience that feels natural, responsive, and human.

This project strengthened our understanding of both LLM lifecycle management and full-stack AI system design.

🚧 Challenges We Faced

Like every good hackathon project, SupportIQ came with its fair share of bugs, errors, and sleepless nights.

🔸 1. Ollama Setup

Initially, Ollama wasn’t recognized as a command ('ollama' not found).
We fixed this by setting up the system PATH variable manually and confirming Ollama’s active port on 11434.

🔸 2. Database Initialization

An SQL error (You can only execute one statement at a time) appeared when creating tables.
We solved it by executing schema statements separately using conn.execute(text()) in SQLAlchemy.

🔸 3. Frontend–Backend Integration

Our first Streamlit build couldn’t connect to FastAPI due to CORS restrictions.
Adding FastAPI’s CORS middleware resolved cross-origin request issues.

🔸 4. Port Conflicts

Ollama and Uvicorn both use local ports, causing “Only one usage of each socket address permitted” errors.
We used:

netstat -ano | findstr :11434  
taskkill /PID <PID> /F

### 🚧 5. Latency in First LLM Call

The first model load took around **25–30 seconds**.  
We optimized by **caching the model** at app startup to make subsequent responses faster and smoother.

---

## 🔍 Key Takeaways

- 🧠 **Offline AI is viable:** We successfully integrated a local LLM to deliver AI-powered responses **without internet dependency**.  
- ⚡ **Hybrid systems are powerful:** Combining FAQ retrieval with generative AI created a perfect balance between **accuracy and flexibility**.  
- 🧱 **Architecture matters:** Clean separation between backend, frontend, and data storage simplified **debugging, scaling, and testing**.  
- 💬 **Empathy in AI:** SupportIQ doesn’t just answer — it **converses, learns, and assists like a human**.

---

## 🧩 Future Improvements

- 🌍 **Multilingual Support:** Integrate translation models like **MarianMT** for regional and multilingual user bases.  
- 🗣️ **Voice Interface:** Add **speech-to-text** and **text-to-speech** capabilities for improved accessibility.  
- 📊 **Analytics Dashboard:** Visualize **query frequency**, **sentiment analysis**, and **escalation data** for insights.  
- ☁️ **Deployment Options:** Containerize the entire system using **Docker** for private, secure on-premise deployment.

Built With

built-with:-python
cosine-similarity
csv
fastapi
ollama-(llama-3.2:3b)
sqlalchemy
sqlite
streamlit
tf-idf-vectorization
uvicorn

Updates

gulbhatnagarr bhatnagar started this project — Nov 04, 2025 03:08 AM EST

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.