💡 SupportIQ – AI Customer Support Bot
“Smart, context-aware AI support that never sleeps — powered by local intelligence.”
🌟 Inspiration
Every business today struggles with repetitive customer queries — password resets, order tracking, refund requests — that take up valuable human time and delay real issue resolution.
While AI chatbots exist, most are cloud-dependent, expensive, and contextually shallow, unable to maintain conversation memory or escalate intelligently.
We wanted to change that.
The inspiration for SupportIQ came from a simple idea:
“Can we build a customer support bot that thinks like a human, runs offline, and costs nothing?”
Our goal was to design an AI-powered support assistant that could understand natural language, remember context, and provide meaningful help without needing an API key or constant internet connection.
That’s how SupportIQ was born — an offline, intelligent customer support chatbot powered by FastAPI, Streamlit, and Ollama’s Llama 3.2 (3B) model, built to make support faster, smarter, and more human.
⚙️ How We Built It
We approached the project with a modular architecture — dividing it into frontend, backend, and AI engine layers for easier scalability and testing.
🔹 Backend (FastAPI)
- Designed RESTful APIs for the /chat, /sessions, and /escalations endpoints.
- Implemented SQLite with SQLAlchemy ORM to manage conversation sessions and escalation logs.
- Built a hybrid response system:
  - The bot first searches a FAQ dataset.
  - If no relevant answer is found, the query is passed to Ollama’s local LLM (Llama 3.2 3B) for contextual response generation.
- Integrated CORS Middleware to enable communication with the Streamlit frontend.
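The FAQ-first, LLM-second flow described in the list above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `answer_query`, `faq_lookup`, and `llm_generate` are hypothetical names, and the two callables stand in for the real retrieval and Ollama layers.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signatures: a FAQ lookup returns (answer, score) or None,
# and an LLM callable turns a user query into a reply.
FaqLookup = Callable[[str], Optional[Tuple[str, float]]]
LlmGenerate = Callable[[str], str]


def answer_query(query: str, faq_lookup: FaqLookup, llm_generate: LlmGenerate,
                 threshold: float = 0.75) -> dict:
    """Try the FAQ first; fall back to the LLM when the match is weak."""
    match = faq_lookup(query)
    if match is not None and match[1] >= threshold:
        answer, score = match
        return {"source": "faq", "answer": answer, "score": score}
    # No FAQ entry cleared the threshold, so defer to the generative model.
    return {"source": "llm", "answer": llm_generate(query), "score": None}
```

Passing the two layers in as callables keeps the routing logic testable without a running model or a loaded dataset.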
Mathematically, the FAQ retrieval process uses cosine similarity to find the most relevant response:
\[ \text{Similarity}(Q_i, Q_j) = \frac{Q_i \cdot Q_j}{\|Q_i\| \, \|Q_j\|} \]
We set a confidence threshold \( \tau = 0.75 \): any similarity score below \( \tau \) triggers an LLM-based response instead of an FAQ match.
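To make the formula and threshold concrete, here is a small standard-library sketch of TF-IDF scoring with cosine similarity. It is an assumption-laden illustration: the function names, the tokenizer, and the smoothed IDF formula are our own choices, and the project itself may well have used a library vectorizer instead.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_idf(documents):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(documents)
    df = Counter(term for doc in documents for term in set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}


def tfidf_vector(text, idf):
    """Sparse TF-IDF vector; terms never seen in the corpus are dropped."""
    counts = Counter(tokenize(text))
    return {term: count * idf[term] for term, count in counts.items() if term in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_faq_match(query, faqs, tau=0.75):
    """Return (answer, score) for the closest FAQ, or (None, score) below tau.

    `faqs` is a non-empty list of (question, answer) pairs.
    """
    idf = build_idf([question for question, _ in faqs])
    query_vec = tfidf_vector(query, idf)
    scored = [(cosine(query_vec, tfidf_vector(q, idf)), a) for q, a in faqs]
    score, answer = max(scored, key=lambda pair: pair[0])
    return (answer, score) if score >= tau else (None, score)
```

A `None` answer is the signal to hand the query to the LLM rather than return a weak FAQ match.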
🔹 Frontend (Streamlit)
- Created a clean, chat-style interface with message history and typing simulation.
- Connected to the backend via REST endpoints using requests.post() calls.
- Managed session tracking through Streamlit’s st.session_state to preserve conversation flow.
🔹 LLM Integration (Ollama)
- Deployed Llama 3.2:3B locally using Ollama to generate replies when FAQs fail.
- Configured response summarization and next-action suggestions directly from LLM output.
- Runs fully offline, ensuring zero cost and full data privacy.
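For reference, a call to a locally running Ollama server might look like the sketch below. It uses only the standard library and Ollama's `/api/generate` endpoint on the default port 11434. The helper names and the timeout are illustrative assumptions, and the project's real prompt construction is not shown in this write-up.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local port


def build_generate_request(prompt, model="llama3.2:3b"):
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="llama3.2:3b", timeout=120):
    """Send the prompt to Ollama and return the generated text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    # With "stream": False, the reply arrives as one JSON object.
    return body["response"]
```

Calling `generate("How do I track my order?")` requires the Ollama server to be running with the model already pulled.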
🔹 Dataset
- Built a local dataset (faq.sample.csv) of 50+ common support queries and answers.
- Designed the retrieval logic with TF-IDF Vectorization and cosine similarity.
🧠 What We Learned
Building SupportIQ was a crash course in applied AI engineering — merging backend APIs, frontend design, and local AI modeling.
We learned:
- How to integrate local LLMs (Ollama) for offline, cost-free AI solutions.
- How to use FastAPI to build scalable, asynchronous backends.
- How to handle session memory and database persistence for real-world chat applications.
- The importance of hybrid intelligence — combining deterministic FAQ systems with generative AI for flexibility.
- And most importantly, how to design a user experience that feels natural, responsive, and human.
This project strengthened our understanding of both LLM lifecycle management and full-stack AI system design.
🚧 Challenges We Faced
Like every good hackathon project, SupportIQ came with its fair share of bugs, errors, and sleepless nights.
🔸 1. Ollama Setup
Initially, Ollama wasn’t recognized as a command ('ollama' not found).
We fixed this by adding Ollama to the system PATH variable manually and confirming that its server was listening on port 11434.
🔸 2. Database Initialization
An SQL error (You can only execute one statement at a time) appeared when we tried to create all tables in a single call.
We solved it by executing each schema statement in its own conn.execute(text(...)) call in SQLAlchemy.
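The same one-statement-at-a-time idea can be shown with Python's built-in sqlite3 module, which we use here instead of SQLAlchemy so the sketch runs without extra dependencies. With SQLAlchemy, each statement would go through `conn.execute(text(statement))` instead. The schema below is hypothetical, since the project's real tables and columns are not listed in this write-up.

```python
import sqlite3

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sessions (id TEXT PRIMARY KEY, created_at TEXT);
CREATE TABLE IF NOT EXISTS escalations (id INTEGER PRIMARY KEY, session_id TEXT, reason TEXT);
"""


def init_db(conn, schema=SCHEMA):
    """Run each schema statement in its own execute() call."""
    # Naive split on ';' assumes no semicolons inside string literals or triggers.
    for statement in schema.split(";"):
        statement = statement.strip()
        if statement:
            conn.execute(statement)
    conn.commit()
```

Usage: `init_db(sqlite3.connect(":memory:"))` creates both tables in an in-memory database.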
🔸 3. Frontend–Backend Integration
Our first Streamlit build couldn’t connect to FastAPI due to CORS restrictions.
Adding FastAPI’s CORS middleware resolved cross-origin request issues.
🔸 4. Port Conflicts
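Ollama and Uvicorn each bind a local port, and starting either one while its port was still held by another process raised “Only one usage of each socket address permitted” errors.
On Windows, we found and killed the process holding the port with: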
netstat -ano | findstr :11434
taskkill /PID <PID> /F
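A quick way to check for this kind of conflict before launching a server is to probe the port from Python. This is a minimal sketch, and `port_in_use` is a hypothetical helper rather than part of the project.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

For example, `port_in_use(11434)` reports whether an Ollama server (or anything else) is already listening on that port.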
🔸 5. Latency in First LLM Call
The first model load took around **25–30 seconds**.
We optimized by **caching the model** at app startup to make subsequent responses faster and smoother.
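One possible way to pay the load cost at startup, assuming Ollama's HTTP API is used, is to send a generate request with no prompt, which asks Ollama to load the model into memory without producing text. The `keep_alive` value and the helper names below are assumptions for illustration. The write-up does not say exactly how the model was cached.

```python
import json
import urllib.request


def build_warmup_request(model="llama3.2:3b", keep_alive="30m",
                         url="http://localhost:11434/api/generate"):
    """Build a prompt-less request that loads the model and keeps it resident."""
    # keep_alive controls how long Ollama keeps the model in memory afterwards.
    payload = {"model": model, "keep_alive": keep_alive}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def warm_up(timeout=120):
    """Call once at app startup so the first user query skips the model load."""
    with urllib.request.urlopen(build_warmup_request(), timeout=timeout) as resp:
        resp.read()
```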
---
## 🔍 Key Takeaways
- 🧠 **Offline AI is viable:** We successfully integrated a local LLM to deliver AI-powered responses **without internet dependency**.
- ⚡ **Hybrid systems are powerful:** Combining FAQ retrieval with generative AI created a perfect balance between **accuracy and flexibility**.
- 🧱 **Architecture matters:** Clean separation between backend, frontend, and data storage simplified **debugging, scaling, and testing**.
- 💬 **Empathy in AI:** SupportIQ doesn’t just answer — it **converses, learns, and assists like a human**.
---
## 🧩 Future Improvements
- 🌍 **Multilingual Support:** Integrate translation models like **MarianMT** for regional and multilingual user bases.
- 🗣️ **Voice Interface:** Add **speech-to-text** and **text-to-speech** capabilities for improved accessibility.
- 📊 **Analytics Dashboard:** Visualize **query frequency**, **sentiment analysis**, and **escalation data** for insights.
- ☁️ **Deployment Options:** Containerize the entire system using **Docker** for private, secure on-premise deployment.
Built With
- python
- cosine-similarity
- csv
- fastapi
- ollama-(llama-3.2:3b)
- sqlalchemy
- sqlite
- streamlit
- tf-idf-vectorization
- uvicorn