Inspiration

The ShopNow Voice Agent grew out of a very practical problem: customer support in e-commerce. Customers expect timely and accurate help, but support teams spend an enormous amount of time on repetitive Tier 1 queries such as order tracking, return requests, payment issues, delivery complaints, and product questions. According to the ShopNow brief, the brand handles a huge order volume in India with very limited human support capacity, which leads to long wait times, inconsistent support quality, and poor coverage outside operating hours. We were asked to design an agent that could reduce these problems by acting as the first line of support, especially in a multilingual environment where customers may speak English, Hindi, or a mix of regional language patterns. Our goal was to build something far more than a simple chatbot: an AI voice support agent that is operationally viable, understands context, fetches information, empathizes with the customer, and knows when to escalate to a human support agent.

What it does

The ShopNow Voice Agent is an AI-powered support agent that handles repetitive customer support calls by voice. It listens to the customer, understands the context of the conversation, fetches the relevant order data or policy information, and responds in an empathetic manner. The agent currently supports five major support categories: order status, returns and refunds, payment issues, delivery complaints, and product queries. It also analyzes the customer's emotional tone and identifies whether the customer is calm, upset, or angry. When an issue becomes too sensitive or the customer asks for a human, the agent escalates the conversation to a human support agent.
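The escalation behaviour described above can be sketched as a small rule function. This is a minimal illustration, not the project's actual code: the category identifiers, the `Turn` structure, the list of "ask for a human" phrases, and the rule that repeated upset turns trigger a handoff are all assumptions made for the example.

```python
from dataclasses import dataclass

# The five support categories named in the write-up (identifiers are hypothetical).
CATEGORIES = {
    "order_status",
    "return_refund",
    "payment_issue",
    "delivery_complaint",
    "product_query",
}

# Hypothetical phrases that signal an explicit request for a human agent.
HUMAN_REQUEST_PHRASES = ("human", "real person", "representative", "agent se baat")


@dataclass
class Turn:
    text: str       # transcribed customer utterance
    intent: str     # one of CATEGORIES
    sentiment: str  # "calm", "upset", or "angry"


def should_escalate(turn: Turn, upset_streak: int) -> tuple[bool, str]:
    """Return (escalate, reason) for a single customer turn.

    upset_streak is the number of consecutive upset turns so far,
    including this one (an assumed way of tracking frustration).
    """
    text = turn.text.lower()
    if any(phrase in text for phrase in HUMAN_REQUEST_PHRASES):
        return True, "customer_requested_human"
    if turn.sentiment == "angry":
        return True, "angry_customer"
    if turn.sentiment == "upset" and upset_streak >= 2:
        return True, "repeated_frustration"
    return False, ""
```

Returning a reason alongside the decision lets the handoff carry context to the human agent instead of just a flag.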

How we built it

To build the project, we created a modular full-stack AI system. We chose FastAPI for the backend because it supports both regular API calls and real-time WebSocket connections for voice conversations. For speech-to-text and text-to-speech, we integrated Sarvam AI, which is well suited to an Indian multilingual support system. For the intelligence layer, we integrated OpenAI for intent classification, sentiment analysis, response generation, and call summaries. To keep responses fact-based rather than generic, we connected the system to a SQLite database containing order, payment, seller, delivery, and refund data. We also added a retrieval-augmented generation pipeline using LangChain and FAISS over documents such as the return policy, shipping FAQ, payment FAQ, cancellation policy, and product information. Lastly, we built a Streamlit dashboard for testing the agent and viewing reports and escalations. Together, these pieces make the project an end-to-end support system rather than just a demo.

Challenges we ran into

One of the biggest challenges was getting the system to behave like a support agent rather than a general-purpose AI assistant. In customer service, being fluent is not enough; the response must be informed by real order data and real company policy. Real-time voice orchestration was another challenge: the full path from audio stream to spoken response has to be fast enough to feel like a real conversation. Multilingual support was also difficult. In a multilingual environment like India, a user may switch between English, Hindi, and Hinglish within a single call, and the agent must still understand the query clearly. Escalation was another challenge. The AI must track the user's frustration level and hand the conversation over smoothly. Building both the agent for customers and the dashboard for the operations team added an extra layer of complexity, because we wanted the system to be useful to an operations team and not only to people watching the demo.
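One way to reason about the latency problem is to time each stage of the voice pipeline separately. The sketch below is an assumption about how that could be measured, not the project's code: the stage names and the stub functions standing in for speech-to-text, response generation, and text-to-speech are hypothetical.

```python
import time
from typing import Any, Callable


def timed_pipeline(
    audio: bytes, stages: list[tuple[str, Callable[[Any], Any]]]
) -> tuple[Any, dict[str, float]]:
    """Run stages in order, feeding each output into the next stage.

    Returns the final output and per-stage latency in milliseconds,
    plus a "total" entry with the sum of all stages.
    """
    timings: dict[str, float] = {}
    value: Any = audio
    for name, stage in stages:
        start = time.perf_counter()
        value = stage(value)
        timings[name] = (time.perf_counter() - start) * 1000
    timings["total"] = sum(timings.values())
    return value, timings


# Stub stages standing in for the real services (hypothetical).
def fake_stt(audio: bytes) -> str:
    return "where is my order"


def fake_reply(text: str) -> str:
    return "Your order is on the way."


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


STAGES = [("stt", fake_stt), ("reply", fake_reply), ("tts", fake_tts)]

if __name__ == "__main__":
    _, timings = timed_pipeline(b"\x00\x01", STAGES)
    for stage_name, ms in timings.items():
        print(f"{stage_name}: {ms:.2f} ms")
```

Per-stage numbers show whether transcription, generation, or synthesis dominates the delay, which is more actionable than a single end-to-end figure.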

Accomplishments that we're proud of

We are proud that we built more than a chatbot with a friendly interface. The agent combines real-time voice interaction, structured intent routing, order-aware responses, retrieval of support documents, sentiment-aware escalation, transcript logging, summarization, and analytics in one workflow. We are particularly proud of its fact-grounded design: answers are not based only on what a model generates, but also on data from the order database and business policies from the retrieval layer. We are also proud of the escalation feature, which reflects a responsible design. The AI is not forced to keep talking through every issue; it knows when to escalate and passes context to a human support agent. The reporting and dashboard layer makes the project useful from an operations perspective as well as a conversational one. Above all, we are proud that the project addresses a real-world business problem with a practical AI-first approach to customer support.

What we learned

We learned a number of things from this project. First and foremost, a great AI support system is not just a matter of plugging a language model into a UI. It requires orchestration, context, business data, retrieval, emotion awareness, and clear escalation paths. We learned that grounding is crucial in customer support, since an ungrounded response, no matter how friendly, is both operationally incorrect and harmful. We learned that voice UIs require a great deal of product discipline, since responses need to be concise, clear, and natural-sounding. We learned that sentiment is not just a feature for analysis; it is a critical factor in building trust and in deciding when to hand a call over to a human. We learned that multilingual support in the real world is a systems problem, not only a model problem. Finally, we learned that a great AI support system is one that works alongside a human support team, not in opposition to it.

What's next for ShopNow

The next step for ShopNow is to turn the MVP we have built into a production-ready support tool. We would like to extend the existing multilingual support by adding more languages and improving how the agent handles mixed-language speech. We would also like to move beyond the in-memory sessions we currently use and add robust, production-ready session storage and monitoring tools. Another significant direction is closer integration with CRM, ticketing, and order management tools, and with messaging channels like WhatsApp, so that the agent can move beyond answering customer questions and actually initiate support actions. On the intelligence side, we would like to improve personalization, escalation routing, and handoff summaries for human agents. On the operations side, we would like to improve reporting, cluster unresolved issues, track SLAs, and measure customer satisfaction after a call. Ultimately, we would like ShopNow to be an always-on, multilingual, AI-first support tool that reduces wait times for customers, improves their overall experience, and scales customer support more intelligently.
