Inspiration
We noticed that many small businesses and students struggle with accessing accurate, context-aware answers to niche questions—whether it’s troubleshooting a Shopify theme, solving a college calculus problem, or drafting a small-business tax template. Generic AI tools often give overly broad responses, while specialized tools are too expensive or require steep learning curves. We wanted to build a lightweight, fast, and affordable AI assistant that could deliver precise answers for specific use cases without the bloat of larger models.
What it does
OAI is a modular AI assistant that combines:
- Retrieval-Augmented Generation (RAG): Pulls verified, up-to-date information from domain-specific knowledge bases (e.g., Shopify developer docs, university math textbooks, IRS tax guides).
- Fine-tuned small language models (SLMs): Run locally on low-power devices or via a lightweight cloud API, keeping response times under two seconds.
- Customizable workflows: Users can create "skill packs" (e.g., "Shopify Troubleshooter" or "Calculus Solver") to tailor OAI to their exact needs.
- Multi-format output: Supports plain text, LaTeX for math equations, and Markdown for technical documentation.
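To make the "skill pack" idea concrete, here is a minimal sketch of what a pack definition could look like. The field names and values below are illustrative assumptions, not OAI's actual schema:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a skill pack; field names are illustrative,
# not OAI's actual schema.
@dataclass
class SkillPack:
    name: str                 # e.g. "Shopify Troubleshooter"
    knowledge_base: str       # which vector index this pack queries
    system_prompt: str        # instructions prepended to every LLM call
    output_formats: list = field(default_factory=lambda: ["text"])

calculus = SkillPack(
    name="Calculus Solver",
    knowledge_base="edu-math",
    system_prompt="Answer step by step; render equations in LaTeX.",
    output_formats=["text", "latex"],
)
```

Keeping a pack to a name, a knowledge base, a prompt, and output formats is what lets non-technical users assemble one without touching code.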
How we built it
- Tech Stack:
  - Backend: Python + FastAPI for the API, LangChain for the RAG pipeline, Llama 3 8B (quantized) as the core LLM.
  - Frontend: React + TypeScript for the web interface, with a mobile-friendly PWA wrapper.
  - Data Pipeline: Pinecone for vector storage, BeautifulSoup for scraping domain-specific docs, and Hugging Face Transformers for embedding generation.
- Key Steps:
  - Curated three niche knowledge bases (e-commerce, education, small business) and converted them into vector embeddings.
  - Built a modular RAG system that dynamically routes queries to the most relevant knowledge base.
  - Optimized the LLM for edge deployment using GGUF quantization, reducing model size by 70% without significant quality loss.
  - Implemented a feedback loop where user corrections refine the RAG retrieval accuracy over time.
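The routing step above can be sketched as nearest-centroid matching: each knowledge base is summarized by a centroid embedding, and a query goes to whichever centroid it is most similar to. The vectors here are toy stand-ins for real sentence embeddings, and the function names are ours, not LangChain's:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def route(query_vec, kb_centroids):
    """Return the name of the knowledge base nearest to the query embedding."""
    return max(kb_centroids, key=lambda name: cosine(query_vec, kb_centroids[name]))

# Toy centroids; in the real pipeline these come from embedding each KB's docs.
kb_centroids = {
    "ecommerce": np.array([0.9, 0.1, 0.0]),
    "education": np.array([0.1, 0.9, 0.1]),
    "smallbiz":  np.array([0.0, 0.2, 0.9]),
}

print(route(np.array([0.2, 0.8, 0.1]), kb_centroids))  # → education
```

Once a query is routed, retrieval only searches that one index, which is a big part of how we kept latency down.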
Challenges we ran into
- Latency vs. Accuracy: Balancing fast response times with precise answers required experimenting with different embedding models and vector DB configurations; we eventually settled on `BAAI/bge-small-en-v1.5` for embeddings, which cut latency by 40% while maintaining 92% retrieval accuracy.
- Edge Deployment: Getting the quantized LLM to run reliably on both macOS and Windows devices required resolving compatibility issues with the `llama.cpp` runtime.
- Knowledge Base Maintenance: Keeping our domain-specific docs up to date required building an automated scraper with change detection, which took longer than expected due to anti-scraping measures on some sites.
- User Onboarding: Making the "skill pack" customization intuitive for non-technical users required several rounds of user testing and UI tweaks.
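The change-detection idea behind the scraper can be sketched with content hashing: store a fingerprint of each page's extracted text and re-index only when it changes. This toy version works on already-extracted text (the real pipeline fetches and parses pages with BeautifulSoup first), and the function names are ours:

```python
import hashlib

def content_fingerprint(text: str) -> str:
    """Hash the page text; normalize whitespace so cosmetic reflows don't count."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def needs_reindex(text: str, known_hashes: dict, url: str) -> bool:
    """True if the page changed since the last crawl (and record the new hash)."""
    fp = content_fingerprint(text)
    if known_hashes.get(url) == fp:
        return False          # unchanged; skip re-embedding
    known_hashes[url] = fp    # record the new fingerprint
    return True
```

Skipping unchanged pages is what made re-crawling three knowledge bases cheap enough to run regularly.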
Accomplishments that we're proud of
- Performance: Achieved sub-2s response times for 95% of queries, even with RAG enabled.
- Accuracy: Our RAG pipeline improved answer correctness by 38% compared to the base LLM alone, measured via human evaluation.
- Accessibility: The edge deployment option means users can run OAI offline, which is critical for students and small business owners with limited internet access.
- User Feedback: Early testers (12 small business owners and 8 students) reported a 65% reduction in time spent on niche tasks after using OAI.
What we learned
- Modularity beats monoliths: Building the system as interchangeable modules (RAG, LLM, frontend) made it easier to debug and iterate on individual components.
- Quantization is a game-changer: We didn’t realize how effective 4-bit quantization would be for reducing model size while preserving quality—this opened up new possibilities for edge deployment.
- User testing is non-negotiable: Our initial UI for skill packs was too technical, and only after 3 rounds of user feedback did we land on a drag-and-drop interface that worked for non-developers.
- Legal and ethical considerations: Scraping public docs required careful review of robots.txt and terms of service to avoid copyright issues.
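The robots.txt review can be automated with Python's standard library. The rules below are a made-up example for illustration; against a live site you would use `rp.set_url(...)` and `rp.read()` instead of parsing inline lines:

```python
from urllib.robotparser import RobotFileParser

# Parse an example robots.txt (hypothetical rules, not any real site's).
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
    "Allow: /docs/",
])

print(rp.can_fetch("*", "https://example.com/docs/themes"))    # True
print(rp.can_fetch("*", "https://example.com/private/data"))   # False
```

Running a check like this before every fetch is cheap insurance against scraping pages a site has asked crawlers to skip.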
What's next for OAI
- Expand Skill Packs: Add pre-built packs for healthcare billing, freelance contract drafting, and high-school physics.
- Multi-language Support: Localize the model and knowledge bases for Spanish and French, targeting users in Latin America and Europe.
- API Marketplace: Let developers build and sell custom skill packs, creating an ecosystem around OAI.
- Mobile App: Release a native iOS/Android app to make offline access even more seamless.
- Enterprise Plan: Offer a self-hosted version for companies that need to keep data on-premises, with SSO and audit logs.