Inspiration
We noticed that many small businesses and students struggle with accessing accurate, context-aware answers to niche questions—whether it’s troubleshooting a Shopify theme, solving a college calculus problem, or drafting a small-business tax template. Generic AI tools often give overly broad responses, while specialized tools are too expensive or require steep learning curves. We wanted to build a lightweight, fast, and affordable AI assistant that could deliver precise answers for specific use cases without the bloat of larger models.
What it does
OAI is a modular AI assistant that combines:
- Retrieval-Augmented Generation (RAG): Pulls verified, up-to-date information from domain-specific knowledge bases (e.g., Shopify developer docs, university math textbooks, IRS tax guides).
- Fine-tuned small language models (SLMs): Run locally on low-power devices or via a lightweight cloud API, keeping response times under two seconds.
- Customizable workflows: Users can create "skill packs" (e.g., "Shopify Troubleshooter" or "Calculus Solver") to tailor OAI to their exact needs.
- Multi-format output: Supports plain text, LaTeX for math equations, and Markdown for technical documentation.
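To make the "skill pack" idea concrete, here is a minimal sketch of what a pack definition could look like. The field names and values below are illustrative assumptions, not OAI's actual schema:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a skill pack; field names are illustrative,
# not OAI's actual schema.
@dataclass
class SkillPack:
    name: str                 # e.g. "Shopify Troubleshooter"
    knowledge_base: str       # which vector index this pack queries
    system_prompt: str        # instructions prepended to every LLM call
    output_formats: list = field(default_factory=lambda: ["text"])

calculus = SkillPack(
    name="Calculus Solver",
    knowledge_base="edu-math",
    system_prompt="Answer step by step; render equations in LaTeX.",
    output_formats=["text", "latex"],
)
```

Keeping a pack to a name, a knowledge base, a prompt, and output formats is what lets non-technical users assemble one without touching code.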
How we built it
- Tech Stack:
  - Backend: Python + FastAPI for the API, LangChain for the RAG pipeline, Llama 3 8B (quantized) as the core LLM.
  - Frontend: React + TypeScript for the web interface, with a mobile-friendly PWA wrapper.
  - Data Pipeline: Pinecone for vector storage, BeautifulSoup for scraping domain-specific docs, and Hugging Face Transformers for embedding generation.
- Key Steps:
  - Curated three niche knowledge bases (e-commerce, education, small business) and converted them into vector embeddings.
  - Built a modular RAG system that dynamically routes queries to the most relevant knowledge base.
  - Optimized the LLM for edge deployment using GGUF quantization, reducing model size by 70% without significant quality loss.
  - Implemented a feedback loop where user corrections refine the RAG retrieval accuracy over time.
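The routing step above can be sketched as nearest-centroid matching: each knowledge base is summarized by a centroid embedding, and a query goes to whichever centroid it is most similar to. The vectors here are toy stand-ins for real sentence embeddings, and the function names are ours, not LangChain's:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def route(query_vec, kb_centroids):
    """Return the name of the knowledge base nearest to the query embedding."""
    return max(kb_centroids, key=lambda name: cosine(query_vec, kb_centroids[name]))

# Toy centroids; in the real pipeline these come from embedding each KB's docs.
kb_centroids = {
    "ecommerce": np.array([0.9, 0.1, 0.0]),
    "education": np.array([0.1, 0.9, 0.1]),
    "smallbiz":  np.array([0.0, 0.2, 0.9]),
}

print(route(np.array([0.2, 0.8, 0.1]), kb_centroids))  # → education
```

Once a query is routed, retrieval only searches that one index, which is a big part of how we kept latency down.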
Challenges we ran into
- Latency vs. Accuracy: Balancing fast response times with precise answers required experimenting with different embedding models and vector DB configurations; we eventually settled on `BAAI/bge-small-en-v1.5` for embeddings, which cut latency by 40% while maintaining 92% retrieval accuracy.
- Edge Deployment: Getting the quantized LLM to run reliably on both macOS and Windows devices required resolving compatibility issues with the `llama.cpp` runtime.
- Knowledge Base Maintenance: Keeping our domain-specific docs up to date required building an automated scraper with change detection, which took longer than expected due to anti-scraping measures on some sites.
- User Onboarding: Making the "skill pack" customization intuitive for non-technical users required several rounds of user testing and UI tweaks.
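The change-detection idea behind the scraper can be sketched with content hashing: store a fingerprint of each page's extracted text and re-index only when it changes. This toy version works on already-extracted text (the real pipeline fetches and parses pages with BeautifulSoup first), and the function names are ours:

```python
import hashlib

def content_fingerprint(text: str) -> str:
    """Hash the page text; normalize whitespace so cosmetic reflows don't count."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def needs_reindex(text: str, known_hashes: dict, url: str) -> bool:
    """True if the page changed since the last crawl (and record the new hash)."""
    fp = content_fingerprint(text)
    if known_hashes.get(url) == fp:
        return False          # unchanged; skip re-embedding
    known_hashes[url] = fp    # record the new fingerprint
    return True
```

Skipping unchanged pages is what made re-crawling three knowledge bases cheap enough to run regularly.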
Accomplishments that we're proud of
- Performance: Achieved sub-2s response times for 95% of queries, even with RAG enabled.
- Accuracy: Our RAG pipeline improved answer correctness by 38% compared to the base LLM alone, measured via human evaluation.
- Accessibility: The edge deployment option means users can run OAI offline, which is critical for students and small business owners with limited internet access.
- User Feedback: Early testers (12 small business owners and 8 students) reported a 65% reduction in time spent on niche tasks after using OAI.
What we learned
- Modularity beats monoliths: Building the system as interchangeable modules (RAG, LLM, frontend) made it easier to debug and iterate on individual components.
- Quantization is a game-changer: We didn’t realize how effective 4-bit quantization would be for reducing model size while preserving quality—this opened up new possibilities for edge deployment.
- User testing is non-negotiable: Our initial UI for skill packs was too technical, and only after 3 rounds of user feedback did we land on a drag-and-drop interface that worked for non-developers.
- Legal and ethical considerations: Scraping public docs required careful review of robots.txt and terms of service to avoid copyright issues.
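The robots.txt review can be automated with Python's standard library. The rules below are a made-up example for illustration; against a live site you would use `rp.set_url(...)` and `rp.read()` instead of parsing inline lines:

```python
from urllib.robotparser import RobotFileParser

# Parse an example robots.txt (hypothetical rules, not any real site's).
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
    "Allow: /docs/",
])

print(rp.can_fetch("*", "https://example.com/docs/themes"))    # True
print(rp.can_fetch("*", "https://example.com/private/data"))   # False
```

Running a check like this before every fetch is cheap insurance against scraping pages a site has asked crawlers to skip.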
What's next for OAI
- Expand Skill Packs: Add pre-built packs for healthcare billing, freelance contract drafting, and high-school physics.
- Multi-language Support: Localize the model and knowledge bases for Spanish and French, targeting users in Latin America and Europe.
- API Marketplace: Let developers build and sell custom skill packs, creating an ecosystem around OAI.
- Mobile App: Release a native iOS/Android app to make offline access even more seamless.
- Enterprise Plan: Offer a self-hosted version for companies that need to keep data on-premises, with SSO and audit logs.