Inspiration
Cloud AI APIs such as OpenAI's bundle critical extras (chat formatting, function calling, embeddings, and tool use) that local models lack out of the box. We wanted to bridge that gap and empower local-first AI development.
What it does
LocalAI+ is an OpenAI-compatible API wrapper for local LLMs. It adds chat formatting, function calling, embeddings, secure tool use, and RAG — all locally, with zero cloud dependencies.
How we built it
We built the backend with Python and FastAPI, used Ollama to serve local LLMs and Qdrant for vector search, and added a plugin system for tools and function calling. Everything is wrapped in an OpenAI-style interface with full API docs.
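To give a feel for the OpenAI-compatible layer, here is a minimal sketch of how raw local-model output can be wrapped in an OpenAI-style chat completion payload. The helper name `make_chat_completion` is hypothetical (not from the actual codebase); the field names follow OpenAI's published response format.

```python
import time
import uuid

def make_chat_completion(model: str, text: str) -> dict:
    """Wrap raw local-model text in an OpenAI-style chat.completion dict.

    Hypothetical helper for illustration; field names mirror the
    OpenAI chat completions response shape.
    """
    return {
        "id": f"chatcmpl-{uuid.uuid4().hex[:12]}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": text},
                "finish_reason": "stop",
            }
        ],
    }

resp = make_chat_completion("llama3", "Hello from a local model!")
print(resp["choices"][0]["message"]["content"])
```

Because the response shape matches what existing OpenAI client libraries expect, those clients can point at the local server without code changes.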
Challenges we ran into
- Making function calling robust and schema-compliant
- Handling edge cases in local model output
- Safely executing code and tools in a sandboxed environment
- Designing a clean, modular plugin architecture that works out of the box
Accomplishments that we're proud of
- Fully OpenAI-compatible local API
- Function calling and embedding support with zero cloud
- Secure sandboxed code execution
- Plug-and-play architecture for adding new tools
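The plug-and-play tool architecture can be sketched as a small registry where each tool self-registers via a decorator, so adding a tool is just defining a function. The names here (`tool`, `dispatch`, `TOOLS`) are illustrative assumptions, not the project's actual API.

```python
# Hypothetical plug-and-play tool registry: each tool registers itself
# with a decorator, so new tools require no changes to the dispatcher.
TOOLS = {}

def tool(name: str):
    """Decorator that registers a function under a tool name."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("add")
def add(a: float, b: float) -> float:
    """Toy tool: add two numbers."""
    return a + b

def dispatch(name: str, **kwargs):
    """Look up a registered tool by name and invoke it."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)

result = dispatch("add", a=2, b=3)  # 5
```

In a full system the dispatcher would sit behind the function-calling endpoint, executing whichever tool the model selects, inside the sandbox.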
What we learned
- Local models are powerful but need orchestration to be useful
- Developers want open, local-first APIs — but simplicity is critical
- Rebuilding cloud-level infra locally is hard, but incredibly rewarding
What's next for LocalAI+
- Agent memory and threading
- API key-based auth
- Web-based dev playground
- Prebuilt tool library for plug-and-play LLM apps
- Model backend switching and load balancing