Inspiration

Cloud AI APIs like OpenAI's include critical extras that local models lack: chat formatting, function calling, embeddings, and tool use. We wanted to bridge that gap and empower local-first AI development.

What it does

LocalAI+ is an OpenAI-compatible API wrapper for local LLMs. It adds chat formatting, function calling, embeddings, secure tool use, and RAG — all locally, with zero cloud dependencies.
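"OpenAI-compatible" means the server returns the same response envelope the OpenAI Chat Completions API does, so existing client code keeps working when pointed at a local endpoint. Here is a minimal stdlib sketch of that envelope (the `make_chat_completion` helper is illustrative, not LocalAI+'s actual code):

```python
import json
import time
import uuid

def make_chat_completion(model: str, content: str) -> dict:
    """Build an OpenAI-style chat completion response so existing
    OpenAI client code can parse replies from a local server.
    Field layout mirrors the public Chat Completions response shape."""
    return {
        "id": f"chatcmpl-{uuid.uuid4().hex[:12]}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": content},
                "finish_reason": "stop",
            }
        ],
    }

resp = make_chat_completion("llama3", "Hello from a local model!")
print(json.dumps(resp, indent=2))
```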

How we built it

We used Python and FastAPI to build the backend, Ollama to serve local LLMs, Qdrant for vector search, and added a plugin system for tools and function calling. All wrapped in an OpenAI-style interface with full API docs.
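The plugin system boils down to a registry that maps each tool name to a callable plus a JSON-schema description the model can be shown for function calling. A minimal sketch of that idea (class and method names here are hypothetical, not the project's actual API):

```python
from typing import Any, Callable, Dict

class ToolRegistry:
    """Minimal plugin registry: each tool registers a callable plus a
    JSON-schema-style spec that can be advertised to the model."""

    def __init__(self) -> None:
        self._tools: Dict[str, Dict[str, Any]] = {}

    def register(self, name: str, description: str, parameters: dict):
        def decorator(fn: Callable) -> Callable:
            self._tools[name] = {
                "fn": fn,
                "spec": {
                    "name": name,
                    "description": description,
                    "parameters": parameters,
                },
            }
            return fn
        return decorator

    def specs(self) -> list:
        # What gets sent to the model as the available tools.
        return [t["spec"] for t in self._tools.values()]

    def call(self, name: str, arguments: dict) -> Any:
        return self._tools[name]["fn"](**arguments)

registry = ToolRegistry()

@registry.register(
    "add",
    "Add two numbers",
    {"type": "object",
     "properties": {"a": {"type": "number"}, "b": {"type": "number"}}},
)
def add(a: float, b: float) -> float:
    return a + b

print(registry.call("add", {"a": 2, "b": 3}))  # prints 5
```

New tools plug in with one decorator, which is what makes the architecture "plug-and-play": the server can advertise `registry.specs()` to the model and dispatch its tool calls through `registry.call()`.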

Challenges we ran into

  • Making function calling robust and schema-compliant
  • Handling edge cases in local model output
  • Safely executing code and tools in a sandboxed environment
  • Designing a clean, modular plugin architecture that works out of the box
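The first two challenges interact: local models often wrap a function call's JSON in prose or code fences. One common tactic, sketched below under the assumption of plain-JSON tool calls (the helper name is illustrative), is to extract the first JSON object from the raw completion and reject it unless the required keys are present:

```python
import json
import re
from typing import Optional

def extract_tool_call(raw: str, required: set) -> Optional[dict]:
    """Pull the first JSON object out of a model completion (local
    models often wrap JSON in prose or code fences) and check that
    all required keys are present."""
    # Strip markdown code fences if present.
    raw = re.sub(r"```(?:json)?", "", raw)
    # Grab the outermost {...} span.
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if not match:
        return None
    try:
        obj = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    if not required.issubset(obj):
        return None
    return obj

messy = ('Sure! Here is the call:\n```json\n'
         '{"name": "search", "arguments": {"q": "qdrant"}}\n```')
call = extract_tool_call(messy, {"name", "arguments"})
print(call)
```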

Accomplishments that we're proud of

  • Fully OpenAI-compatible local API
  • Function calling and embedding support with zero cloud
  • Secure sandboxed code execution
  • Plug-and-play architecture for adding new tools
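For a feel of what sandboxed execution involves: the sketch below runs untrusted code in a separate interpreter process with a hard timeout. This is only an illustration of the general idea, not LocalAI+'s mechanism; the project's stack lists Pyodide, which isolates code via WebAssembly instead.

```python
import subprocess
import sys

def run_sandboxed(code: str, timeout: float = 5.0) -> str:
    """Run untrusted code in a separate interpreter process with a
    hard timeout. A sketch only: real isolation needs much more.
    -I runs Python in isolated mode (no user site dir, env ignored)."""
    result = subprocess.run(
        [sys.executable, "-I", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout

print(run_sandboxed("print(2 + 2)"))
```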

What we learned

  • Local models are powerful but need orchestration to be useful
  • Developers want open, local-first APIs — but simplicity is critical
  • Rebuilding cloud-level infra locally is hard, but incredibly rewarding

What's next for LocalAI+

  • Agent memory and threading
  • API key-based auth
  • Web-based dev playground
  • Prebuilt tool library for plug-and-play LLM apps
  • Model backend switching and load balancing

Built With

  • docker
  • fastapi
  • llama
  • nomic-embed-text
  • ollama
  • openapi
  • pyodide
  • python
  • qdrant