In today’s world, AI is everywhere, but so are concerns about data privacy. People constantly wonder: Is my data safe? Could it leak? We wanted to create a way to use powerful AI without sending data to the cloud.

What it does

Pocket LLM lets you chat with powerful AI models like Llama, Gemma, DeepSeek, Apple Intelligence, and Qwen directly on your device. No internet, no account, no data sharing. Just fast, private AI powered by Apple MLX.

  • Works fully offline anywhere
  • No login, no data collection
  • Optimized for Apple Silicon for blazing speed
  • Supports multiple open models
  • Chat, write, and analyze instantly

How we built it

We used Apple MLX to run large language models locally on Apple Silicon devices. The app loads quantized models, optimizes memory usage, and provides a clean UI for seamless offline interaction.
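To make the memory side of this concrete, here is a rough back-of-the-envelope sketch in Swift of how bit width affects the size of a model's weights. It is not code from the app: the function names, the 3B parameter count, and the budget check are all illustrative assumptions, and the estimate ignores the KV cache, activations, and quantization scales.

```swift
import Foundation

/// Approximate size in bytes of model weights stored at a given bit width.
/// Assumption: ignores the KV cache, activations, and per-group quantization
/// scales, so real memory usage will be somewhat higher.
func estimatedWeightBytes(parameterCount: Double, bitsPerWeight: Double) -> Double {
    return parameterCount * bitsPerWeight / 8.0
}

/// Returns true if the estimated weights fit inside the given byte budget.
func fitsInMemory(parameterCount: Double, bitsPerWeight: Double, budgetBytes: Double) -> Bool {
    return estimatedWeightBytes(parameterCount: parameterCount,
                                bitsPerWeight: bitsPerWeight) <= budgetBytes
}

let gib = 1024.0 * 1024.0 * 1024.0
let params = 3_000_000_000.0 // hypothetical 3B-parameter model

// Compare full-precision-style storage with two quantized bit widths.
for bits in [16.0, 8.0, 4.0] {
    let sizeGiB = estimatedWeightBytes(parameterCount: params, bitsPerWeight: bits) / gib
    print(String(format: "%.0f-bit: %.2f GiB", bits, sizeGiB))
}
```

Under these assumptions, a 3B-parameter model needs roughly 5.6 GiB for its weights at 16 bits per weight but only about 1.4 GiB at 4 bits, which is the difference between not fitting and fitting on many phones.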

Challenges we ran into

  • Optimizing large models to fit within memory limits
  • Maintaining fast inference speed without overheating devices
  • Designing a lightweight UI that feels simple but powerful
  • Ensuring multi-model support without complexity
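One possible way to approach the overheating challenge is to scale back generation when the device reports thermal pressure. The sketch below uses Foundation's `ProcessInfo.thermalState`; the `maxTokens(for:baseline:)` helper and its thresholds are hypothetical and are not taken from the app.

```swift
import Foundation

/// Hypothetical policy: cap the number of tokens generated per response
/// based on the device's current thermal state.
func maxTokens(for state: ProcessInfo.ThermalState, baseline: Int = 512) -> Int {
    switch state {
    case .nominal:
        return baseline        // no thermal pressure, use the full budget
    case .fair:
        return baseline / 2    // slightly warm, halve the budget
    case .serious:
        return baseline / 4    // hot, generate short responses only
    case .critical:
        return 0               // pause generation until the device cools
    @unknown default:
        return baseline / 4    // be conservative for states added later
    }
}

// Read the current thermal state before starting a response.
let budget = maxTokens(for: ProcessInfo.processInfo.thermalState)
print("Token budget for this response: \(budget)")
```

A policy like this trades response length for sustained speed; the right thresholds would need to be tuned per device.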

Accomplishments that we’re proud of

  • Running multiple LLMs entirely offline on-device
  • Delivering a smooth chat experience without cloud servers
  • Creating a tool that respects privacy and ownership of data
  • Supporting a range of models with minimal setup

What we learned

  • On-device AI is more feasible than expected with Apple Silicon
  • Privacy-first design resonates strongly with users
  • Model quantization and optimization are key to performance

What’s next for Pocket LLM

  • Add voice input and text-to-speech for natural conversations
  • Expand support for more models and languages
  • Build a plugin system for tasks like summarization, coding help, and note-taking
  • Explore collaborative offline apps that run on local AI

Built With

  • appleintelligence
  • foundationmodels
  • mlxswift
  • swiftui