Inspiration

The inspiration for LocalLogic is the "Token Tax"—the economic crisis facing modern developers. We realized that cloud-native tools like Cursor and Bolt charge users for every token, leading to monthly bills as high as $2,595 for heavy users. We asked: Why rent intelligence when we have powerful NPUs in our pockets? Our goal is to move the "thinking" process from the cloud to the device silicon.

What it does

LocalLogic is designed as a Privacy-First, Offline-Native AI Agent: an architectural blueprint for a "Logic Sidecar" that runs entirely on mobile hardware.

  • Sovereign Logic: It runs DeepSeek-R1-Distill models locally to reason through code without internet access.
  • The "Logic Trace": Inspired by Kiro's spec-driven development, the app captures the model's `<think>` reasoning tags to auto-generate a LOGIC_TRACE.md, ensuring code changes are documented and audit-proof (see the sketch after this list).
  • Zero-Cost Coding: By running on-device, it reduces the marginal cost of AI assistance to $0.00 and eliminates the "context bloat" that plagues cloud tools.
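
As a rough illustration of the Logic Trace step, here is a minimal Swift sketch that pulls the reasoning out of a DeepSeek-R1 completion's `<think>` block and appends it to LOGIC_TRACE.md. The `LogicTraceWriter` type and the Markdown record layout are our own placeholders, not a finalized format.

```swift
import Foundation

// Minimal sketch: extract the reasoning between <think>...</think> in a
// DeepSeek-R1 completion and append it to LOGIC_TRACE.md.
// LogicTraceWriter and the record layout are placeholders, not a final format.
struct LogicTraceWriter {
    let traceURL: URL   // e.g. the project's LOGIC_TRACE.md

    func record(completion: String, summary: String) throws {
        // Everything between the <think> tags is the model's chain of thought.
        let reasoning = completion
            .range(of: "<think>")
            .flatMap { start in
                completion.range(of: "</think>", range: start.upperBound..<completion.endIndex)
                    .map { end in String(completion[start.upperBound..<end.lowerBound]) }
            } ?? ""

        let entry = """

        ## \(ISO8601DateFormatter().string(from: Date()))
        **Change:** \(summary)

        **Reasoning trace:**
        \(reasoning.trimmingCharacters(in: .whitespacesAndNewlines))
        """

        // Append to the existing trace file, creating it on first use.
        let existing = (try? String(contentsOf: traceURL, encoding: .utf8)) ?? ""
        try (existing + entry + "\n").write(to: traceURL, atomically: true, encoding: .utf8)
    }
}
```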

How we built it

We have designed a feasibility architecture based on three verified technologies:

  1. Inference Engine: We will utilize the RunAnywhere SDK (iOS/Android). This SDK provides the necessary infrastructure to run GGUF models like DeepSeek-R1-Distill-Qwen-1.5B on-device, handling the complex memory management required to prevent OS termination.
  2. Context Management: Instead of full-repo RAG, we will implement Single-File Vectorization using VecturaKit (iOS) or FAISS. This allows us to index only the active file's variables and functions for sub-100ms retrieval latency.
  3. Data Layer: The app operates as a "Thin Client" for GitHub. We will use the GitHub REST API to fetch raw file strings (the contents endpoint with the raw media type) and to construct commits programmatically (POST /git/blobs and the related Git database endpoints), bypassing heavy git clone operations (see the sketch after this list).
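
To make the thin-client idea concrete, here is a minimal Swift sketch of the data layer against the standard GitHub REST API. The owner/repo/token values are placeholders, and a production build would also create the tree, commit, and ref objects after the blob.

```swift
import Foundation

// Sketch of the "thin client" data layer: pull one file as a raw string and push
// an edited version back as a blob, with no local clone. Endpoints are the
// standard GitHub REST API; token/owner/repo are placeholders.
struct GitHubThinClient {
    let token: String
    let owner: String
    let repo: String

    private func request(_ path: String, accept: String) -> URLRequest {
        var req = URLRequest(url: URL(string: "https://api.github.com/\(path)")!)
        req.setValue("Bearer \(token)", forHTTPHeaderField: "Authorization")
        req.setValue(accept, forHTTPHeaderField: "Accept")
        return req
    }

    // GET /repos/{owner}/{repo}/contents/{path} with the raw media type
    // returns the file body directly, so no base64 decoding is needed.
    func fetchFile(path: String) async throws -> String {
        let req = request("repos/\(owner)/\(repo)/contents/\(path)",
                          accept: "application/vnd.github.raw")
        let (data, _) = try await URLSession.shared.data(for: req)
        return String(decoding: data, as: UTF8.self)
    }

    // POST /repos/{owner}/{repo}/git/blobs stores the edited content and returns
    // the blob SHA, which a follow-up tree/commit/ref call can reference.
    func createBlob(content: String) async throws -> String {
        var req = request("repos/\(owner)/\(repo)/git/blobs",
                          accept: "application/vnd.github+json")
        req.httpMethod = "POST"
        req.setValue("application/json", forHTTPHeaderField: "Content-Type")
        req.httpBody = try JSONSerialization.data(
            withJSONObject: ["content": content, "encoding": "utf-8"])
        let (data, _) = try await URLSession.shared.data(for: req)
        let json = try JSONSerialization.jsonObject(with: data) as? [String: Any]
        return json?["sha"] as? String ?? ""
    }
}
```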

Challenges we ran into

• The "Memory Wall": Mobile OSs kill background processes that use too much RAM. ◦ Planned Solution: We will use the RunAnywhere SDK's Smart Memory feature to dynamically unload model layers during high pressure. • Model Accuracy: Can a small 1.5B model actually code? ◦ Feasibility Check: According to DeepSeek benchmarks, the R1-Distill-1.5B model achieves 83.9% on MATH-500, outperforming larger models like GPT-4o in specific reasoning tasks. This proves a small model is sufficient for logic fixes. • Background Persistence: Keeping the AI "thinking" when the screen is locked. ◦ Planned Solution: We will implement iOS BGTaskScheduler and Android Foreground Services to negotiate execution time with the OS for long-running inference.

Accomplishments that we're proud of

• Validated Architecture: We successfully mapped DeepSeek-R1's reasoning capabilities onto mobile constraints, confirming that 4-bit quantized models (~1.2 GB) fit comfortably within the RAM envelope of modern devices like the iPhone 16 or Pixel 9.
• Economic Viability: We demonstrated a theoretical cost reduction of roughly 99% for daily coding tasks by shifting from cloud APIs (approx. $15 per million output tokens) to local silicon (a back-of-envelope illustration follows this list).
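
A back-of-envelope illustration of the gap, using assumed usage numbers rather than measured ones:

```swift
// Illustration only: token volume and price below are assumptions, not measurements.
let outputTokensPerDay = 200_000.0        // assumed heavy-usage output volume
let cloudPricePerMillion = 15.0           // ~$15 per 1M output tokens (cloud API)
let workdaysPerMonth = 22.0

let cloudMonthly = outputTokensPerDay / 1_000_000 * cloudPricePerMillion * workdaysPerMonth
let localMonthly = 0.0                    // marginal cost of on-device inference
print("Cloud: $\(cloudMonthly)/month vs local: $\(localMonthly)/month")
// Cloud: $66.0/month vs local: $0.0/month (before input tokens and context resends)
```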

What we learned

• Size Isn't Everything: A 1.5B-parameter model trained on reasoning data (chain-of-thought) outperforms larger generic models (like GPT-4o) on specific logical tasks.
• Context is Expensive: 90% of daily coding tasks (bug fixes, small refactors) only require the context of a single file. Sending the entire repo to the cloud is a massive economic inefficiency.
• The Mobile NPU is Ready: Modern chips like the Snapdragon 8 Elite and Apple A18 Pro are now capable of running agentic workloads that were previously restricted to data centers.

What's next for LocalLogic

• Prototype Development: Integrating the RunAnywhere SDK into a React Native or Swift shell to prove that "first token" latency is under 200ms (see the sketch after this list).
• Spec-Driven Workflow: Implementing a Kiro-style "Hooks" system to automatically trigger tests when the local AI generates code.
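
A rough sketch of how we plan to measure time-to-first-token; the streaming interface here is an assumed stand-in, not the RunAnywhere SDK's actual API.

```swift
import Foundation

// Sketch of the planned time-to-first-token check. `streamTokens` stands in for
// whatever streaming interface the inference layer exposes; its shape is an
// assumption, not the SDK's real API.
func timeToFirstToken(
    prompt: String,
    streamTokens: (String) -> AsyncStream<String>
) async -> TimeInterval? {
    let start = Date()
    for await _ in streamTokens(prompt) {
        return Date().timeIntervalSince(start)   // stop at the very first token
    }
    return nil                                   // stream ended with no output
}

// Target: the median over a handful of warm-start prompts stays under 0.2 s.
```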

Built With

  • github
  • inference
  • kotlin
  • on-device
  • rest
  • runanywheresdk
  • swift
  • vecturakit