Inspiration

🚀 Orkestra Gemini — Intelligent Model Routing for AI Apps

Build Gemini-powered AI apps without worrying about which model to call.

Orkestra Gemini intelligently routes each API request to the most cost-efficient Gemini model based on the prompt — maximizing performance and savings.

🧠 Why It Matters for Developers

Traditional AI integrations send every request to a single general model — even simple ones like math, extraction, or formatting — which drives up cost and latency.

With Orkestra:

  • 🪄 One integration: Install once, call like normal
  • 📊 Automatic routing: KNN-based router chooses the most appropriate Gemini model for each prompt
  • 💸 Smart cost optimization: Cheaper models handle lightweight tasks, premium models handle complex thinking
  • 🔁 Zero code rewrites: Your app code stays unchanged — only smarter under the hood

🛠 How It Works

  1. The prompt is embedded with a lightweight encoder
  2. The router finds the closest labeled training examples
  3. It selects the most appropriate Gemini tier for the prompt
  4. It calls that model and returns the result along with actual cost & savings info
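The routing steps above can be sketched as a k-nearest-neighbors vote over labeled example prompts. This is a minimal illustration, not Orkestra's actual implementation: the bag-of-words "embedding", the training prompts, and the tier names (`gemini-flash`, `gemini-pro`) are all placeholder assumptions standing in for a real learned encoder and real model IDs.

```python
from collections import Counter
import math

# Toy "embedding": bag-of-words counts. A real router would use a
# learned sentence encoder here; this just makes the sketch runnable.
def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Labeled training prompts: (prompt, model tier). Entirely illustrative.
TRAINING = [
    ("add these two numbers", "gemini-flash"),
    ("format this date as iso 8601", "gemini-flash"),
    ("extract the email address from this text", "gemini-flash"),
    ("write a detailed analysis of this legal contract", "gemini-pro"),
    ("explain the proof of this theorem step by step", "gemini-pro"),
]

def route(prompt: str, k: int = 3) -> str:
    """Pick the tier by majority vote among the k most similar examples."""
    q = embed(prompt)
    scored = sorted(TRAINING, key=lambda ex: cosine(q, embed(ex[0])), reverse=True)
    votes = Counter(tier for _, tier in scored[:k])
    return votes.most_common(1)[0][0]

print(route("extract the phone number from this text"))  # → gemini-flash
```

A simple lookup like this keeps routing overhead tiny compared to the model call itself, which is what makes per-request routing worthwhile.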

Example:

```python
import orkestra

client = orkestra.Client(key="YOUR_GEMINI_API_KEY")
response = client.generate("Explain quantum computing")

print(response.text)
print(f"Model used: {response.model}")
print(f"Cost: ${response.cost:.6f}")
print(f"Savings vs base: {response.savings_percent:.1f}%")
```
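The `savings_percent` figure can be understood as the cost saved relative to always calling the base (premium) model. A one-line sketch, with hypothetical costs — the dollar amounts below are made up for illustration, not real Gemini pricing:

```python
def savings_percent(base_cost: float, routed_cost: float) -> float:
    """Percent saved by routing versus always calling the base model."""
    if base_cost <= 0:
        raise ValueError("base_cost must be positive")
    return (base_cost - routed_cost) / base_cost * 100

# Hypothetical: the premium model would have cost $0.010,
# the routed (cheaper) model actually cost $0.002.
print(f"{savings_percent(0.010, 0.002):.1f}%")  # → 80.0%
```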

📈 What You Get

- 🚀 Faster feedback loops  
- 🧮 Cost transparency  
- 🪙 Real savings on Gemini bills  
- ⚙️ Same API surface developers already expect

Built With
