Maestro AI

Inspiration

As students, we witnessed a frustrating "Status Quo": our peers were often paying $20/month for a single AI subscription that did not always fit their specific needs. Some models excel at math, others at creative writing, and others at real-time research.

We asked: Why should students be limited to one "brain" when they could have an entire orchestra? This inspired Maestro AI, a platform designed to route student queries to the most efficient and specialized model for the task.

What it does

Maestro AI is an intelligent model orchestrator that acts as a central "Conductor" for AI. Instead of manually switching between apps, users interact with a single interface. Maestro then uses a classification layer to route the prompt to the optimal specialist:

STEM & Logic: Routed to GPT-4o.

Essays & Polishing: Routed to Claude 3.5.

Real-time Research: Routed to Perplexity Sonar.

Quick Tasks: Routed to Llama 3.1 (to save user credits).
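The routing step can be illustrated with a minimal sketch. It assumes a simple keyword-based classifier, which is only a stand-in: the write-up does not say how the classification layer is implemented. The names `Category`, `RULES`, `classify`, and `routePrompt` are hypothetical, and the route values are the display names from the list above rather than real API model identifiers.

```typescript
// Hypothetical categories mirroring the four routes described above.
type Category = "stem" | "writing" | "research" | "quick";

// Display names from the write-up; real API model identifiers will differ.
const ROUTES: Record<Category, string> = {
  stem: "GPT-4o",
  writing: "Claude 3.5",
  research: "Perplexity Sonar",
  quick: "Llama 3.1",
};

// Assumed keyword rules, checked in order. A real classifier could be
// a small model call instead of regular expressions.
const RULES: Array<{ category: Category; pattern: RegExp }> = [
  { category: "research", pattern: /\b(latest|today|news|current|source)s?\b/i },
  { category: "stem", pattern: /\b(solve|prove|integral|derivative|equation|algorithm|debug)\b/i },
  { category: "writing", pattern: /\b(essay|rewrite|polish|proofread|thesis|paragraph)\b/i },
];

// Return the first matching category, falling back to the cheap "quick" route.
function classify(prompt: string): Category {
  for (const rule of RULES) {
    if (rule.pattern.test(prompt)) return rule.category;
  }
  return "quick";
}

// Map a prompt to the name of the model that should handle it.
function routePrompt(prompt: string): string {
  return ROUTES[classify(prompt)];
}
```

Falling back to the smallest model when no rule matches keeps unclassified prompts on the cheapest route, which fits the credit-saving goal described above.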

How we built it

Frontend: Developed using Lovable and React for a seamless student dashboard.

Backend: Developed using Trae and powered by Supabase Edge Functions (TypeScript/Deno) to handle routing and security.

AI Gateway: Integrated Keywords AI for multi-model API orchestration and real-time cost telemetry.

Database: Supabase (Postgres) manages user credits, request logs, and our "Thread-based" memory system.

Challenges we ran into

The most significant technical challenge was context persistence. When Maestro switches from a Llama model to a Claude model mid-conversation, the new model has no memory of the previous chat. We overcame this by building custom lookback logic in Supabase that retrieves the last $$N$$ messages from request_logs, represented as:

$$M = \{ (\text{role}_i, \text{content}_i) \mid i \in \{1, \dots, N\} \}$$

We then "re-feed" this history into the next model to maintain a seamless user experience.

We also faced steep learning curves with cloud deployment, particularly resolving 404 NOT_FOUND errors caused by deprecated model identifiers.
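As a rough illustration of the lookback step, the sketch below works on rows already fetched from request_logs rather than calling Supabase directly. The row shape (`thread_id`, `role`, `content`, `created_at`) and the function names `lookback` and `buildMessages` are assumptions made for this example, not the project's actual schema or code.

```typescript
type Role = "system" | "user" | "assistant";

// Assumed shape of a request_logs row; the real table may differ.
interface LogRow {
  thread_id: string;
  role: Role;
  content: string;
  created_at: string; // ISO 8601 timestamp, all in the same format and timezone
}

interface ChatMessage {
  role: Role;
  content: string;
}

// Return the last n messages of one thread, oldest first.
function lookback(rows: LogRow[], threadId: string, n: number): ChatMessage[] {
  if (n <= 0) return [];
  return rows
    .filter((r) => r.thread_id === threadId) // filter() copies, so sort() below is safe
    .sort((a, b) => a.created_at.localeCompare(b.created_at))
    .slice(-n)
    .map(({ role, content }) => ({ role, content }));
}

// Prepend the recovered history to the new prompt before calling the next model.
function buildMessages(history: ChatMessage[], newPrompt: string): ChatMessage[] {
  return [...history, { role: "user", content: newPrompt }];
}
```

Because the history is rebuilt from the log on every request, the model being called can change between turns without losing the thread.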

Accomplishments that we're proud of

We successfully implemented a "Savings Engine" that tracks the USD cost of every query using Keywords AI telemetry. By routing simple tasks to smaller, high-speed models, we reduced the average cost per query by over 60% compared to a standard GPT-4 call. Additionally, we are proud of our Row Level Security (RLS) policies in Supabase, which restrict each student's data to that student's own account.
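The arithmetic behind a savings figure like this can be sketched as follows. The example assumes per-query cost is derived from token counts and per-million-token prices. The `Usage` and `Price` types and both function names are hypothetical, and any prices passed in are placeholders rather than real vendor rates or Keywords AI output.

```typescript
// Token counts for one query (assumed to come from gateway telemetry).
interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Placeholder pricing in USD per million tokens; not real vendor rates.
interface Price {
  inputPerMTok: number;
  outputPerMTok: number;
}

// Cost of one query in USD.
function queryCostUsd(u: Usage, p: Price): number {
  return (u.inputTokens * p.inputPerMTok + u.outputTokens * p.outputPerMTok) / 1e6;
}

// Percentage saved by the routed model relative to a baseline model.
function savingsPercent(baselineUsd: number, routedUsd: number): number {
  if (baselineUsd <= 0) return 0; // avoid dividing by zero
  return ((baselineUsd - routedUsd) / baselineUsd) * 100;
}
```

Averaging `savingsPercent` over logged queries, with the baseline priced as if every query had gone to the larger model, is one way to arrive at an aggregate figure.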

What we learned

We learned the importance of distinguishing stateless from stateful session management in AI applications. We also learned how to apply token trimming to prevent long conversations from exceeding a model's context window, ensuring that:

$$T_{\text{total}} = T_{\text{history}} + T_{\text{new\_prompt}} \le T_{\text{limit}}$$

Here $$T_{\text{total}}$$ is the sum of the history tokens and the new prompt tokens, which must not exceed the model's context limit $$T_{\text{limit}}$$.
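One way to enforce this constraint is to drop the oldest messages until the remaining history plus the new prompt fits. The sketch below assumes a crude estimate of about four characters per token, where a real implementation would use the target model's tokenizer. `estimateTokens` and `trimHistory` are hypothetical names.

```typescript
interface ChatMessage {
  role: string;
  content: string;
}

// Rough heuristic assumed for this sketch: about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the newest messages whose estimated tokens, together with the new
// prompt, stay within the limit. Older messages are dropped first.
function trimHistory(history: ChatMessage[], newPrompt: string, limit: number): ChatMessage[] {
  const budget = limit - estimateTokens(newPrompt);
  const kept: ChatMessage[] = [];
  let used = 0;
  // Walk from newest to oldest and stop at the first message that does not fit.
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = estimateTokens(history[i].content);
    if (used + cost > budget) break;
    kept.unshift(history[i]);
    used += cost;
  }
  return kept;
}
```

Stopping at the first message that does not fit keeps the retained history contiguous, so the model never sees a conversation with gaps in the middle.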