Inspiration

Most LLM APIs bind every request to a fixed model or leave model selection to the developer, which often means paying more, waiting longer, or getting lower-quality output than necessary. We wanted to automate model selection with ML techniques to make inference more efficient and cost-effective.

What it does

Adaptive dynamically selects the best LLM for each query based on cost, speed, and accuracy. It:

  • Classifies user queries using an NVIDIA-accelerated ML model.
  • Routes requests to the most suitable LLM provider, such as Groq, OpenAI, or DeepSeek.
  • Provides an OpenAI-compatible API, so developers need to change only three lines of code.
  • Includes a chatbot frontend for easy interaction.
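
The classify-then-route flow described above can be sketched roughly as follows. This is a minimal illustration, not Adaptive's actual implementation: the keyword-based `classify_query` merely stands in for the NVIDIA-accelerated classifier, and the task labels, the `ROUTES` table, and the model names are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    provider: str  # e.g. "groq", "openai", "deepseek"
    model: str     # placeholder name, not a real model ID


# Hypothetical routing table: task label -> provider/model.
ROUTES = {
    "code": Route("deepseek", "code-model"),
    "reasoning": Route("openai", "large-model"),
    "chat": Route("groq", "fast-model"),
}


def classify_query(query: str) -> str:
    """Toy stand-in for the ML classifier: keyword heuristics only."""
    text = query.lower()
    if any(word in text for word in ("function", "bug", "compile")):
        return "code"
    if any(word in text for word in ("prove", "why", "explain")):
        return "reasoning"
    return "chat"


def route(query: str) -> Route:
    """Pick a provider/model for the query based on its task label."""
    return ROUTES[classify_query(query)]
```

In a real deployment the label would come from the ML service, and the Go backend would own the routing table; the sketch only shows how the two steps fit together.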

How we built it

  • Go Backend Service – Handles request routing and interfaces with LLM providers.
  • Python ML Service – Deploys NVIDIA-based classifiers to analyze and classify user queries.
  • Python SDK – Ensures seamless integration with existing OpenAI-based applications.
  • Chatbot Frontend – A user-friendly interface for interacting with Adaptive.

Challenges we ran into

  • Optimizing model selection to balance cost and performance.
  • Deploying fast and efficient ML classifiers for real-time query analysis.
  • Maintaining OpenAI API compatibility while supporting multiple LLM providers.
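
One simple way to frame the cost/performance trade-off is a weighted score over candidate models. The sketch below is an assumption for illustration, not the selection logic Adaptive actually uses: the `Candidate` fields, the normalization to the 0..1 range, and the weights are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    cost: float     # normalized to 0..1, lower is better
    latency: float  # normalized to 0..1, lower is better
    quality: float  # normalized to 0..1, higher is better


def score(c: Candidate, w_cost: float, w_latency: float, w_quality: float) -> float:
    """Reward quality, penalize cost and latency."""
    return w_quality * c.quality - w_cost * c.cost - w_latency * c.latency


def pick(candidates, w_cost=1.0, w_latency=1.0, w_quality=1.0) -> Candidate:
    """Return the candidate with the highest weighted score."""
    return max(candidates, key=lambda c: score(c, w_cost, w_latency, w_quality))
```

Raising `w_quality` pushes selection toward larger, slower models, while raising `w_cost` or `w_latency` favors cheaper or faster ones, so the weights could be tuned per query class.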

Accomplishments that we're proud of

  • Built a Go-based inference engine that seamlessly integrates with multiple LLM providers.
  • Developed an ML-powered query classifier that drives model selection to balance cost and accuracy.
  • Created an OpenAI-compatible SDK, enabling effortless adoption by developers.

What we learned

  • Fine-tuning NVIDIA-based classifiers for real-time decision-making.
  • Efficient request routing and load balancing for multi-model inference.
  • Designing an intuitive developer experience for seamless API adoption.

What's next for Adaptive

  • Expanding model support beyond existing providers.
  • Improving classification accuracy with more advanced ML techniques.
  • Optimizing inference speed through caching and preemptive routing.
  • Enhancing developer tools, including SDKs for more languages.
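
As a rough idea of what caching could look like, the sketch below memoizes classification results for normalized queries so that repeated prompts skip the classifier. It is only an assumption about one possible design: `expensive_classify` is a hypothetical stand-in for the ML service call, and the cache size is arbitrary.

```python
from functools import lru_cache


def normalize(query: str) -> str:
    """Collapse whitespace and case so near-identical queries share a cache key."""
    return " ".join(query.lower().split())


def expensive_classify(normalized_query: str) -> str:
    """Hypothetical stand-in for a call to the ML classification service."""
    return "long" if len(normalized_query.split()) > 20 else "short"


@lru_cache(maxsize=1024)
def cached_label(normalized_query: str) -> str:
    """Classify once per distinct normalized query; later calls hit the cache."""
    return expensive_classify(normalized_query)


def cached_classify(query: str) -> str:
    return cached_label(normalize(query))
```

An in-process `lru_cache` only helps a single worker; a shared cache would be needed if several service instances handle requests.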

