Inspiration
We realised that while humans can read between the lines of messy documentation, AI agents cannot. We saw developers building incredible autonomous agents, only for them to fail because an API didn't support idempotency, had ambiguous field names, or lacked strict type enforcement. We wanted to build a "Lighthouse" for the Agentic Era, a tool that tells you exactly how ready your endpoint is to be handled by an AI.
What it does
RateMyAPI is an automated auditing tool that generates an Agentic AI Readiness Score for any API endpoint. It doesn't just check if your code runs; it evaluates if an LLM can understand, safely interact with, and reliably recover from errors within your API. It bridges the gap between traditional REST standards and the new requirements of autonomous function calling.
How we built it
Agentic Readiness Score: A comprehensive 0-10 grade of your API's AI-friendliness.
Semantic Clarity Audit: Uses LLMs to analyse if your endpoint and parameter names are descriptive enough for an agent to pick the right tool.
Idempotency Verification: Checks if the API supports Idempotency-Key headers—a critical requirement for agents that might retry a request after a timeout to prevent duplicate actions (like double-billing).
Type Accuracy & Strictness: Evaluates how well the API enforces schemas to prevent "hallucinated" data types from breaking the backend.
Documentation "Friendliness": Scans descriptions to ensure they provide enough context for an LLM to understand when and why to call the endpoint.
Challenges we ran into
Quantifying "Clarity": "Clarity" is subjective. We overcame this by creating a "Confusion Matrix" for the LLM—asking it to rank how many similar-sounding endpoints might cause a "tool-selection collision."
Testing Idempotency: It's hard to verify idempotency without side effects. We had to build a "Mock-Sandbox" mode to simulate repeated requests and analyze header responses safely.
Weighted Scoring: Deciding if Idempotency is more important than Type Accuracy was a debate. We eventually settled on a "Safety-First" weighting where reliability features carry more weight than naming conventions.
Accomplishments that we're proud of
The Idempotency Validator: We successfully built a testing suite that doesn't just check for a header, but validates if an API can safely handle retries without side effects—a "Stripe-level" standard that is surprisingly rare in most modern APIs.
Zero-Config Infrastructure: We’re proud of our deployment pipeline. Moving from local development to a production-ready environment on DigitalOcean protected by Cloudflare was a major milestone that ensured our scoring engine could handle real-world traffic safely.
The "Agentic Lens": We developed a unique scoring algorithm that views documentation through the "eyes" of an LLM. Seeing the system correctly identify "ambiguous" parameter names that would cause an AI to fail was a true "Aha!" moment.
Hackathon Endurance: Going from a "RateMyAPI" concept to a functional auditor that can ingest an OpenAPI spec and output a weighted score in under 36 hours.
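The retry check described under Idempotency Verification and the Idempotency Validator can be sketched as follows. This is an added example, not RateMyAPI's own code: `MockSandbox`, `post_charge`, `audit_idempotency` and the 409 response on key reuse with a changed body are assumptions made for illustration, and the sandbox is a constructed in-memory endpoint so that replays have no real side effects.

```python
import hashlib, json, uuid

class MockSandbox:
    """Constructed in-memory stand-in for a payment endpoint (hypothetical)."""
    def __init__(self, honour_keys):
        self.honour_keys = honour_keys
        self.ledger = []   # side effects: one entry per charge created
        self.seen = {}     # Idempotency-Key -> (body fingerprint, response)

    def post_charge(self, headers, body):
        key = headers.get("Idempotency-Key")
        fp = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if self.honour_keys and key in self.seen:
            stored_fp, stored_resp = self.seen[key]
            if stored_fp != fp:
                # Same key, different payload: refuse rather than guess.
                return 409, {"error": "idempotency key reused with different body"}
            return stored_resp   # replay: no new side effect
        charge_id = f"ch_{len(self.ledger) + 1}"
        self.ledger.append((charge_id, body))
        resp = (201, {"id": charge_id})
        if self.honour_keys and key:
            self.seen[key] = (fp, resp)
        return resp

def audit_idempotency(sandbox):
    """Replay a request as a retrying agent would and inspect the side effects."""
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    body = {"amount": 500, "currency": "usd"}
    first = sandbox.post_charge(headers, body)
    retry = sandbox.post_charge(headers, body)       # agent retries after a timeout
    after_retry = len(sandbox.ledger)
    changed = sandbox.post_charge(headers, {**body, "amount": 900})
    after_conflict = len(sandbox.ledger)
    fresh = sandbox.post_charge({"Idempotency-Key": str(uuid.uuid4())}, body)
    return {
        "retry_has_no_side_effect": after_retry == 1,
        "retry_returns_same_response": first == retry,
        "key_reuse_with_new_body_rejected": changed[0] >= 400 and after_conflict == 1,
        "new_key_creates_new_resource":
            fresh[0] == 201 and len(sandbox.ledger) == after_conflict + 1,
    }

if __name__ == "__main__":
    for name, sb in [("honours key", MockSandbox(True)), ("ignores key", MockSandbox(False))]:
        print(name, audit_idempotency(sb))
```

An endpoint that merely accepts the header without deduplicating fails the first three checks, which is the difference between checking for a header and validating retry safety.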
What we learned
The "LX" (LLM Experience) Gap: We learned that an API can be "perfect" for a human developer but a total nightmare for an AI agent. Small things like missing Idempotency-Key support or vague field names like data1 are the silent killers of autonomous agents.
Security at the Edge: Setting up Cloudflare taught us the importance of protecting AI-facing infrastructure from bot-driven "hallucination attacks" and ensuring high availability for critical API audits.
The Importance of Determinism: We learned how to balance the creative reasoning of an LLM (for semantic clarity) with deterministic code (for type and protocol checks) to create a score that feels both "smart" and technically accurate.
DevOps on the Fly: Scaling our backend on DigitalOcean taught us the value of containerized workflows and how to manage environment variables securely for third-party API integrations.
What's next for RateMyAPI
MCP Integration: Automatically generating a Model Context Protocol (MCP) server for any API that scores above an 80.
Auto-Remediation: A "Fix my API" button that suggests better names and adds boilerplate code for idempotency headers.
CI/CD Plugin: A GitHub Action that fails a build if a PR drops the Agentic Readiness Score below a certain threshold.
Historical Data: Logs the score report for every call and shows historical data for APIs from their root.