Inspiration

AutoGPT is cool and all, but it just writes text. How can we get it to actually take action and carry out all of its grand ideas?

All of our favorite tools have APIs these days, so technically, one could read the docs, build the integration, and do the thing - but it’s 2023 and we know better is possible.

We imagine a world where API documentation is automatically ingested in a way language models know how to interact with, and once users have connected their favorite apps, they can begin to use any website or app with natural language.

GPT-4 shows amazing potential at self-organizing its own goals, executing actions, and getting stuff done.

With a bit of inspiration from its human partners, we imagine complex workflows being autonomously executed with REAL services and REAL results, not just descriptions of them.

What it does

FlyDocs reads and comprehends any service with an API by analyzing its documentation, similar to how a developer would.

Next, FlyDocs lets you log in to popular services, which makes using and storing API keys more convenient.

Once your preferred apps are connected, the fun starts. Simply describe what you want to achieve in plain language. We use a LangChain agent to identify, generate, and validate the API calls needed to achieve your objectives.

FlyDocs can also ingest any application documentation and answer knowledge questions such as “how do I create a bot for Discord?” or “how can I use Twilio to send a text message?”. Using Berri.ai, FlyDocs can transform any documentation into a simple Q&A chatbot!

You can send a message, schedule a meeting, play some tunes on Spotify, or share something interesting with your friends on Discord. 🚀

How we built it

FlyDocs consists of three main components:

API Ingestion: When OpenAPI spec files describing a service exist, our life is easy - we can have LangChain ingest them to understand how to interact with the service. When that’s not the case, we use GPT-4 to generate OpenAPI spec files for popular services, such as Discord and Twilio.
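As a rough illustration of what ingesting a spec involves, the sketch below flattens the `paths` object of an OpenAPI document into one record per operation. It is not FlyDocs' or LangChain's actual code: the `list_operations` helper, the `EXAMPLE_SPEC` contents, and the `api.example.com` server URL are all made up for this example.

```python
from typing import Dict, List

# HTTP verbs that OpenAPI treats as operations inside a path item.
HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def list_operations(spec: Dict) -> List[Dict[str, str]]:
    """Flatten an OpenAPI `paths` object into one record per operation."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            operations.append({
                "method": method.upper(),
                "path": path,
                "summary": op.get("summary", ""),
            })
    return operations


# Hypothetical, heavily trimmed spec used only for illustration.
EXAMPLE_SPEC = {
    "openapi": "3.0.0",
    "servers": [{"url": "https://api.example.com"}],
    "paths": {
        "/channels/{channel_id}/messages": {
            "post": {"summary": "Send a message to a channel"},
        },
        "/me/player/queue": {
            "post": {"summary": "Add a track to the playback queue"},
            "get": {"summary": "Get the playback queue"},
        },
    },
}

if __name__ == "__main__":
    for op in list_operations(EXAMPLE_SPEC):
        print(op["method"], op["path"], "-", op["summary"])
```

A flat list like this is a convenient intermediate form: each record is short enough to show to a language model or to index for search.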

Program Execution: The execution of the human goal is currently done by an “OpenAPI Agent” from LangChain. This agent takes in the relevant parts of the OpenAPI spec to understand how to call APIs, and then does so.
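The last step, turning a chosen operation plus arguments into an HTTP call, can be sketched with the standard library alone. This is a simplified stand-in for what an agent does internally, not LangChain's implementation; `build_request`, the base URL, and the parameter values are hypothetical, and the request is only constructed, never sent.

```python
import json
import re
import urllib.parse
import urllib.request
from typing import Dict, Optional


def build_request(
    base_url: str,
    method: str,
    path_template: str,
    path_params: Dict[str, str],
    query: Optional[Dict[str, str]] = None,
    body: Optional[Dict] = None,
) -> urllib.request.Request:
    """Fill an OpenAPI path template and build (but do not send) a request."""

    def fill(match: "re.Match") -> str:
        # Raises KeyError if the caller did not supply a required path parameter.
        return urllib.parse.quote(str(path_params[match.group(1)]), safe="")

    path = re.sub(r"\{(\w+)\}", fill, path_template)
    url = base_url.rstrip("/") + path
    if query:
        url += "?" + urllib.parse.urlencode(query)

    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(url, data=data, headers=headers, method=method)


if __name__ == "__main__":
    # Hypothetical values; a real call would also need an auth header.
    req = build_request(
        "https://api.example.com",
        "POST",
        "/channels/{channel_id}/messages",
        {"channel_id": "123"},
        body={"content": "final pushes, only one hour to go!"},
    )
    print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes it possible to show the user the exact call, or validate it, before anything is executed against a real service.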

Web Experience: Building an experience that works for both technical and non-technical audiences is key to the adoption we are hoping for. We deployed a Next.js application using Repl.it and Vercel, along with a Python Flask application hosted on Repl.it that acts as our API with LangChain.

We use a startup called nango.dev to get a head start on allowing users to integrate with us.

Challenges we ran into

At times, we were unable to locate OpenAPI specifications for certain applications that piqued our interest. For example, Discord does not adhere to the OpenAPI standard. Fortunately, we were able to resolve this issue by asking ChatGPT to generate the specifications for us, which it promptly did! 🚀

The OpenAPI LangChain agent was also a bit of a black box for us. It does not seem to use embeddings of the endpoints to dynamically load in relevant endpoints, so requests against services with massive OpenAPI spec files, like GitHub, errored out due to inefficient use of the context window. We will need to re-imagine the OpenAPI LangChain agent and enhance it with embedding generation.
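The idea of loading only the relevant endpoints can be sketched as a ranking step that runs before the agent sees the spec. The example below is an assumption-heavy toy: it uses bag-of-words cosine similarity in place of real embeddings, and `top_endpoints` and the `ENDPOINTS` list are hypothetical. A real version would swap `Counter(tokenize(...))` for vectors from an embedding model.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_endpoints(query: str, endpoints: List[Dict[str, str]], k: int = 2) -> List[Dict[str, str]]:
    """Return the k endpoints whose summaries are most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        endpoints,
        key=lambda e: cosine(q, Counter(tokenize(e["summary"]))),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical endpoint summaries, standing in for a large spec.
ENDPOINTS = [
    {"method": "POST", "path": "/me/player/queue", "summary": "Add a track to the playback queue"},
    {"method": "POST", "path": "/messages", "summary": "Send an SMS message"},
    {"method": "POST", "path": "/channels/{channel_id}/messages", "summary": "Send a message to a channel"},
    {"method": "GET", "path": "/repos/{owner}/{repo}/issues", "summary": "List repository issues"},
]

if __name__ == "__main__":
    for endpoint in top_endpoints("add a song to my queue", ENDPOINTS, k=1):
        print(endpoint["method"], endpoint["path"])
```

Only the top-ranked operations would then be passed into the prompt, so the size of the context depends on `k` rather than on the size of the full spec.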

Accomplishments that we're proud of

We have successfully developed a fully functional MVP! 🚀 We tested multiple use cases and are proud to share some examples of natural-language queries that FlyDocs can handle:

  • Spotify integration: "Hey FlyDocs, add Another One Bites the Dust by Queen to my queue"
  • Twilio integration: "Hey FlyDocs, send an SMS to +xx xxx xxxxx saying happy birthday"
  • Discord integration: "Hey FlyDocs, send to the hackathon channel 'final pushes, only one hour to go! happy coding'"

What we learned

A lot can be done in a few hours, and the world of AI-powered apps is expanding quickly 🚀

What's next for FlyDocs

We are excited to add more integrations, test our product with real customers and explore its limits 🚀

Built With

  • berri.ai
  • flask
  • langchain
  • next.js