Inspiration
Our project's spark ignited from a frustration with navigating the landscape of Large Language Models (LLMs). Identifying the optimal model for specific tasks and prompt styles took a lot of work. We sought to eliminate the trial and error of testing every prompt with multiple LLMs, aiming to create a unified platform that simplifies this process, making AI more accessible and tailored to individual needs.
What it does
Our platform serves as a comparative tool that evaluates the performance of various LLMs against user-provided prompts, offering insights into which model best aligns with their prompting style. It's designed not to rank these models by their inherent capabilities but to guide users towards the one that resonates with their unique approach to prompting. Additionally, it categorizes prompts based on domains, further personalizing the recommendation process. Future functionalities will include prompt optimization and template guidance to enhance user interaction with AI.
How we built it
We leveraged AWS technologies, utilizing Lambda functions for backend processing and API Gateway for seamless interaction between the front end and our serverless backend. The front end was developed with Bootstrap for a responsive and intuitive user interface. At the core of our solution, we utilized Bedrock APIs to interface with prominent LLMs like Claude, LLAMA, and Titan, enabling us to broadly compare model performances.
Challenges we ran into
Our team initially faced hurdles with the technical aspects of AWS and the complex nature of working with multiple LLMs. The learning curve was steep, as we had to familiarize ourselves with various technologies and the principles of prompt engineering to effectively build our platform.
Accomplishments that we're proud of
We are particularly proud of creating a user-friendly platform that demystifies selecting the right LLM for specific needs. We overcame technical challenges and successfully integrated multiple LLMS into a cohesive solution.
What we learned
Throughout this journey, we gained invaluable insights into the nuances of prompt engineering and the significant impact of tailored prompts on the efficacy of LLM interactions. Our exploration into the world of LLMs deepened our understanding of AWS services and the diverse capabilities of different LLMs. This knowledge was crucial in laying the foundation for our platform. While we have yet to implement some of our envisioned features fully, the process has equipped us with a solid plan of action and the technical know-how to realize our goals.
What's Next for The Big Picture
We have big plans for making our platform even better. Our next steps involve using what we learn from our users to help pick the right AI model for their needs based on what they're trying to do. We will make it easier for everyone to get just the right help from AI by ensuring the questions they ask are a perfect match for the model they use. This means better answers and less confusion. We're also tackling a common AI problem where it sometimes gets things wrong or makes stuff up, known as 'hallucination.' We plan to fix this by having different AI models check each other's work. This way, we can trust the answers and ensure they're based on real information. It's all about making AI more reliable and useful for everyone.
Built With
- amazon-web-services
- bedrock
- bootstrap
- claude
- lambda
- llama
- titan
Log in or sign up for Devpost to join the conversation.