Inspiration
Building a solution for the Automated Prompt Engineering Challenge (Hackathon Challenge 4) was both exhilarating and challenging. Inspired by the rapid growth of AI and the power of large language models (LLMs), our team set out to develop a tool that automatically generates effective prompts for LLMs, reducing the manual effort needed to use these models in applications.
What it does
The task was daunting: develop a genetic algorithm that generates prompts for an LLM so that the model yields a desired output. We chose a pre-trained open-source LLM (GPT-2) and set about defining the components of our genetic algorithm. This included determining the structure of chromosomes (prompts) and genes (components of a prompt), creating a fitness function to evaluate the effectiveness of prompts, and designing crossover and mutation algorithms to optimize our solutions.
How we built it
We started by selecting GPT-2 as our LLM of choice, owing to its versatility and wide range of capabilities. Our next step was to define the genetic algorithm's structure. This involved:
Chromosome and Gene Definition: We conceptualized chromosomes as sequences of words or phrases that formed a complete prompt, with each word or phrase acting as a gene.
Fitness Function: We developed a function to evaluate how closely the LLM's output matched our desired output, which was a critical aspect of our tool.
Crossover and Mutation Algorithms: These algorithms were designed to introduce diversity and evolution into our prompt solutions, allowing the algorithm to explore a broader range of potential prompts.
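The three components above can be sketched as follows. This is a minimal illustration, not our actual implementation: the gene pool, the single-point crossover, the per-gene mutation rate, the keep-the-fitter-half selection, and every function name are assumptions chosen for brevity. The fitness function here is a toy stand-in so the sketch runs without loading a model.

```python
import random

# Hypothetical gene pool: the words or phrases a prompt may be built from.
GENE_POOL = ["Write", "a", "short", "poem", "about", "the", "sea",
             "Explain", "briefly", "story"]

def random_chromosome(rng, length=5):
    """A chromosome is a list of genes (words or phrases) forming one prompt."""
    return [rng.choice(GENE_POOL) for _ in range(length)]

def crossover(parent_a, parent_b, rng):
    """Single-point crossover: head of one parent joined to the tail of the other."""
    point = rng.randint(1, min(len(parent_a), len(parent_b)) - 1)
    return parent_a[:point] + parent_b[point:]

def mutate(chromosome, rng, rate=0.1):
    """Replace each gene with a random pool gene with probability `rate`."""
    return [rng.choice(GENE_POOL) if rng.random() < rate else gene
            for gene in chromosome]

def evolve(fitness, generations=20, pop_size=12, seed=0):
    """Evolve prompts, keeping the fitter half of the population each generation."""
    rng = random.Random(seed)
    population = [random_chromosome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)

# Toy stand-in fitness: counts how many target words appear in the prompt.
def toy_fitness(chromosome):
    return len({"poem", "sea"} & set(chromosome))

best = evolve(toy_fitness)
print(" ".join(best))
```

In a real run, the fitness function would join the genes into a prompt string, send it to the LLM, and score the model's output against the desired output.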
Challenges we ran into
One of our major challenges was developing an effective fitness function. It needed to accurately assess the quality of the LLM's output in response to a given prompt, which is not a straightforward task. Training also required significant compute power, resulting in long loading times.
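To make the fitness problem concrete, here is one simple way such a score could be computed. It is a sketch under stated assumptions, not the function we shipped: it uses character-level similarity from Python's standard `difflib` module, and `generate` is a hypothetical callable that maps a prompt to the model's text (for GPT-2 it might wrap a Hugging Face `transformers` text-generation pipeline). A stand-in generator is used here so the example runs without downloading a model.

```python
from difflib import SequenceMatcher

def fitness(prompt, target, generate):
    """Score in [0, 1]: similarity between the model's output and the desired output."""
    output = generate(prompt)
    return SequenceMatcher(None, output.lower(), target.lower()).ratio()

# Stand-in generator (assumption): simply echoes the prompt back.
def echo_generate(prompt):
    return prompt

score = fitness("Write a poem about the sea", "A poem about the sea", echo_generate)
print(round(score, 2))
```

Surface similarity like this is cheap but crude: two outputs can mean the same thing while sharing few characters, which is part of why scoring LLM output is hard.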
Another significant challenge was integrating our solution into HyperCycle's Computation Node software. This required not only technical expertise but also a deep understanding of HyperCycle's architecture and APIs.