Inspiration
In a world dominated by internet culture, memes have become one of the most effective tools of communication. But creating a timely, witty, and relevant meme still requires human effort and creativity. I wanted to explore what would happen if we gave that job entirely to AI: no human curation, just agents working together to create and publish humor in real time. The Meme Machine was born out of this curiosity: could AI not only understand what's trending, but also turn that into something funny and safe to share?
What it does
The Meme Machine is a fully autonomous multi-agent system that:
- Reviews the user-submitted idea to ensure the meme is appropriate and does not contain unethical or offensive content.
- Searches Imgflip for a relevant meme template.
- Uses Google's Gemini API to generate a funny caption based on the user-submitted meme idea.
- Composes the meme by combining the meme template and the generated caption.
- Publishes the meme to Reddit.
How we built it
I used Google's Agent Development Kit (ADK) to build a multi-agent workflow in which each agent handles a single responsibility. The overall workflow is orchestrated by a combination of ParallelAgent and SequentialAgent that ties everything together. The agents and their functions are:
- Content Moderator Agent evaluates the user-submitted idea to ensure it is free of hate speech, violence, racism, blasphemy, or any other harmful content, confirming its suitability for meme creation.
- Template Scout Agent uses the all-MiniLM-L6-v2 model to generate an embedding of the idea and performs a similarity search via the Imgflip API to identify the most appropriate meme template.
- Caption Generator Agent interprets the user’s idea and crafts a short, humorous, and imaginative caption.
- Meme Composer Agent merges the caption with the selected template by determining an optimal placement on the image using image processing techniques.
- Meme Publisher Agent shares the finalized meme on Reddit.
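The Template Scout's matching step can be illustrated with a minimal sketch. It assumes each candidate template already has an embedding (for example, of its name, produced by the same all-MiniLM-L6-v2 model) and picks the one with the highest cosine similarity to the idea. The function names, the template list, and the toy 3-dimensional vectors are illustrative assumptions, not the project's actual code; real all-MiniLM-L6-v2 embeddings have 384 dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def best_template(idea_vec, template_vecs):
    """Return the name of the template whose embedding is closest to the idea."""
    return max(
        template_vecs,
        key=lambda name: cosine_similarity(idea_vec, template_vecs[name]),
    )

# Toy vectors standing in for real sentence embeddings (assumption).
TEMPLATES = {
    "Drake Hotline Bling": [0.9, 0.1, 0.0],
    "Distracted Boyfriend": [0.1, 0.9, 0.1],
    "Two Buttons": [0.0, 0.2, 0.9],
}

idea = [0.8, 0.2, 0.1]  # hypothetical embedding of the user's idea
print(best_template(idea, TEMPLATES))
```

In a real run, the vectors would presumably come from something like `SentenceTransformer("all-MiniLM-L6-v2").encode(text)` in the sentence-transformers library, with the template list fetched from Imgflip.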
The Template Scout and Caption Generator agents run in parallel. The overall pipeline is sequential: the Content Moderator runs first, then the parallel pair, then the Meme Composer, and finally the Meme Publisher.
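The control flow described here can be sketched with plain asyncio. This is not ADK code: it only mirrors the shape that SequentialAgent and ParallelAgent give the pipeline, with a shared state dictionary passed between steps. Every function body is a hypothetical stand-in (the real agents call Gemini, Imgflip, OpenCV, and Reddit), and stopping early on a rejected idea is an assumption about how moderation gates the rest of the run.

```python
import asyncio

async def moderate(state):
    # Stand-in check; the real moderator is an LLM-backed agent.
    banned = {"hate", "violence"}
    idea = state["idea"].lower()
    state["approved"] = not any(word in idea for word in banned)

async def scout_template(state):
    state["template"] = "placeholder-template"

async def generate_caption(state):
    state["caption"] = f"When {state['idea']}"

async def compose(state):
    state["meme"] = f"{state['template']} + '{state['caption']}'"

async def publish(state):
    state["published"] = True

async def run_pipeline(idea):
    state = {"idea": idea}
    await moderate(state)                       # sequential step 1
    if not state["approved"]:
        return state                            # assumed: rejected ideas stop here
    await asyncio.gather(                       # parallel pair
        scout_template(state),
        generate_caption(state),
    )
    await compose(state)                        # sequential step 3
    await publish(state)                        # sequential step 4
    return state

result = asyncio.run(run_pipeline("the build passes on the first try"))
print(result["meme"])
```

The two parallel steps write to different keys of the shared state, so they do not interfere with each other; the composer then reads both keys once they are available.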
This multi-agent system was developed using:
- FastAPI for serving the multi-agent system.
- all-MiniLM-L6-v2 model for sentence embedding.
- OpenCV for image processing.
- praw for posting memes to Reddit.
Challenges we ran into
- ADK Orchestration: Coordinating state between agents in a sequential pipeline using ADK required careful state management.
- Content Moderation: Reviewing the user-submitted idea to ensure ethical use of AI for content generation.
- Deployment: Ensuring the service ran smoothly took several iterations.
Accomplishments that we're proud of
- Developed a fully working, end-to-end meme generation and publishing system from scratch.
- Successfully integrated five distinct AI agents into a unified pipeline.
- Learned and implemented Google's brand-new ADK framework in a real-world use case.
What we learned
- How to design a modular, agent-based architecture using ADK.
- Best practices for API-based content generation and moderation.
- How to process and combine data in a creative application.
- Deployment strategies on cloud platforms.
What's next for The Meme Machine
- Instagram & Twitter Support: Extend publishing capabilities beyond Reddit.
- Fine-tuned humor model: Incorporate reinforcement learning from human feedback to mimic human-like humor.
- Content Generation & Management for Content Creators: Develop an integrated platform that suggests fresh content ideas, generates structured production roadmaps (based on creator input or brand-specific promotional requirements), creates the content automatically, gives clear guidance for capturing photos or videos when needed, assembles the final content, and publishes it on a set schedule.