Adoptify

Home Screen
Features
Video generation via Gemini or OpenAI
Sign In/Sign Up using Auth0
Campaign Video generated

Inspiration

We were moved by a heartbreaking reality: 920,000 shelter animals are euthanized annually, not because they aren't lovable, but because their stories were never told. We witnessed shelters spending 15+ hours weekly on social media with minimal engagement while amazing pets got overlooked. When we learned that 70% of shelters report marketing as their biggest challenge, we knew AI could bridge this gap and save lives through better storytelling.

What it does

Adoptify automates the end-to-end creation of social media–ready adoption campaigns by turning pet photos into personalized, narrated videos—each told from the pet’s own point of view.

Input: Users upload a pet’s photo, name, and a short bio through the web interface or CLI.
Computer Vision Analysis: Analyzes the pet’s photo to infer traits, emotions, and personality cues (e.g., “playful,” “gentle,” “adventurous”).
Story Generation: Uses advanced language models (e.g., OpenAI GPT-4o-mini and Gemini) to craft a warm, engaging adoption story from the pet’s perspective.
Video Synthesis: Employs a generative video model (such as OpenAI Sora) to create a short, vertical MP4 video that visually and narratively represents the pet’s story.
Voice Narration: Adds a natural-sounding, AI-generated voiceover to bring the pet’s story to life.
Content Optimization: Generates platform-specific captions optimized for Instagram, TikTok, and Facebook. Produces SEO-friendly “Smart Hashtags” to boost discoverability and engagement. Ensures vertical video formatting and aspect ratio compliance for social media platforms.
Output: The web UI displays a preview of the generated video, captions, and hashtags. Users can download or share the completed adoption campaign instantly.

How we built it

We built Adoptify as a full-stack AI application designed for rapid prototyping, scalability, and creative experimentation, all within a weekend-friendly architecture. The system integrates multiple AI services for computer vision, storytelling, voice synthesis, and video generation, unified through a modern web stack.

Architecture Overview

Adoptify consists of two main layers: a React frontend for user interaction and an Express.js backend that orchestrates AI workflows.

Frontend (React + Vite + Tailwind):
Built with React, Vite, and TypeScript for speed and modularity.
Uses Tailwind CSS for utility-first styling and shadcn/ui components for clean, accessible UI primitives.
Core user experience lives in Composer.tsx, where shelter staff upload a photo, enter the pet’s name and bio, and preview the generated campaign.
@tanstack/react-query manages asynchronous state (AI polling, API responses), while sonner provides non-blocking toast notifications for UX feedback.
Fully responsive layout optimized for desktop and mobile previewing of vertical video content.
Backend (Express.js + Node.js):
A lightweight Express server (index.mjs) acts as a smart proxy between the frontend and multiple AI services.
Endpoints:
POST /api/generate-video – Submits pet data to the OpenAI Video API (default: sora-2), triggers story generation with a text model (gpt-4o-mini), and polls until completion.
GET /api/video/:id/content – Streams the generated MP4 video directly to the browser.
Validation: Requests are validated with zod to ensure correct schema and data types.
Environment-driven config: Managed via .env for keys like OPENAI_API_KEY, OPENAI_VIDEO_MODEL, and OPENAI_TEXT_MODEL.
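
The validation and configuration items above can be sketched roughly as follows. The project validates requests with zod; to stay dependency-free, this sketch swaps in a hand-written check instead. The field names (name, bio, photoBase64), the GenerateVideoRequest type, and the loadConfig helper are assumptions made for illustration, not the actual contents of index.mjs. The default model names are the ones listed for the endpoints above.

```typescript
// Hypothetical request shape for POST /api/generate-video (field names are assumed).
interface GenerateVideoRequest {
  name: string;
  bio: string;
  photoBase64: string;
}

type ValidationResult =
  | { ok: true; data: GenerateVideoRequest }
  | { ok: false; errors: string[] };

// Hand-written stand-in for a zod schema: every field must be a non-empty string.
function validateGenerateVideoRequest(body: unknown): ValidationResult {
  if (typeof body !== "object" || body === null) {
    return { ok: false, errors: ["body must be a JSON object"] };
  }
  const record = body as Record<string, unknown>;
  const errors: string[] = [];
  for (const field of ["name", "bio", "photoBase64"] as const) {
    const value = record[field];
    if (typeof value !== "string" || value.trim() === "") {
      errors.push(`${field} must be a non-empty string`);
    }
  }
  if (errors.length > 0) return { ok: false, errors };
  return {
    ok: true,
    data: {
      name: record["name"] as string,
      bio: record["bio"] as string,
      photoBase64: record["photoBase64"] as string,
    },
  };
}

interface AppConfig {
  apiKey: string;
  videoModel: string;
  textModel: string;
}

// Reads settings from an env-like record and fails fast when the API key is missing.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const apiKey = env["OPENAI_API_KEY"];
  if (!apiKey) throw new Error("OPENAI_API_KEY is not set");
  return {
    apiKey,
    videoModel: env["OPENAI_VIDEO_MODEL"] ?? "sora-2",
    textModel: env["OPENAI_TEXT_MODEL"] ?? "gpt-4o-mini",
  };
}
```

In a Node server, loadConfig would typically be called once at startup with process.env, so a missing key surfaces before the first request rather than in the middle of a generation job.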

Challenges we ran into

Handling Asynchronous Video Generation & Streaming: Video synthesis through OpenAI’s Sora model is not instantaneous; generation can take up to a few minutes. The backend had to submit jobs, poll status endpoints, and stream the final MP4 back to the frontend without blocking. Tuning polling intervals, timeouts, and user feedback (like “Your video is being created…”) to balance responsiveness against resource efficiency was tricky. We implemented a stream-safe proxy layer to handle large binary assets without memory spikes, ensuring smooth delivery to browsers.
Error Handling, Rate Limits & API Constraints: Integrating multiple third-party APIs brought a variety of operational challenges. Common issues included invalid API keys, billing limits, and organization verification errors (as seen in the README troubleshooting section). Some models occasionally returned malformed or incomplete outputs, so we added error recovery, fallback logic, and structured validation to keep the pipeline running. When Sora or text endpoints were temporarily unavailable, the app automatically switched to fallback content generation heuristics.
Keeping the User Experience Simple and Friendly: Because many shelters and rescues are run by volunteers, we needed to make the interface intuitive and non-technical. The UI had to clearly show progress states (“analyzing photo,” “generating story,” “rendering video”) without overwhelming users.
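
The submit-and-poll loop from the first challenge above can be sketched as follows. This is a minimal illustration, not the project's actual code: the status names, the PollOptions shape, and the pollUntilDone helper are assumptions, and the status check is passed in as a function so the sketch does not depend on any particular API client.

```typescript
// Assumed job states; the real API may use different names.
type JobStatus = "queued" | "in_progress" | "completed" | "failed";

interface PollOptions {
  intervalMs: number; // wait between status checks
  timeoutMs: number; // give up once this much time has passed
  onProgress?: (status: JobStatus) => void; // hook for user-facing progress messages
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
  now?: () => number; // injectable clock
}

// Polls getStatus until the job reaches a terminal state or the timeout is hit.
async function pollUntilDone(
  getStatus: () => Promise<JobStatus>,
  options: PollOptions,
): Promise<JobStatus> {
  const sleep =
    options.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  const now = options.now ?? (() => Date.now());
  const deadline = now() + options.timeoutMs;

  while (true) {
    const status = await getStatus();
    if (options.onProgress) options.onProgress(status);
    if (status === "completed" || status === "failed") return status;
    // Stop early if the next check would land past the deadline.
    if (now() + options.intervalMs > deadline) {
      throw new Error("Timed out waiting for video generation");
    }
    await sleep(options.intervalMs);
  }
}
```

A longer interval reduces load on the upstream API at the cost of a slower "done" signal for the user, which is the trade-off described above.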

Accomplishments that we're proud of

We’re proud to have built a complete, production-grade pipeline connecting frontend, backend, and multiple AI systems seamlessly: Frontend → Backend → AI → Video → Stream → Download, all fully automated. The video generation, caption creation, and social pack export all work in one click.

Our technical achievements go beyond the surface:
Structured social pack generation using JSON Schema enforcement and fallback heuristics to ensure consistent captions and hashtags.
Efficient streaming proxy (GET /api/video/:id/content) for large video files, memory-safe and optimized for Cloudflare delivery.
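
A memory-safe streaming proxy forwards each chunk as it arrives instead of buffering the whole file. The sketch below shows that idea in isolation, using an async iterable as the upstream source and a minimal sink interface. ChunkSink and forwardChunks are hypothetical names; the real server presumably pipes the upstream response body into the Express response rather than using these types.

```typescript
// Minimal sink abstraction (hypothetical); an HTTP response would play this role.
interface ChunkSink {
  write(chunk: Uint8Array): void | Promise<void>;
  end(): void;
}

// Forwards chunks one at a time, so this function never holds the full video in memory.
async function forwardChunks(
  source: AsyncIterable<Uint8Array>,
  sink: ChunkSink,
): Promise<number> {
  let totalBytes = 0;
  try {
    for await (const chunk of source) {
      // Awaiting the write lets a slow consumer pace the producer.
      await sink.write(chunk);
      totalBytes += chunk.byteLength;
    }
  } finally {
    // Always close the sink, even if the upstream fails mid-stream.
    sink.end();
  }
  return totalBytes;
}
```

The function returns the number of bytes forwarded, which can be handy for logging or for checking that a download completed.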

What we learned

External AI services can be slow, flaky, or inconsistent, so we learned to validate every response, never assume perfect output, parse defensively, handle partial or malformed data, and provide fallbacks and graceful degradation instead of hard failures.

Using JSON Schema validation for AI-generated responses helped standardize captions and hashtags, but we also learned that even structured responses can fail. We built tolerant parsing and heuristic recovery systems to ensure that the app always returned useful content no matter what the model produced.
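
The tolerant parsing and heuristic recovery described above might look something like this. It is a sketch under assumptions: the SocialPack shape, the function names, and the fallback wording are invented for illustration, and the real project may recover differently.

```typescript
// Assumed shape of the structured output (captions and hashtags).
interface SocialPack {
  caption: string;
  hashtags: string[];
}

// Heuristic fallback used when nothing parseable comes back (wording is illustrative).
function fallbackSocialPack(petName: string): SocialPack {
  return {
    caption: `Meet ${petName}, who is looking for a forever home!`,
    hashtags: ["#AdoptDontShop", "#ShelterPets"],
  };
}

// Structural check: a string caption and an array of string hashtags.
function isSocialPack(value: unknown): value is SocialPack {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  const hashtags = record["hashtags"];
  return (
    typeof record["caption"] === "string" &&
    Array.isArray(hashtags) &&
    hashtags.every((tag: unknown) => typeof tag === "string")
  );
}

// Tolerant parse: try the raw text, then the outermost {...} span, then fall back.
function parseSocialPack(raw: string, petName: string): SocialPack {
  const candidates = [raw];
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start !== -1 && end > start) candidates.push(raw.slice(start, end + 1));

  for (const candidate of candidates) {
    try {
      const parsed: unknown = JSON.parse(candidate);
      if (isSocialPack(parsed)) return parsed;
    } catch {
      // Not valid JSON; move on to the next candidate.
    }
  }
  return fallbackSocialPack(petName);
}
```

Because the function always returns a SocialPack, callers never have to handle a parse failure themselves; the cost is that a fallback result is indistinguishable from a model-written one unless the caller logs it.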

What's next for Adoptify

Short-Term:

Expanding beyond pet adoption to generalize Adoptify’s storytelling engine for broader marketing use cases—small businesses, nonprofits, and creators.
Richer Inputs: Support multiple images, short video clips, and brand prompts to craft more diverse, dynamic stories.
Voice & Language Options: Introduce customizable voiceovers (tone, gender, accent) and multilingual support for global campaigns.
Improved Reliability: Implement retry/backoff logic, real-time progress updates, detailed error reporting, and robust monitoring for higher platform stability.
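
The retry/backoff item could take roughly the following shape. This is only a sketch of one common approach (exponential backoff with a cap); the function names, attempt count, and delay values are assumptions and are not part of the current codebase.

```typescript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Runs `task`, retrying on failure with growing delays; rethrows the last error.
async function withRetry<T>(
  task: () => Promise<T>,
  maxAttempts = 4,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      // No need to wait after the final failed attempt.
      if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt));
    }
  }
  throw lastError;
}
```

Production setups often add random jitter to each delay so that many clients do not retry in lockstep; it is left out here to keep the delays easy to follow.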

Medium-Term:

Building automation and data-driven insights to deliver long-term value for creators, shelters, and organizations.
User Accounts & Campaign Management: Enable users to log in, manage, and revisit past campaigns in a unified dashboard.
Social Scheduling: Integrate direct publishing and scheduling for Instagram, TikTok, Facebook, and LinkedIn.
Analytics Dashboard: Visualize metrics like engagement, reach, and conversions, with AI-driven insights for optimization.
Content Moderation & Brand Safety: Add AI-powered filters for uploaded media and text to maintain tone consistency and appropriateness across campaigns.

Long-Term:

Evolving Adoptify into a full AI-powered storytelling platform with global reach and adaptive intelligence.
Multimodal Templating: Enable A/B testing, tone presets, and customizable brand templates for scalable storytelling.
Predictive Marketing AI: Develop models that anticipate audience preferences and automatically tailor visuals, captions, and delivery timing for maximum engagement.
CRM & Marketing Integrations: Seamless sync with major CRMs, ad managers, and marketing automation tools for one-click campaign deployment.
Global Storytelling Ecosystem: Create a community of shelters, brands, creators, and advocates sharing impactful stories—proving that emotional storytelling can drive meaningful change at scale.