Inspiration

When we go out to take pictures, we often end up with unsatisfactory photos for various reasons. The poses might not be natural enough, or we might blink, or the camera angle and overall effect of the photos might be far from ideal.

So I started thinking: could we somehow improve our photos and capture better poses, so that we don't waste a trip because of bad photos and fail to properly preserve our precious memories?

Beyond that, we insisted on authenticity throughout this project. PoseShift is not a face-swapping tool, because face swaps produce images with no genuine connection to the person. Instead, we preserve the user's original scene, real face, and clothing. Simply replacing a face with that of a beautiful model creates a very different feeling; even if the resulting image looks good, it isn't truly you.

As a solo developer with no formal programming background, I wanted to build something that combines AI creativity with production-grade reliability. The challenge of monitoring LLM applications in serverless environments inspired me to explore Datadog's observability capabilities.

What it does

PoseShift transforms any photo to match a pose reference using Google Gemini's multimodal AI capabilities:

  • Upload your photo + a pose template
  • AI analyzes the pose structure (using Gemini 2.5 Flash)
  • AI generates a new image with your likeness in the target pose (using Gemini 3.0 Pro)

How we built it

AI Pipeline:

  • Gemini 2.5 Flash: Analyzes pose images and generates structured JSON keypoints
  • Gemini 3.0 Pro: Generates the final transformed image
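
The handoff between the two models hinges on the structured JSON keypoints from the analysis step. A minimal sketch of how that output might be parsed defensively (the keypoint schema and function name are our assumptions, not PoseShift's actual code; Gemini often wraps JSON in a markdown fence):

```javascript
// Sketch only: extract and validate keypoint JSON from a model text response.
// Assumed schema: { keypoints: [{ name, x, y, confidence }] }
function parseKeypoints(modelText) {
  // Strip an optional ```json ... ``` markdown fence around the payload.
  const match = modelText.match(/```(?:json)?\s*([\s\S]*?)```/);
  const raw = match ? match[1] : modelText;

  let data;
  try {
    data = JSON.parse(raw);
  } catch {
    throw new Error('pose analysis did not return valid JSON');
  }

  if (!Array.isArray(data.keypoints)) {
    throw new Error('pose analysis response is missing a keypoints array');
  }

  // Keep only keypoints with numeric coordinates.
  return data.keypoints.filter(
    (k) => typeof k.x === 'number' && typeof k.y === 'number'
  );
}
```

Validating at this boundary keeps a malformed analysis response from silently producing a bad prompt for the image-generation step.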

Backend:

  • Firebase Cloud Functions (Node.js 20, Serverless)
  • Secure API key management via Firebase Secret Manager
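
For context, wiring a secret into a v2 Cloud Function looks roughly like this (`defineSecret` is the firebase-functions v2 params API; the function and secret names are illustrative):

```javascript
// functions/index.js — hypothetical function name; sketch only
const { onRequest } = require('firebase-functions/v2/https');
const { defineSecret } = require('firebase-functions/params');

// Declares a dependency on the GEMINI_API_KEY secret in Secret Manager.
const GEMINI_API_KEY = defineSecret('GEMINI_API_KEY');

exports.generatePose = onRequest({ secrets: [GEMINI_API_KEY] }, async (req, res) => {
  const apiKey = GEMINI_API_KEY.value(); // only readable at runtime
  // ... call the Gemini API with apiKey ...
  res.json({ ok: true });
});
```

Keeping the key in Secret Manager means it never appears in source control or plain environment config.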

Observability:

  • dd-trace: Full APM tracing with custom LLM spans
  • Direct API fallback: Ensures metrics are sent even in serverless cold-start scenarios
  • Dual-channel reporting: Combines dd-trace spans with direct HTTP API calls
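
The dual-channel fallback can be sketched roughly like this. The v2 series endpoint and payload shape are Datadog's documented metrics intake; the metric names, tags, and DD_API_KEY environment variable are illustrative assumptions:

```javascript
// Build a Datadog v2 metrics-intake payload for a single gauge point.
function buildSeriesPayload(metric, value, tags = []) {
  return {
    series: [{
      metric,
      type: 3, // gauge, per the Datadog v2 series type enum
      points: [{ timestamp: Math.floor(Date.now() / 1000), value }],
      tags,
    }],
  };
}

// Fire the metric over HTTPS before the function returns, so it survives
// the post-response freeze even if the dd-trace flush does not.
// Uses the global fetch available in Node.js 18+.
async function sendMetric(metric, value, tags) {
  const res = await fetch('https://api.datadoghq.com/api/v2/series', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'DD-API-KEY': process.env.DD_API_KEY,
    },
    body: JSON.stringify(buildSeriesPayload(metric, value, tags)),
  });
  if (!res.ok) console.warn(`Datadog intake returned ${res.status}`);
}
```

dd-trace still provides the rich spans; this direct channel only carries the handful of metrics that must not be lost.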

Observability Strategy

We implemented a comprehensive monitoring approach:

  1. APM Tracing: Parent span for the entire generation + child spans for each AI step
  2. Custom LLM Metrics: Latency, success rate, error tracking
  3. Detection Rules: 3 monitors for error rate, latency spikes, and availability
  4. Incident Management: Automated actionable items when rules trigger
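
As a rough illustration, the three detection rules could be expressed as Datadog monitor queries along these lines (all metric names below are hypothetical placeholders, not PoseShift's actual metrics):

```
# Error rate above 5% over 10 minutes
sum(last_10m):sum:poseshift.generation.errors{*}.as_count() / sum:poseshift.generation.requests{*}.as_count() > 0.05

# Latency spike: average generation latency above 30 s
avg(last_10m):avg:poseshift.generation.latency_ms{*} > 30000

# Availability: fewer than 1 successful generation in 30 minutes
sum(last_30m):sum:poseshift.generation.success{*}.as_count() < 1
```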

Challenges we ran into

  1. Serverless Flush Constraints: Firebase Cloud Functions freeze execution immediately after the response is returned, so buffered telemetry can be lost. We solved this with a dual-channel approach: dd-trace for detailed spans, plus direct API calls for critical metrics.

  2. Cold Start Reliability: Ensuring Datadog tracer initializes before any other imports was crucial for consistent tracing.

  3. LLM Observability in Production: Correlating user requests across two separate AI model calls required careful span parenting and trace ID propagation.
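
The cold-start fix in point 2 comes down to require ordering in the entrypoint; a minimal sketch:

```javascript
// functions/index.js — dd-trace must be initialized before anything else is
// required, otherwise modules loaded earlier are not instrumented.
const tracer = require('dd-trace').init();

// Only after the tracer is live do we load the rest of the app.
const { onRequest } = require('firebase-functions/v2/https');
```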

Accomplishments that we're proud of

  • Built a production-grade LLM application with zero prior programming experience
  • Achieved end-to-end observability in a serverless environment
  • Created a working AI product that real users can experience

What we learned

I have to say, I'm incredibly proud of completing this entire project for the first time, especially because I had absolutely no prior experience with either front-end or back-end development.

I used Gemini's code generation to build this project, and it is truly impressive. Combining it with Datadog also showed me that even with the advent of AI, there are still excellent tools for monitoring and improving overall service quality.

Key learnings:

  • How to implement production-grade observability for LLM applications
  • The importance of fallback mechanisms in serverless environments
  • How to structure AI pipelines for maintainability and monitoring

What's next for PoseShift AI

  • Enhanced pose detection accuracy
  • Cost tracking per generation
  • User experience metrics via RUM integration
