Inspiration
The inspiration for ReScene came from a clear gap in AI photography today. On one end, basic filter apps are easy to use but lack real creativity. On the other end, professional AI tools require complex prompt engineering, which most users find intimidating.
We asked a simple question: what if everyone had a world-class Photography Director in their pocket?
With ReScene, you simply upload a photo. The AI Agent analyzes the scene to identify the location and recommends the season, weather, or time of day that would make it look its best. If you want something different, you can just chat with it.
What it does
ReScene is an AI Photography Director Agent. It understands the scene, imagines the best version of it, proposes creative options, and executes the transformation. Here's how it works:
• Upload a photo: ReScene analyzes the scene and identifies the location.
• Discover the best moment: it recommends the season, weather, or lighting that would make the photo look its best.
• Transform instantly: generate a cinematic version of the scene with one tap.
• Refine by chatting: describe the vibe you want, and the Agent will adjust the scene for you.
How we built it
We architected ReScene using a "Left Brain / Right Brain" methodology:
- The Frontend (iOS/SwiftUI): Built natively with Swift 5.10 and iOS 17's @Observable macro. We implemented immersive UI components like .ultraThinMaterial chat bubbles, dynamic loading states, and a custom GeometryReader-powered Before/After slider.
- The Backend (Node.js/Fastify): A stateless, heavily decoupled serverless architecture deployed on AWS App Runner. We utilized the Dependency Injection (DI) pattern to easily swap AI service implementations.
- The Left Brain (Amazon Nova Lite): Handles the cognitive load. It powers our /api/chat endpoint, maintaining conversational context and utilizing strict Tool Use (Function Calling) via Amazon Bedrock to intelligently decide when to chat and when to output a structured JSON rendering blueprint.
- The Right Brain (Amazon Titan Image Generator v2): Handles the pixels. It receives the highly technical, Agent-crafted prompt via our /api/render endpoint to perform high-fidelity image-to-image inpainting and outpainting.
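The Dependency Injection setup mentioned above can be sketched roughly as follows. This is a minimal illustration, not ReScene's actual code: `RenderBlueprint`, `ImageRenderer`, `StubRenderer`, and `RenderService` are hypothetical names, and the stub stands in for whatever Bedrock-backed implementation the real backend uses.

```typescript
// Hypothetical shape of the Agent-crafted rendering request (assumption).
interface RenderBlueprint {
  prompt: string;
  imageKey: string;
}

// The abstraction the route handler depends on. Any model-backed
// implementation (Titan or otherwise) only needs to satisfy this interface.
interface ImageRenderer {
  render(blueprint: RenderBlueprint): Promise<string>;
}

// Stand-in implementation for tests and local development.
// It performs no model call and only derives an output key.
class StubRenderer implements ImageRenderer {
  async render(blueprint: RenderBlueprint): Promise<string> {
    return `rendered/${blueprint.imageKey}`;
  }
}

// The service receives its renderer from the outside instead of
// constructing one, which is what makes implementations swappable.
class RenderService {
  private readonly renderer: ImageRenderer;

  constructor(renderer: ImageRenderer) {
    this.renderer = renderer;
  }

  async handle(blueprint: RenderBlueprint): Promise<{ resultKey: string }> {
    if (blueprint.prompt.trim() === "") {
      throw new Error("prompt must not be empty");
    }
    const resultKey = await this.renderer.render(blueprint);
    return { resultKey };
  }
}
```

Swapping models then means passing a different `ImageRenderer` to `new RenderService(...)`, with no change to the handler logic.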
Challenges we ran into
Our biggest hurdle was state management and memory pressure (OOM) on stateless compute. Initially, we passed Base64-encoded images directly in the JSON payloads between the iOS client, our App Runner backend, and the Amazon Bedrock API. Base64 maps every 3 bytes of binary input to 4 output characters, so the encoded size $S_{base64}$ for an original binary size $S_{binary}$ is:
$$S_{base64} = 4 \times \left\lceil \frac{S_{binary}}{3} \right\rceil$$
This roughly 33% payload bloat caused massive network latency and risked crashing our stateless App Runner instances during concurrent requests.
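To make the formula concrete, here is a small sketch that evaluates it. The function names are illustrative, and the calculation assumes standard padded Base64 with no line breaks.

```typescript
// Encoded length in characters for a given number of binary bytes:
// every group of up to 3 input bytes becomes 4 output characters.
function base64EncodedSize(binaryBytes: number): number {
  if (!Number.isInteger(binaryBytes) || binaryBytes < 0) {
    throw new RangeError("binaryBytes must be a non-negative integer");
  }
  return 4 * Math.ceil(binaryBytes / 3);
}

// Ratio of encoded size to original size (approaches 4/3 for large inputs).
function base64Overhead(binaryBytes: number): number {
  if (binaryBytes === 0) {
    throw new RangeError("overhead is undefined for an empty input");
  }
  return base64EncodedSize(binaryBytes) / binaryBytes;
}

// A 3,000,000-byte photo becomes 4,000,000 Base64 characters.
const encodedPhotoSize = base64EncodedSize(3_000_000);
```

For inputs whose length is not a multiple of 3, padding rounds the result up, so very small payloads see proportionally more than 33% overhead (1 byte encodes to 4 characters).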
The Fix: We completely re-engineered the flow using Amazon S3. The app now uploads the binary image once to a temporary S3 bucket and passes a lightweight s3:// URI (or object key) to the models. The image reference in each request is now constant-size, O(1) with respect to the image's size, which unlocks blazing-fast multi-turn Agent conversations without re-uploading the image. We also implemented a 1-day lifecycle expiration rule in S3 for zero-maintenance garbage collection.
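A rough sketch of the reference-based handoff described above. The request fields, the `uploads/` key prefix, and the rule ID are assumptions made for illustration; the rule object follows the `Rules` structure that S3 lifecycle configurations use, but it is not taken from the project's actual setup.

```typescript
// Hypothetical shape of one chat turn sent to /api/chat (field names assumed).
interface ChatTurnRequest {
  sessionId: string;
  imageKey: string; // S3 object key from the one-time upload
  message: string;
}

// Builds the object key for an upload. The "uploads/" prefix is an assumption.
function makeImageKey(sessionId: string, fileName: string): string {
  return `uploads/${sessionId}/${fileName}`;
}

// Every turn carries only the key, so the request size does not depend
// on how large the photo is.
function buildChatTurn(
  sessionId: string,
  imageKey: string,
  message: string
): ChatTurnRequest {
  return { sessionId, imageKey, message };
}

// Lifecycle rule that expires objects under the prefix one day after creation.
const expireUploadsAfterOneDay = {
  Rules: [
    {
      ID: "expire-temporary-uploads", // illustrative rule name
      Status: "Enabled",
      Filter: { Prefix: "uploads/" },
      Expiration: { Days: 1 },
    },
  ],
};

const imageKey = makeImageKey("session-42", "photo.jpg");
const turn = buildChatTurn("session-42", imageKey, "Make it a foggy autumn morning");
```

Follow-up turns in the same conversation reuse `imageKey`, so only the new message text changes between requests.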
Accomplishments that we're proud of
- The "Dual-Mode" Agent Routing: Successfully coercing Amazon Nova to dynamically switch between conversational text and structured JSON "Proposal Cards" using prompt engineering and Bedrock's Tool Use capabilities.
- Zero-Friction UX: The native SwiftUI Before/After slider combined with tactile haptic feedback makes the "Aha!" moment of seeing the AI render incredibly satisfying.
- Production-Ready Architecture: Building a real CI/CD pipeline with GitHub and AWS App Runner, backed by a robust, stateless backend.
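The "Dual-Mode" routing can be pictured as a switch on the content blocks the model returns. The sketch below uses a simplified local type modeled on the `text` and `toolUse` content blocks of the Bedrock Converse API; the tool name `propose_render` and the `AgentReply` type are hypothetical.

```typescript
// Simplified local mirror of a model response's content blocks.
type ContentBlock =
  | { text: string }
  | { toolUse: { toolUseId: string; name: string; input: unknown } };

// What the backend hands to the app: plain chat or a Proposal Card payload.
type AgentReply =
  | { kind: "chat"; text: string }
  | { kind: "proposal"; blueprint: unknown };

// Hypothetical name of the tool the model calls to emit a blueprint.
const PROPOSAL_TOOL = "propose_render";

function routeReply(blocks: ContentBlock[]): AgentReply {
  // A call to the proposal tool takes priority over any accompanying text.
  for (const block of blocks) {
    if ("toolUse" in block && block.toolUse.name === PROPOSAL_TOOL) {
      return { kind: "proposal", blueprint: block.toolUse.input };
    }
  }
  // Otherwise, concatenate the text blocks into a conversational reply.
  const text = blocks
    .map((block) => ("text" in block ? block.text : ""))
    .join("")
    .trim();
  return { kind: "chat", text };
}
```

The client can then render a chat bubble for `kind: "chat"` and a Proposal Card for `kind: "proposal"` without inspecting raw model output.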
What we learned
- Prompt Engineering is a Backend Skill: Writing a prompt for a user is easy; writing a meta-prompt for an Agent to write a prompt for another model is a complex engineering challenge.
- The Power of Serverless + Storage: We learned how to elegantly bypass the limitations of stateless compute by leveraging cloud storage buckets for inter-API data handoffs.
- Structured Outputs are Mandatory: Relying on LLMs to output raw text for app logic is a recipe for crashes. Forcing JSON schemas (via Tool Use) is the only way to build reliable AI pipelines.
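As an illustration of the last point, tool output can still be checked at the boundary before app logic touches it. This is a minimal sketch with a hypothetical blueprint shape (`season`, `timeOfDay`, and `prompt` are assumed fields, not ReScene's real schema).

```typescript
// Hypothetical rendering blueprint (field names are assumptions).
interface RenderBlueprint {
  season: "spring" | "summer" | "autumn" | "winter";
  timeOfDay: string;
  prompt: string;
}

const SEASONS: string[] = ["spring", "summer", "autumn", "winter"];

// Returns a typed blueprint, or null when the model output does not match.
// Rejecting here lets the caller fall back to a chat reply instead of crashing.
function parseBlueprint(input: unknown): RenderBlueprint | null {
  if (typeof input !== "object" || input === null) {
    return null;
  }
  const candidate = input as Record<string, unknown>;
  const season = candidate["season"];
  const timeOfDay = candidate["timeOfDay"];
  const prompt = candidate["prompt"];

  if (typeof season !== "string" || SEASONS.indexOf(season) === -1) {
    return null;
  }
  if (typeof timeOfDay !== "string" || timeOfDay === "") {
    return null;
  }
  if (typeof prompt !== "string" || prompt.trim() === "") {
    return null;
  }
  return { season: season as RenderBlueprint["season"], timeOfDay, prompt };
}
```

A production backend would more likely use a schema validation library for this; the hand-written guard just keeps the example dependency-free.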
What's next for ReScene
We plan to integrate Amazon Transcribe with Amazon Nova so users can talk to their AI Director by voice while pointing their camera at a scene, generating real-time environmental remastering proposals on the fly.
Built With
- amazon-web-services
- bedrock
- ios
- nova
- swift
- swiftui