StickerSmith: A Pixel Art Generator Fusing Faces and Logos
Overview
StickerSmith combines two distinct visual inputs to generate a unified pixel-art avatar:
- Face: A selfie or a character portrait
- Symbol: A company logo, a team crest, or an abstract icon
Using the multimodal capabilities of Gemini 3 Pro Image (Nano Banana Pro), the tool fuses these two inputs into a single character. The output is a high-resolution sprite sheet arranged in a 4×4 grid, featuring 16 unique poses and expressions of the newly generated pixel avatar.
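To make the grid layout concrete, here is a minimal sketch of how a 4×4 sheet could be split into its 16 cells. The function name `sliceSpriteSheet`, the `SpriteCell` type, and the 1024×1024 sheet size in the usage note are illustrative assumptions, not part of the StickerSmith codebase.

```typescript
// Rectangle describing one sprite cell inside the sheet, in pixels.
// (Hypothetical type, introduced only for this illustration.)
interface SpriteCell {
  index: number; // 0..(rows * cols - 1), row-major order
  x: number;
  y: number;
  width: number;
  height: number;
}

// Split a sheet into rows × cols equally sized cells (defaults to 4×4).
// Assumes the sheet dimensions divide evenly into the grid; real model
// output may need cropping or rounding before this step.
function sliceSpriteSheet(
  sheetWidth: number,
  sheetHeight: number,
  rows = 4,
  cols = 4,
): SpriteCell[] {
  if (sheetWidth % cols !== 0 || sheetHeight % rows !== 0) {
    throw new Error("Sheet dimensions must divide evenly into the grid");
  }
  const width = sheetWidth / cols;
  const height = sheetHeight / rows;
  const cells: SpriteCell[] = [];
  for (let row = 0; row < rows; row++) {
    for (let col = 0; col < cols; col++) {
      cells.push({
        index: row * cols + col,
        x: col * width,
        y: row * height,
        width,
        height,
      });
    }
  }
  return cells;
}
```

For example, `sliceSpriteSheet(1024, 1024)` returns 16 cells of 256×256 pixels each, which could then be drawn individually onto a canvas.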
Technical Implementation
StickerSmith was built using a modern, lightweight frontend stack powered by Google’s latest generative models.
- Frontend: React (TypeScript) with Vite for a fast development cycle
- Styling: Tailwind CSS to achieve a clean, dark-mode "developer-aesthetic" UI
- AI Engine: Integrated via the @google/genai SDK to communicate with the Gemini API
- Model: gemini-3-pro-image-preview (Nano Banana Pro)
While Flash models are faster, the Pro model was essential for capturing the nuance of fusion: blending the color palette and shape language of a logo with a face without losing the identity of either.
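As a rough sketch of how the two inputs might be packaged for the model, the helper below assembles a request with two inline images and a text prompt. The names `buildFusionRequest`, `ImageInput`, and `Part` are hypothetical and defined locally so the snippet stands alone; the part shapes follow the Gemini API's text and inline-data convention, and the actual request StickerSmith sends may differ.

```typescript
// Local stand-ins for the request shapes; defined here so the sketch
// needs no imports. These are assumptions, not the SDK's own types.
type Part =
  | { text: string }
  | { inlineData: { mimeType: string; data: string } };

interface ImageInput {
  mimeType: string; // e.g. "image/png"
  base64: string; // image bytes, base64-encoded without a data: prefix
}

const MODEL_ID = "gemini-3-pro-image-preview";

// Build a single user turn: face image, symbol image, then instructions.
function buildFusionRequest(face: ImageInput, symbol: ImageInput, prompt: string) {
  const parts: Part[] = [
    { inlineData: { mimeType: face.mimeType, data: face.base64 } },
    { inlineData: { mimeType: symbol.mimeType, data: symbol.base64 } },
    { text: prompt },
  ];
  return { model: MODEL_ID, contents: [{ role: "user", parts }] };
}
```

With the @google/genai SDK, an object of this shape would typically be passed to `generateContent` on a client's `models` namespace, and the generated image read back from the inline data in the response parts.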
Challenges Encountered
Prompt Adherence
Early iterations often produced:
- A single large image instead of a grid
- Randomly scattered sprites
We refined the prompt until it strictly enforced the "4 rows × 4 columns" constraint.
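The snippet below illustrates one way such a layout constraint could be spelled out in a prompt. The wording and the `buildSpriteSheetPrompt` helper are hypothetical; this is not the prompt StickerSmith actually uses, only a sketch of stating the grid explicitly and ruling out the failure modes listed above.

```typescript
// Build a prompt that states the grid layout explicitly and names the
// failure modes to avoid. The wording is an illustrative assumption.
function buildSpriteSheetPrompt(rows = 4, cols = 4): string {
  const total = rows * cols;
  return [
    "Create a pixel-art sprite sheet of ONE character that fuses the face " +
      "from the first image with the symbol from the second image.",
    `Layout: exactly ${rows} rows × ${cols} columns (${total} cells), ` +
      "evenly sized and aligned to a grid.",
    "Each cell shows the same character in a different pose or expression.",
    "Do not produce a single large image, and do not scatter sprites randomly.",
  ].join("\n");
}
```

Keeping the row and column counts as parameters makes it easy to experiment with other layouts while keeping the cell count in the prompt consistent.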
API Key Management
To allow users to bring their own paid keys for the Pro model, we implemented a secure window.aistudio key selection flow. This required careful UX design to ensure a seamless and secure experience.
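A minimal sketch of what such a key selection check might look like follows. The `AiStudioBridge` interface and `ensureApiKeySelected` function are hypothetical, and the method names `hasSelectedApiKey` and `openSelectKey` are assumptions about what `window.aistudio` exposes; verify them against the AI Studio documentation before relying on them.

```typescript
// Assumed shape of the object exposed as window.aistudio. The method
// names are assumptions for this sketch, not a confirmed API surface.
interface AiStudioBridge {
  hasSelectedApiKey(): Promise<boolean>;
  openSelectKey(): Promise<void>;
}

// Returns true once a key is selected. If none is selected yet, the
// selection dialog is opened once and the state is checked again.
async function ensureApiKeySelected(
  bridge: AiStudioBridge | undefined,
): Promise<boolean> {
  if (!bridge) {
    return false; // not running inside an environment that provides the bridge
  }
  if (await bridge.hasSelectedApiKey()) {
    return true;
  }
  await bridge.openSelectKey();
  return bridge.hasSelectedApiKey();
}
```

In the browser, the bridge argument would be the `window.aistudio` object. Passing it in as a parameter keeps the function testable with a mock and lets the UI show a fallback message when the bridge is absent.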
Key Takeaways
Gemini 3 Pro’s Instruction Following
The difference between Flash and Pro models in adhering to complex spatial instructions (e.g., generating a 4×4 grid) is significant. The Pro model demonstrates a far superior understanding of layout.
Prompting for Sprite Sheets
Using the term "Sprite Sheet" acts as a trigger for the model, ensuring:
- Consistent character design
- Structured multi-variation output
The Joy of Fusion
AI excels at combinatorial creativity: it can find a visual middle ground between two distinct concepts, something that is difficult for humans to visualize instantly.
Built With
- nanobanana
- react
- typescript