Inspiration

This project was inspired by the growing need for high-quality image enhancement tools that are accessible and user-friendly, similar to how media servers like Jellyfin revolutionized video management. We wanted to create a desktop application that combines powerful AI upscaling technology with an intuitive interface, allowing users to easily improve their image collections without complex command-line tools.
What it does

Pixel Perfect is a cross-platform desktop application that provides AI-powered image upscaling and fine-tuning capabilities. Users can upload images through a Jellyfin-inspired dark UI, choose between general upscaling using RealESRGAN or custom fine-tuned models, and even train their own chroma refinement models on personal image datasets. The app includes features like automatic folder watching for batch processing, user authentication, and seamless integration with a Dockerized FastAPI backend.
How we built it

The project consists of three main components:
Desktop App: Built with Python using CustomTkinter for a modern, dark-themed UI that mimics Jellyfin's aesthetic. It handles user authentication (SQLite database), file uploads, and communicates with the backend via HTTP requests.
Backend API: A FastAPI application containerized with Docker, featuring PyTorch-based AI models. It includes RealESRGAN for 4x upscaling and a custom ChromaRefiner CNN that enhances color accuracy by refining chroma channels in YUV color space.
Fine-tuning Pipeline: A PyTorch training script that allows users to fine-tune the chroma refinement model on their own images, improving results for specific domains or styles.
The system uses tiled inference to handle large images efficiently, supports both CPU and GPU processing, and is designed to run cross-platform on macOS, Windows, and Linux.
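The tiling idea can be illustrated with a minimal sketch. It is not the project's actual code: the function names (`tile_starts`, `tile_boxes`, `scale_box`) and the default tile size and overlap are assumptions chosen for illustration. It only computes which overlapping regions of the input would be fed to the model one at a time; running the model and blending the overlaps are left out.

```python
from typing import Iterator, List, Tuple

# (left, top, right, bottom); right and bottom are exclusive.
Box = Tuple[int, int, int, int]


def tile_starts(length: int, tile: int, overlap: int) -> List[int]:
    """Start offsets along one axis so tiles of size `tile` cover [0, length)."""
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("need tile > 0 and 0 <= overlap < tile")
    if length <= tile:
        return [0]  # a single (smaller) tile covers the whole axis
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # shift the last tile back so it ends at the edge
    return starts


def tile_boxes(width: int, height: int,
               tile: int = 512, overlap: int = 32) -> Iterator[Box]:
    """Yield overlapping tile boxes, row by row, that cover the whole image."""
    for top in tile_starts(height, tile, overlap):
        for left in tile_starts(width, tile, overlap):
            yield (left, top, min(left + tile, width), min(top + tile, height))


def scale_box(box: Box, scale: int) -> Box:
    """Map an input-space box to output space (e.g. scale=4 for 4x upscaling)."""
    left, top, right, bottom = box
    return (left * scale, top * scale, right * scale, bottom * scale)


if __name__ == "__main__":
    for box in tile_boxes(1000, 600):
        print(box, "->", scale_box(box, 4))
```

Each tile is processed independently, so peak memory depends on the tile size rather than the full image size. The overlap gives neighbouring tiles shared pixels that can be blended or cropped to hide seams.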
Challenges we ran into

Memory Management: Implementing tiled inference was crucial to prevent out-of-memory errors when processing large images, requiring careful overlap handling and CUDA cache management.
Cross-Platform Compatibility: Getting the desktop app to behave consistently across macOS (Apple Silicon), Windows, and Linux took significant effort, particularly around Docker integration and file system operations.
Model Integration: Combining RealESRGAN with our custom chroma refinement model required careful preprocessing and color space conversions to maintain image quality.
GPU Support: Optional Nvidia GPU acceleration had to be set up in Docker without breaking the CPU-only fallback.
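The color space conversions mentioned under Model Integration can be sketched per pixel as follows. This is an illustration under stated assumptions, not the project's implementation: the write-up does not say which YUV variant is used, so the analog BT.601 coefficients are assumed here, and `merge_luma_chroma` is a hypothetical helper showing one plausible way to combine the luma of an upscaled pixel with chroma produced by a refinement step.

```python
from typing import Tuple

Pixel = Tuple[float, float, float]  # channel values in the range 0.0 to 1.0


def rgb_to_yuv(r: float, g: float, b: float) -> Pixel:
    """RGB to YUV using BT.601 luma weights (assumed variant)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v


def yuv_to_rgb(y: float, u: float, v: float) -> Pixel:
    """Exact inverse of rgb_to_yuv above."""
    r = y + v / 0.877
    b = y + u / 0.492
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


def merge_luma_chroma(upscaled_rgb: Pixel,
                      refined_uv: Tuple[float, float]) -> Pixel:
    """Hypothetical merge: keep the luma (Y) of the upscaled pixel and
    replace its chroma (U, V) with refined values, then clamp to [0, 1]."""
    y, _, _ = rgb_to_yuv(*upscaled_rgb)
    u, v = refined_uv
    r, g, b = yuv_to_rgb(y, u, v)
    return (min(1.0, max(0.0, r)),
            min(1.0, max(0.0, g)),
            min(1.0, max(0.0, b)))


if __name__ == "__main__":
    pixel = (0.8, 0.4, 0.2)
    y, u, v = rgb_to_yuv(*pixel)
    # Slightly damp the chroma as a stand-in for a refinement model's output.
    print(merge_luma_chroma(pixel, (u * 0.9, v * 0.9)))
```

Working in YUV separates brightness detail (Y) from color (U, V), so a chroma-only refinement can adjust color without disturbing the detail produced by the upscaler.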
Accomplishments that we're proud of

Seamless User Experience: Created a polished desktop app with professional UI/UX that makes AI image processing accessible to non-technical users.
Flexible AI Pipeline: Successfully integrated multiple upscaling approaches (general AI upscaling + domain-specific fine-tuning) in a single application.
Production-Ready Architecture: Built a scalable FastAPI backend with proper error handling, CORS support, and containerization for easy deployment.
Cross-Platform Success: Achieved full compatibility across major desktop platforms with both CPU and GPU support options.
What we learned

AI Model Deployment: Gained deep experience in deploying PyTorch models in production environments, including memory optimization and inference acceleration techniques.
Full-Stack Development: Learned to build end-to-end applications combining desktop UI development, REST APIs, and machine learning pipelines.
Containerization Best Practices: Mastered Docker Compose for complex multi-service applications with GPU support and cross-platform compatibility.
User-Centric Design: Discovered the importance of intuitive interfaces for technical tools, and how professional UI design can make advanced AI capabilities accessible to broader audiences.