Inspiration
Infrastructure failures — cracked bridges, corroded pipelines, damaged roads — cost billions annually and put lives at risk. Traditional inspections are manual, slow, and inconsistent. We wanted to build a tool that lets a field engineer simply snap a photo and instantly get an AI-powered damage assessment with a professional PDF report, reducing inspection time from hours to seconds.
What it does
AI Field Inspector is a full-stack web application that automates infrastructure damage detection. A user uploads a photo of infrastructure (roads, walls, bridges, etc.), and the system:
- Classifies the image as cracked or uncracked using a custom-trained MobileNetV2 deep learning model (100% validation accuracy)
- Detects and localizes damage regions with bounding boxes, severity levels (Critical/High/Medium/Low), and confidence scores
- Generates a detailed inspection report using Google Gemini 2.0 Flash LLM with actionable recommendations
- Produces a downloadable PDF report with the image, findings, severity legend, model metadata, and processing time
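To make the per-region output more concrete, here is a minimal sketch of how a detection (bounding box, severity, confidence) could be represented and rolled up into one overall rating. The `Detection` class, the `overall_severity` helper, and the sample values are hypothetical and chosen for illustration; the project's real data structures may look different.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Severity levels from the write-up, ordered from most to least severe.
SEVERITY_ORDER = ["Critical", "High", "Medium", "Low"]


@dataclass
class Detection:
    """One damage region (hypothetical structure, for illustration only)."""
    box: Tuple[int, int, int, int]  # (x, y, width, height) in pixels
    severity: str                   # one of SEVERITY_ORDER
    confidence: float               # 0.0 to 1.0


def overall_severity(detections: List[Detection]) -> Optional[str]:
    """Return the most severe level among the detections, or None if empty."""
    if not detections:
        return None
    # A lower index in SEVERITY_ORDER means a more severe level.
    return min((d.severity for d in detections), key=SEVERITY_ORDER.index)


# Made-up sample values, not real model output.
detections = [
    Detection(box=(40, 60, 120, 30), severity="Medium", confidence=0.81),
    Detection(box=(200, 90, 80, 200), severity="Critical", confidence=0.93),
]
print(overall_severity(detections))
```

Taking the worst region as the headline rating is one reasonable choice for a safety-oriented report; averaging severities would risk hiding a single critical crack.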
How we built it
- Backend: FastAPI serving a REST API with endpoints for upload, detection, reporting, and full inspection pipelines
- ML Model: Trained a MobileNetV2 binary classifier on a dataset of 50 cracked and 50 uncracked infrastructure images using PyTorch, achieving 100% validation accuracy
- LLM Integration: Google Gemini 2.0 Flash generates structured, professional inspection reports from detection results, with a graceful fallback to template reports if the API is unavailable
- PDF Generation: fpdf2 creates polished PDF reports with embedded images, severity legends, and metadata
- Frontend: React 18 single-page application with drag-and-drop upload, real-time detection visualization with bounding box overlays, and a demo image button
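The full inspection pipeline described above can be sketched as a single function that chains the stages and records the processing time. This is an assumption-heavy outline rather than the project's code: `run_inspection` and its parameters are hypothetical names, the stages are passed in as plain callables so the example runs without a model or API key, and skipping detection for uncracked images is a guess about the control flow.

```python
import time
from typing import Any, Callable, Dict, List

Region = Dict[str, Any]


def run_inspection(
    image_bytes: bytes,
    classify: Callable[[bytes], str],
    detect: Callable[[bytes], List[Region]],
    write_report: Callable[[str, List[Region]], str],
) -> Dict[str, Any]:
    """Run classify -> detect -> report and time the whole pass."""
    start = time.perf_counter()
    label = classify(image_bytes)
    # Assumption: localization only runs when the classifier reports a crack.
    regions = detect(image_bytes) if label == "cracked" else []
    report = write_report(label, regions)
    return {
        "label": label,
        "regions": regions,
        "report": report,
        "processing_time_s": time.perf_counter() - start,
    }


# Stand-in stages so the sketch runs on its own.
result = run_inspection(
    b"fake-image-bytes",
    classify=lambda img: "cracked",
    detect=lambda img: [{"box": (10, 20, 50, 15), "severity": "High", "confidence": 0.9}],
    write_report=lambda label, regions: f"{label}: {len(regions)} region(s)",
)
print(result["report"])
```

Passing the stages in as callables keeps each one easy to swap or stub out, which also suits a design where individual endpoints expose upload, detection, and reporting separately.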
Challenges we ran into
- PyTorch + Node.js on one server: Cloud platforms like Railway detect only one runtime. We solved this by creating a custom Dockerfile that installs both Python and Node.js, builds the React frontend at deploy time, and serves everything from FastAPI
- Gemini SDK deprecation: The google-generativeai SDK was deprecated mid-development; we migrated to the new google-genai SDK and updated all API calls
- PDF Unicode crashes: Special characters from LLM-generated reports caused fpdf2 to fail. We built a _sanitize() function to strip problematic characters
- File encoding issues: React source files created on Windows were saved with a byte order mark (BOM) that broke the build; we had to regenerate the files in plain ASCII, without the BOM
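For the PDF Unicode crashes, a sanitizer along the following lines is one way to make LLM text safe for fpdf2's built-in fonts, which are limited to Latin-1 characters. This is a guess at the general approach, not the project's actual `_sanitize()` implementation; the name `sanitize_for_pdf` and the replacement table are assumptions for this example.

```python
# Typographic characters an LLM commonly emits, mapped to ASCII stand-ins.
# (Hypothetical table; extend it as new problem characters turn up.)
_REPLACEMENTS = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en dash, em dash
    "\u2022": "-",                  # bullet
    "\u2026": "...",                # ellipsis
}


def sanitize_for_pdf(text: str) -> str:
    """Return text containing only characters encodable as Latin-1."""
    for src, dst in _REPLACEMENTS.items():
        text = text.replace(src, dst)
    # Drop anything still outside Latin-1 (emoji, CJK, and so on).
    return text.encode("latin-1", errors="ignore").decode("latin-1")


print(sanitize_for_pdf("Crack \u2014 \u201csevere\u201d \u2705 caf\u00e9"))
```

Replacing common punctuation before stripping keeps the report readable, since quotes and dashes would otherwise vanish entirely. An alternative that avoids stripping at all is to register a Unicode TrueType font with fpdf2.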
Accomplishments that we're proud of
- 100% validation accuracy on our crack classification model with a lightweight MobileNetV2 architecture
- End-to-end pipeline from image upload to AI-generated PDF report in under 5 seconds
- Single-service deployment: one Docker container runs the entire stack (React + FastAPI + PyTorch + Gemini)
- Graceful degradation: the system works even without YOLO weights, without the Gemini API key, and without GPU, falling back to simulation and template reports
- Professional PDF reports with severity legends, model metadata, and processing time metrics
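The graceful degradation mentioned above, where report generation falls back to a template when the LLM is unavailable, could look roughly like this. The function names, the prompt, and the template wording are all hypothetical; the LLM call is abstracted as a plain callable so that no specific SDK signature is assumed.

```python
from typing import Callable, Optional


def template_report(label: str, region_count: int) -> str:
    """Fixed-format report used when no LLM output is available."""
    return (
        f"Inspection result: {label}. "
        f"Damage regions found: {region_count}. "
        "(Template report; LLM unavailable.)"
    )


def generate_report(
    label: str,
    region_count: int,
    llm: Optional[Callable[[str], str]] = None,
) -> str:
    """Try the LLM first; fall back to the template on any failure."""
    if llm is not None:
        prompt = (
            f"Write a short inspection report. Classification: {label}. "
            f"Regions detected: {region_count}."
        )
        try:
            text = llm(prompt)
            if text and text.strip():
                return text
        except Exception:
            pass  # Missing key, network error, quota: use the template below.
    return template_report(label, region_count)


def broken_llm(prompt: str) -> str:
    raise RuntimeError("API key not configured")


print(generate_report("cracked", 2, llm=broken_llm))
```

In a real service it would likely be worth logging the caught exception so that silent fallbacks remain visible to whoever operates the deployment.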
What we learned
- How to train and deploy a custom image classifier using transfer learning with MobileNetV2
- Integrating multiple AI systems (CNN classifier + LLM report generation) into a cohesive pipeline
- The complexity of deploying ML models to cloud platforms with memory and disk constraints
- Building production-ready APIs with proper error handling, fallback mechanisms, and CORS configuration
- Containerizing full-stack applications with both Python and Node.js runtimes
What's next for AI-Powered Infrastructure Inspection System
- Multi-class detection: Expand beyond crack detection to corrosion, water damage, structural deformation, and vegetation overgrowth
- YOLO object detection: Integrate YOLOv8 for precise bounding box localization instead of simulated regions
- Mobile app: Build a React Native companion app for field engineers to capture and inspect on-site
- Historical tracking: Add a database to track inspection history per site, enabling trend analysis and predictive maintenance
- Drone integration: Accept aerial/drone imagery for large-scale infrastructure like bridges and power lines
- Multi-language reports: Generate inspection reports in multiple languages for global deployment