Inspiration

Insurance fraud costs the industry billions of dollars annually, and the rapid advancement of generative AI has only poured gasoline on the fire. Today, bad actors can generate pixel-perfect photos of car accidents, doctored medical bills, or artificially aged property damage in seconds. Traditional fraud detection systems, which rely on metadata or basic rule-based checks, are blind to these deepfakes. We realized that combating AI-generated fraud requires a modern approach: fighting AI with AI. We were inspired to build a tool that doesn't just look at the surface but mathematically peels back the layers of a document to reveal its true origins.

What it does

ClaimCrane is a comprehensive, AI-powered insurance fraud detection platform designed to scrutinize evidence submitted for claims across five major verticals: Auto, Housing, Health, Life, and Policy.

When an adjuster uploads an image or document, ClaimCrane runs it through a robust, dual-layer forensic pipeline:

The CV Forensics Layer: This layer uses traditional computer vision techniques to analyze the image's "digital DNA". It performs Error Level Analysis (ELA) to find hidden compression artifacts, Fast Fourier Transform (FFT) analysis to check for unnatural frequency spectrums, Noise Pattern analysis to verify organic camera sensor noise, JPEG grid consistency checks to catch splicing, and deep EXIF metadata inspection.

The AI Vision Layer: We pass the image directly into Google's Gemma 3 27B vision model. The model is specifically prompted to independently verify camera authenticity, detect subtle AI generation artifacts, and look for signs of physical staging.

Our Scoring Engine then aggregates these signals, applying confidence-weighted adjustments and executing a "Majority Vote" across the layers to output a clear, actionable Risk Score (0-100) and Verdict (Authentic, Suspicious, or Fraudulent).

ClaimCrane even supports a Batch Claim Mode, which analyzes multiple photos from a single claim simultaneously to detect cross-image anomalies, such as differing camera models or ELA outliers in a single batch.
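To give a feel for the FFT check in the CV Forensics Layer, here is a minimal sketch, assuming the image has already been loaded as a 2-D grayscale NumPy array. The function name `high_frequency_ratio` and the `cutoff` value are hypothetical choices for illustration, not ClaimCrane's actual code.

```python
import numpy as np


def high_frequency_ratio(gray: np.ndarray, cutoff: float = 0.25) -> float:
    """Fraction of spectral energy outside a low-frequency disc.

    `gray` is a 2-D array of pixel intensities. `cutoff` is the disc radius
    as a fraction of the smaller image dimension (an illustrative value).
    """
    # Remove the mean so the DC component does not dominate the spectrum.
    centered = gray.astype(np.float64) - gray.mean()
    # 2-D FFT, shifted so that zero frequency sits at the array centre.
    spectrum = np.fft.fftshift(np.fft.fft2(centered))
    energy = np.abs(spectrum) ** 2

    rows, cols = gray.shape
    yy, xx = np.ogrid[:rows, :cols]
    # Distance of every frequency bin from the zero-frequency bin.
    radius = np.hypot(yy - rows // 2, xx - cols // 2)
    high = radius > cutoff * min(rows, cols)

    total = energy.sum()
    return float(energy[high].sum() / total) if total > 0 else 0.0
```

A single ratio like this says nothing on its own. A real pipeline would presumably compare it against a baseline calibrated on known-authentic photos before treating a spectrum as unnatural.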

How we built it

We built ClaimCrane to be fast, scalable, and highly analytical.

Backend: We developed a Python backend using FastAPI to handle concurrent image processing. The CV Forensics pipeline is built from scratch on lightweight libraries (Pillow, NumPy, and piexif) to enable deep pixel-level analysis without unnecessary overhead.

AI Integration: For our AI layer, we integrated the Gemini API, specifically leveraging the multimodal capabilities of Gemma 3 27B to act as an expert forensic analyst.

Frontend: The dashboard was crafted using React and Vite, with dynamic animations via Framer Motion. We designed the UI to present highly technical forensic data (like ELA Heatmaps and FFT scores) in an intuitive, visual layer-by-layer breakdown that adjusters can understand at a glance.

Infrastructure: The frontend is deployed on Vercel, while the Python backend runs on Render/Railway environments.

Challenges we ran into

One of our biggest hurdles was score normalization. In our early iterations, we ran into an issue where the AI model's categorical risk flags were heavily skewing the overall consensus due to a misaligned weight formula, causing the AI layer to default to a 100/100 risk score on minor infractions. We had to completely redesign our internal scoring engine to separate the weight of a forensic signal from the sub-score of a layer.
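The distinction between a signal's weight and a layer's sub-score can be sketched as follows. This is an illustration under assumptions, not our exact formula: the `Signal` type, both function names, and the numbers are hypothetical. `naive_subscore` shows one way a score can saturate at 100, and `layer_subscore` shows a normalized alternative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    name: str
    score: float   # how suspicious this signal looks, 0-100
    weight: float  # how much the signal is trusted, independent of its score


def naive_subscore(signals: list[Signal]) -> float:
    """Unnormalized weighted sum, clamped to 100 (hypothetical failure mode)."""
    return min(100.0, sum(s.score * s.weight for s in signals))


def layer_subscore(signals: list[Signal]) -> float:
    """Weighted mean of signal scores; stays within 0-100 by construction."""
    total_weight = sum(s.weight for s in signals)
    if total_weight == 0:
        return 0.0
    return sum(s.score * s.weight for s in signals) / total_weight


# Two minor flags with large weights (illustrative values only).
flags = [Signal("lighting", 30.0, 2.0), Signal("text_rendering", 40.0, 3.0)]
print(naive_subscore(flags))  # 100.0
print(layer_subscore(flags))  # 36.0
```

Because the weights only decide how much each signal counts relative to the others, raising a weight can never push the layer's sub-score above the highest individual signal score.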

Another major challenge was handling PDFs. Insurers receive a massive mix of JPEGs, PNGs, and PDFs (like medical bills). We had to build a robust preprocessing layer that detects the MIME type, safely converts PDFs into high-resolution images, and intelligently decides whether to run camera-specific forensics (like sensor noise) versus document-specific text analysis.
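A minimal sketch of the detection and routing step, assuming files are identified by their leading magic bytes rather than by extension. The step names returned by `plan_pipeline` (`rasterize_pdf`, `document_text_analysis`, `camera_forensics`) are hypothetical labels, and the routing rule itself is a simplification of the real decision logic.

```python
from typing import Optional

# Leading "magic bytes" for the three formats mentioned above.
_SIGNATURES = (
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"%PDF-", "application/pdf"),
)


def sniff_mime(data: bytes) -> Optional[str]:
    """Return the MIME type implied by the file's first bytes, if known."""
    for magic, mime in _SIGNATURES:
        if data.startswith(magic):
            return mime
    return None


def plan_pipeline(data: bytes) -> list[str]:
    """Pick analysis steps by file type (assumed routing, for illustration)."""
    mime = sniff_mime(data)
    if mime == "application/pdf":
        # PDFs are converted to images first, then treated as documents.
        return ["rasterize_pdf", "document_text_analysis"]
    if mime in ("image/jpeg", "image/png"):
        return ["camera_forensics"]
    raise ValueError("unsupported or unrecognized file type")
```

Checking the bytes instead of trusting the filename or the client-supplied content type means a PDF renamed to `.jpg` is still routed as a PDF.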

Accomplishments that we're proud of

We are incredibly proud of our Batch Claim Mode architecture. Instead of just looking at images in a vacuum, our system can ingest up to 10 images at once and cross-reference them. Building the logic to flag a claim because one photo has a statistically deviant ELA mean compared to the rest of the batch, or because four photos were taken on an iPhone and the fifth was uploaded from a desktop browser, is a major step forward for automated fraud detection.
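The two cross-image checks mentioned above can be sketched like this. It is a simplified illustration, not the production logic: the function names are hypothetical, and the outlier test uses a modified z-score based on the median absolute deviation (MAD) with the conventional 3.5 threshold, which is an assumption rather than our tuned value.

```python
from collections import Counter
from statistics import median
from typing import Optional


def ela_outliers(ela_means: list[float], threshold: float = 3.5) -> list[int]:
    """Indices whose ELA mean deviates strongly from the batch median.

    The MAD-based score stays stable in small batches, where a single
    extreme value would otherwise inflate the standard deviation.
    """
    med = median(ela_means)
    mad = median(abs(x - med) for x in ela_means)
    if mad == 0:
        # At least half of the values equal the median; flag any that differ.
        return [i for i, x in enumerate(ela_means) if x != med]
    return [
        i for i, x in enumerate(ela_means)
        if 0.6745 * abs(x - med) / mad > threshold
    ]


def camera_mismatches(models: list[Optional[str]]) -> list[int]:
    """Indices whose EXIF camera model differs from the batch majority.

    A missing model (None), e.g. from stripped metadata, counts as a mismatch.
    """
    known = [m for m in models if m is not None]
    if not known:
        return []
    majority, _ = Counter(known).most_common(1)[0]
    return [i for i, m in enumerate(models) if m != majority]


print(ela_outliers([2.1, 2.3, 2.0, 2.2, 9.5]))        # [4]
print(camera_mismatches(["iPhone 14"] * 4 + [None]))  # [4]
```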

We're also proud of our custom Majority Vote Scoring Engine, which handles conflicts explicitly. If the CV layer and the AI layer disagree by more than 40 points, the system flags a "Layer Conflict" and pauses the automated workflow to mandate human review.
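The conflict rule can be sketched in a few lines. The 40-point gap comes from the description above; the function name, the returned dictionary shape, and the plain average used when the layers agree are assumptions made for illustration (the real engine applies confidence-weighted adjustments).

```python
def combine_layers(cv_score: float, ai_score: float,
                   conflict_gap: float = 40.0) -> dict:
    """Combine two layer scores (0-100), or escalate if they disagree."""
    gap = abs(cv_score - ai_score)
    if gap > conflict_gap:
        # The layers disagree too strongly to trust an automated verdict.
        return {"status": "layer_conflict",
                "needs_human_review": True,
                "risk_score": None}
    # Simple average as a stand-in for the real weighted aggregation.
    return {"status": "ok",
            "needs_human_review": False,
            "risk_score": (cv_score + ai_score) / 2}


print(combine_layers(20, 75)["status"])      # layer_conflict
print(combine_layers(30, 50)["risk_score"])  # 40.0
```

Returning no score at all in the conflict case, rather than a midpoint, keeps an adjuster from mistaking a disputed result for a moderate one.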

What we learned

We learned that AI vision models and traditional computer vision algorithms are much stronger together than they are apart. Gemma 3 27B is brilliant at identifying unnatural lighting and inconsistent text rendering, but it can miss subtle metadata stripping or microscopic JPEG grid inconsistencies. Conversely, traditional CV math is rigid and can be thrown off by poor lighting, but it never lies about compression history. Melding these two philosophies together to create a unified consensus taught us the true value of layered defense in cybersecurity and fraud prevention.

What's next for ClaimCrane

In the short term, we want to expand our vertical-specific rules engines. For example, we plan to integrate medical billing code validation directly into the Health vertical so the AI can cross-reference the text on a bill against standard billing-code rules.

In the long term, we aim to implement temporal tracking: hashing and saving the digital signatures of known fraudulent images across our platform, so if a bad actor attempts to reuse a doctored roof-damage photo across three different insurance carriers, ClaimCrane will instantly flag the collision globally.
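Since this feature is not built yet, the following is only a sketch of the idea, assuming an in-memory registry keyed by SHA-256 digests. The class name `FraudHashRegistry` and its methods are hypothetical.

```python
import hashlib
from typing import Optional


class FraudHashRegistry:
    """Remembers fingerprints of images already confirmed as fraudulent."""

    def __init__(self) -> None:
        self._known: dict[str, str] = {}  # digest -> claim ID that first used it

    @staticmethod
    def fingerprint(image_bytes: bytes) -> str:
        return hashlib.sha256(image_bytes).hexdigest()

    def mark_fraudulent(self, image_bytes: bytes, claim_id: str) -> None:
        # Keep the first claim ID seen for a given image.
        self._known.setdefault(self.fingerprint(image_bytes), claim_id)

    def lookup(self, image_bytes: bytes) -> Optional[str]:
        """Return the original claim ID if this exact file was seen before."""
        return self._known.get(self.fingerprint(image_bytes))


registry = FraudHashRegistry()
registry.mark_fraudulent(b"doctored-roof-photo-bytes", "claim-001")
print(registry.lookup(b"doctored-roof-photo-bytes"))  # claim-001
print(registry.lookup(b"some-other-photo"))           # None
```

A cryptographic hash like SHA-256 only matches byte-identical files. Catching a photo that has been resized or re-encoded would require a perceptual hash instead, which is the more likely fit for the real feature.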
