AxiomVision

pipeline flow

In high-stakes environments like motorsports and advanced manufacturing, a millimeter of change can decide between victory and catastrophic failure. Current vision systems are blind to context: they see pixels, not problems, and cannot tell a harmless shadow from a critical crack. AxiomVision is an intelligent visual difference engine that doesn't just see changes but understands them. By fusing computer vision with a causal inference engine, AxiomVision classifies changes, predicts their impact, and prescribes actions in real time. We're not building another camera system; we're building a visual system that reasons like your best engineer, diagnosing root causes from pixels alone.

Inspiration

We were inspired by the immense gap between human visual reasoning and automated systems. A manufacturing engineer doesn't just see a "scratch"; they see a "scratch on a load-bearing component that will fail under stress in 48 hours." A Formula 1 strategist doesn't just see "damage"; they see "a 0.3s/lap performance loss and increased tire degradation." We asked: what if a machine could reason the same way? This led us to build AxiomVision, a system designed to bridge the semantic gap between pixel-level changes and their real-world consequences.

What it does

AxiomVision is a general-purpose platform for intelligent visual monitoring. Its core innovation is moving from change detection to change understanding.

Semantic Change Detection: It detects what changed (object, texture, position) and classifies it (e.g., "crack," "corrosion," "misalignment").
Impact Inference: Using a configurable knowledge graph, it infers the severity. A "scratch" on a cosmetic panel is low priority; the same scratch on a carbon fiber chassis is critical.
Causal Analysis & Prediction: It correlates visual changes with time-series data to suggest root causes and predict future failures.
Prescriptive Alerts: It doesn't just alert you; it suggests next steps: "Alert: Crack propagating. Correlated with vibration spike. Recommend: Schedule non-destructive testing within 24 hours."

In an F1 context, AxiomVision would detect a minor crack in a front wing endplate, cross-reference it with live telemetry showing increased aerodynamic flutter, and predict the component's remaining useful life, recommending a pit stop strategy.
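The correlation and prediction steps can be illustrated with a minimal sketch. It assumes hypothetical per-lap crack-length and vibration readings (the numbers below are made up for illustration), uses Pearson correlation as a stand-in for the correlation step, and uses a plain linear extrapolation as a stand-in for remaining-useful-life prediction. The function names are assumptions, not part of AxiomVision.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def laps_until_limit(lengths_mm, limit_mm):
    """Linear extrapolation: laps left until the crack reaches limit_mm."""
    growth_per_lap = (lengths_mm[-1] - lengths_mm[0]) / (len(lengths_mm) - 1)
    if growth_per_lap <= 0:
        return None  # crack is not growing, so no prediction is made
    return (limit_mm - lengths_mm[-1]) / growth_per_lap

# Illustrative, made-up readings (one value per lap).
crack_mm = [1.0, 1.5, 2.0, 2.5, 3.0]
vibration_g = [0.2, 0.3, 0.4, 0.5, 0.6]

r = pearson(crack_mm, vibration_g)                  # strength of association
remaining = laps_until_limit(crack_mm, limit_mm=5.0)
```

A high correlation only flags a candidate root cause; it does not establish causation, and a production system would likely need a proper degradation model rather than a straight line.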

Pipeline

Stage 1: Image Ingestion
A flexible input system that accepts images from any source: industrial cameras, mobile uploads, or existing monitoring systems. Images are processed through a FastAPI backend with WebSocket support for real-time streaming.

Stage 2: Smart Change Detection
Using neural networks, we compare incoming images against baseline "golden samples." Unlike simple pixel comparison, the model will recognize semantic changes, for example distinguishing a critical crack from a harmless shadow.
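A Siamese-style comparison can be sketched as follows: both images are mapped to embedding vectors by the same encoder, and a change is flagged when the vectors are far apart. This sketch assumes the embeddings have already been produced by some encoder; the vectors, the threshold, and the function names are illustrative assumptions.

```python
from math import sqrt

def cosine_distance(a, b):
    """1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def has_semantic_change(baseline_emb, current_emb, threshold=0.2):
    """Flag a change when the embeddings differ by more than threshold."""
    return cosine_distance(baseline_emb, current_emb) > threshold

# Made-up embeddings: a well-trained encoder should map a shadowed view
# close to the golden sample and a cracked part far from it.
golden = [0.9, 0.1, 0.0]
shadowed = [0.88, 0.12, 0.01]
cracked = [0.2, 0.1, 0.9]

shadow_flagged = has_semantic_change(golden, shadowed)
crack_flagged = has_semantic_change(golden, cracked)
```

The threshold would need tuning on validation data; the robustness to shadows comes from how the encoder is trained, not from the distance function itself.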

Stage 3: Intelligent Classification
Our fine-tuned model will classify detected changes into categories such as critical_crack, cosmetic_scratch, missing_component, and corrosion.

Stage 4: Context-Aware Reasoning
We built a knowledge graph that encodes domain-specific relationships, for example:

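# Example inputs (assumed here; in practice they would come from earlier stages).
defect, location = "front_wing_crack", "primary_structure"

# A crack in the front wing's primary structure is treated as critical.
if defect == "front_wing_crack" and location == "primary_structure":
    severity = "CRITICAL"
    impact = "aerodynamic_failure"
    action = "immediate_pit_stop"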
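The single rule above generalizes to a lookup table, which is one simple way to build a rule-based reasoner on top of a classifier. The sketch below is an assumption about how that could look: `Assessment`, `RULES`, and `assess` are hypothetical names, and the bodywork entry is an invented example.

```python
from typing import NamedTuple

class Assessment(NamedTuple):
    severity: str
    impact: str
    action: str

# Rules keyed by (defect, location). Only the first entry comes from the
# example above; the second is a hypothetical addition.
RULES = {
    ("front_wing_crack", "primary_structure"): Assessment(
        "CRITICAL", "aerodynamic_failure", "immediate_pit_stop"
    ),
    ("cosmetic_scratch", "bodywork"): Assessment(
        "LOW", "none", "monitor_next_pit_stop"
    ),
}

# Anything the rules do not cover is escalated rather than guessed at.
DEFAULT = Assessment("UNKNOWN", "unassessed", "human_review")

def assess(defect, location):
    """Look up the assessment for a classified defect at a location."""
    return RULES.get((defect, location), DEFAULT)

result = assess("front_wing_crack", "primary_structure")
```

Keeping the rules as data rather than branching code makes it easier to swap in a different domain without touching the reasoning logic.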

Stage 5: Prescriptive Alerting
Instead of just saying "defect found," we generate actionable insights:
"Critical crack detected in suspension - Replace within 50km"
"Minor scratch on bodywork - Monitor during next pit stop"
"All components within tolerance - Continue operation"
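One way to produce and order such messages is sketched below. The `Alert` class, the severity labels, and the ranking are assumptions made for illustration; the message texts reuse the examples above.

```python
from dataclasses import dataclass

# Lower rank means more urgent; the labels are hypothetical.
SEVERITY_RANK = {"CRITICAL": 0, "MINOR": 1, "OK": 2}

@dataclass
class Alert:
    severity: str
    finding: str
    recommendation: str

    def render(self):
        """Format the alert as 'finding - recommendation'."""
        return f"{self.finding} - {self.recommendation}"

def prioritize(alerts):
    """Return alerts sorted so the most urgent comes first."""
    return sorted(alerts, key=lambda a: SEVERITY_RANK[a.severity])

alerts = [
    Alert("MINOR", "Minor scratch on bodywork", "Monitor during next pit stop"),
    Alert("CRITICAL", "Critical crack detected in suspension", "Replace within 50km"),
]
ordered = prioritize(alerts)
messages = [a.render() for a in ordered]
```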

Technical Stack

Frontend: React.js with Tailwind CSS for real-time visualization
Backend: Python FastAPI for high-performance inference
Computer Vision: OpenCV + PyTorch with pre-trained models
AI/ML: Siamese Networks for change detection, ResNet-50 fine-tuned for defect classification
Data Pipeline: Redis for real-time data streaming, PostgreSQL for knowledge graph storage
Deployment: Docker containers

Challenges and their Mitigation

The Semantic Gap

Challenge: The core AI challenge is moving from low-level pixel changes to high-level semantic understanding. Teaching a model that a specific pattern of pixels constitutes a "critical crack" versus a "superficial scratch" requires vast, precisely labeled data that is often scarce in real-world industrial settings.
Mitigation: We will employ a hybrid approach. We'll use synthetic data generation to create thousands of realistic, labeled defects. We will also leverage transfer learning, starting with models pre-trained on large general image datasets and fine-tuning them with our smaller, high-quality domain-specific dataset. This should reduce the amount of labeled data required while preserving accuracy.
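To make the synthetic-data idea concrete, here is a toy sketch that draws a random-walk "crack" on a blank grayscale grid and returns a pixel-level label mask alongside it. It is far from realistic (real generators would render onto photographs of actual parts), and `synth_crack` and its parameters are hypothetical.

```python
import random

def synth_crack(width=32, height=32, length=20, seed=0):
    """Return (image, mask): a bright image with a dark random-walk crack."""
    rng = random.Random(seed)  # seeded so samples are reproducible
    image = [[255] * width for _ in range(height)]
    mask = [[0] * width for _ in range(height)]
    x, y = rng.randrange(width), rng.randrange(height)
    for _ in range(length):
        image[y][x] = 0  # dark crack pixel
        mask[y][x] = 1   # matching pixel-level label
        # Wander sideways and drift downward, staying inside the grid.
        x = min(max(x + rng.choice((-1, 0, 1)), 0), width - 1)
        y = min(max(y + rng.choice((0, 1)), 0), height - 1)
    return image, mask

image, mask = synth_crack()
crack_pixels = sum(map(sum, mask))
```

Because the label mask is generated together with the image, every sample comes with labels at no extra annotation cost, which is the main appeal of the approach.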

Building and maintaining the knowledge graph

Challenge: Manually encoding all the complex relationships and failure modes for every possible domain (F1, aerospace, manufacturing) is an immense, expert-driven task that doesn't scale.
Mitigation: We will not build a monolithic knowledge graph. Instead, we will develop a user-friendly, template-driven interface that allows domain experts (e.g., a mechanical engineer) to define their own components, relationships, and failure modes without writing code. For the hackathon, we will demonstrate this with a pre-built graph, but the architecture will be designed for user customization.
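A template of this kind could be as simple as a JSON document that the backend validates and indexes. The schema below (`components`, `failure_modes`, and their fields) is a hypothetical illustration, not the project's actual format.

```python
import json

# Hypothetical template a domain expert might produce through a form UI.
TEMPLATE = """
{
  "components": {
    "suspension_mount": {"criticality": "high"},
    "decorative_trim": {"criticality": "low"}
  },
  "failure_modes": [
    {"defect": "crack", "component": "suspension_mount", "severity": "CRITICAL"},
    {"defect": "crack", "component": "decorative_trim", "severity": "LOW"}
  ]
}
"""

def load_graph(text):
    """Parse a template and index severities by (defect, component)."""
    data = json.loads(text)
    index = {}
    for mode in data["failure_modes"]:
        # Reject failure modes that refer to undeclared components.
        if mode["component"] not in data["components"]:
            raise ValueError(f"unknown component: {mode['component']}")
        index[(mode["defect"], mode["component"])] = mode["severity"]
    return index

graph = load_graph(TEMPLATE)
```

Validating references at load time catches template mistakes early, before they turn into silently missing rules.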

Environmental noise

Challenge: The real world is messy. AxiomVision must be robust enough to ignore irrelevant changes like moving personnel, shifting shadows, dust on the lens, or reflections, while still catching subtle, critical defects.
Mitigation: We will train the models on a diverse dataset that includes these "distractors" as negative examples, and implement a confidence-based alerting system. Low-confidence detections will be flagged for human review, and the reviewers' feedback will be used to periodically re-train the model, so that it improves over time.
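The confidence-based routing and feedback loop could look roughly like this. The threshold value, the detection fields, and the function names are assumptions for illustration.

```python
review_queue = []   # detections waiting for a human decision
training_set = []   # (image_id, label) pairs collected for re-training

def route(detection, threshold=0.8):
    """Alert on confident detections; queue the rest for human review."""
    if detection["confidence"] >= threshold:
        return "alert"
    review_queue.append(detection)
    return "human_review"

def record_feedback(detection, correct_label):
    """Store the reviewer's label so it can be used in the next re-training."""
    review_queue.remove(detection)
    training_set.append((detection["image_id"], correct_label))

confident = {"image_id": "img-001", "label": "critical_crack", "confidence": 0.97}
uncertain = {"image_id": "img-002", "label": "critical_crack", "confidence": 0.41}

first = route(confident)
second = route(uncertain)
record_feedback(uncertain, "shadow")  # the reviewer says it was a distractor
```

Reviewer corrections on distractors such as shadows are exactly the negative examples the mitigation calls for, so the queue doubles as a data collection mechanism.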

How AxiomVision is different

Current solutions fall into two categories, and AxiomVision goes beyond both.

Traditional automated optical inspection (AOI) systems: These are rigid, rule-based systems programmed to look for specific, pre-defined flaws in a highly controlled environment. They are powerful for repetitive tasks but struggle with novel defects or changing conditions. They answer "Is there a deviation from the golden sample?" but cannot answer "Is this deviation important?" or "Why did it happen?"
Modern AI-powered visual inspection tools: These use machine learning to be more adaptable than AOI systems and can learn to recognize new defects. However, they primarily function as sophisticated pattern matchers. They can tell you what is in an image but lack the reasoning to answer "so what?"

AxiomVision's core differentiation is its reasoning layer. We don't stop at classification. We add a causal, knowledge-driven engine that understands the functional context of what it's seeing. This allows AxiomVision to:

Prioritize, not just categorize. It knows that a "crack" on a decorative trim is a low-severity issue, while the same "crack" on a suspension mount is a critical failure.
Prescribe, not just alert. It moves beyond simple pass/fail notifications to provide actionable insights, such as "Component degradation detected. Predicted Remaining Useful Life: 72 hours. Recommend scheduling maintenance before next high-load cycle."

In essence, while existing solutions provide data, AxiomVision delivers actionable recommendations.

What we learned

"General-Purpose" is a Spectrum: A truly general-purpose AI is a myth, but a highly adaptable platform is achievable. Our knowledge graph is the key to this adaptability.
The 80/20 Rule of AI: 80% of the value often comes from 20% of the complexity. A simple rule-based reasoner on top of a robust classifier can deliver most of the promised intelligence without needing a full causal AI model.