Inspiration

The spark came from a video by Underscore_ featuring Renaud Heitz (CTO of Exotec), detailing the orchestration of hundreds of robots in 24-hour delivery warehouses. As a 5th-year student (UQAC/ESGI) new to robotics, I dove into the academic side and noticed a glaring gap: the benchmarks I found assumed a "perfect world." No mechanical failures, no sensor drops, no deadlocks. While most researchers were racing to shave off milliseconds, I realized that in the real world, a single "dead" robot can trigger a cascade of failures that brings a multi-million-dollar facility to a halt. MAFIS was built to be the bridge between theoretical "perfect" pathfinding and the messy reality of warehouse logistics.

What it does

MAFIS is a Fault Resilience Observatory for lifelong Multi-Agent Pathfinding (MAPF). It lets researchers inject faults into robot fleets, simulating stuck or delayed agents, to observe how failures propagate through the system. It pairs every faulted run with a deterministic, fault-free baseline, turning every metric (throughput, delay, heat) into a measure of deviation. It features a real-time 3D simulation with a dashboard that runs entirely in the browser, providing a research environment for testing how robotics algorithms survive when things go wrong.
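To make the "paired baseline" idea concrete, here is a minimal sketch in plain Rust. It is not MAFIS code: the `Fault` struct, the toy `run` simulation, and `throughput_deviation` are hypothetical names invented for illustration. The point it shows is that when both runs share a seed and consume the same random stream, any difference in throughput comes from the injected faults alone.

```rust
/// Tiny deterministic PRNG (xorshift64) so that a seed fully determines a run.
/// This stands in for whatever RNG a real simulator would use.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // xorshift must never start from an all-zero state.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }

    /// Float in [0, 1) built from the top 53 bits.
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Hypothetical fault: `agent` stops completing tasks from `at_tick` onward.
#[derive(Clone, Copy)]
struct Fault {
    agent: usize,
    at_tick: u64,
}

/// Toy simulation: each tick, every live agent completes a task with
/// probability 0.5. Returns the total number of completed tasks.
fn run(seed: u64, agents: usize, ticks: u64, faults: &[Fault]) -> u64 {
    let mut rng = XorShift64::new(seed);
    let mut completed = 0;
    for tick in 0..ticks {
        for agent in 0..agents {
            // Draw for every agent, dead or alive, so the baseline and the
            // faulted run consume exactly the same random stream.
            let roll = rng.next_f64();
            let dead = faults.iter().any(|f| f.agent == agent && tick >= f.at_tick);
            if !dead && roll < 0.5 {
                completed += 1;
            }
        }
    }
    completed
}

/// Relative throughput deviation of a faulted run from its same-seed baseline.
/// 0.0 means no impact; -1.0 means throughput dropped to zero.
fn throughput_deviation(seed: u64, agents: usize, ticks: u64, faults: &[Fault]) -> f64 {
    let baseline = run(seed, agents, ticks, &[]);
    let faulted = run(seed, agents, ticks, faults);
    if baseline == 0 {
        return 0.0;
    }
    (faulted as f64 - baseline as f64) / baseline as f64
}
```

Calling `throughput_deviation(42, 10, 100, &[Fault { agent: 0, at_tick: 0 }])` from a `main` function would report how much throughput is lost when agent 0 is dead from the start. A real simulator would also model blocking and congestion, which this toy version deliberately leaves out.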

How we built it

  • Engine: Powered by Rust 2024 and the Bevy ECS (Entity Component System) for high-concurrency simulation and a clean, data-driven architecture. Compiled to WebAssembly (WASM) for a zero-install, 120+ FPS experience in the browser, while keeping a headless CLI for high-speed statistical "Monte Carlo" experiments. (Note: parallel computation and multi-threading are not fully optimized on the web. For maximum performance, run the desktop version; the web version offers the more polished user experience.)

  • Deterministic Timeline: We built a custom state-management system that allows for "Time-Travel" rewinding. You can rewind the simulation to a specific tick and re-watch a failure event without breaking the internal states of the solver, scheduler, or metrics.

  • SOTA Solvers: Ported and implemented complex Multi-Agent algorithms (PIBT, RHCR, Token Passing) from research papers and C++ code into idiomatic, memory-safe Rust.
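The "Time-Travel" rewind described in the Deterministic Timeline bullet can be sketched with periodic snapshots plus deterministic replay. This is only an illustration under stated assumptions, not the MAFIS implementation: `SimState`, `Timeline`, the snapshot interval, and the toy movement rule are all hypothetical. The key assumption is that every piece of state that influences future ticks (here the RNG state and the cumulative metric) lives inside the snapshotted struct.

```rust
/// Hypothetical simulation state. Everything that influences future ticks is
/// stored here, including the RNG state and cumulative metrics.
#[derive(Clone, Debug, PartialEq)]
struct SimState {
    tick: u64,
    rng_state: u64,
    positions: Vec<i64>,
    tasks_completed: u64,
}

impl SimState {
    fn new(seed: u64, agents: usize) -> Self {
        SimState { tick: 0, rng_state: seed, positions: vec![0; agents], tasks_completed: 0 }
    }

    /// Advance one tick. The result depends only on the current state,
    /// which is what makes replay after a rewind reproducible.
    fn step(&mut self) {
        for i in 0..self.positions.len() {
            // Linear congruential generator (Knuth's MMIX constants).
            self.rng_state = self
                .rng_state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            // Toy movement rule: the top bit decides the direction.
            let delta = if (self.rng_state >> 63) == 1 { 1 } else { -1 };
            self.positions[i] += delta;
            // Toy metric: count a "task" whenever an agent lands on a multiple of 5.
            if self.positions[i] % 5 == 0 {
                self.tasks_completed += 1;
            }
        }
        self.tick += 1;
    }
}

/// Timeline that stores a full snapshot every `interval` ticks, starting at tick 0.
struct Timeline {
    current: SimState,
    snapshots: Vec<SimState>,
    interval: u64,
}

impl Timeline {
    fn new(seed: u64, agents: usize, interval: u64) -> Self {
        assert!(interval > 0, "snapshot interval must be positive");
        let initial = SimState::new(seed, agents);
        Timeline { snapshots: vec![initial.clone()], current: initial, interval }
    }

    fn step(&mut self) {
        self.current.step();
        if self.current.tick % self.interval == 0 {
            self.snapshots.push(self.current.clone());
        }
    }

    /// Rewind to `target`: restore the nearest snapshot at or before it,
    /// then replay forward deterministically.
    fn rewind_to(&mut self, target: u64) {
        assert!(target <= self.current.tick, "cannot rewind into the future");
        let idx = (target / self.interval) as usize;
        // Drop snapshots from the abandoned future so they cannot leak back in.
        self.snapshots.truncate(idx + 1);
        self.current = self.snapshots[idx].clone();
        while self.current.tick < target {
            self.current.step();
        }
    }
}
```

Because the replay never crosses a snapshot boundary, the snapshot list stays consistent after a rewind, and stepping forward again reproduces the original run tick for tick. Storing full clones is the simplest choice; a larger simulation might prefer deltas or a coarser interval to save memory.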

Challenges we ran into

The "Mathematical Wall" of state-of-the-art MAPF algorithms was significant: translating abstract paper logic into high-performance Rust required a deep dive into graph theory and priority inheritance. Technically, the Timeline/Rewind feature was the hardest puzzle. The solver's internal search tree, the UI's reactive state, and the cumulative metrics all had to stay perfectly synchronized during a rewind, without leaking state or introducing non-determinism. That required a rigorous approach to data ownership and snapshotting.

Accomplishments that we're proud of

I'm incredibly proud of the Real-Time Metric Synchronization. There was a "Eureka" moment when, after weeks of debugging, the UI and the simulation engine finally spoke the same language. Seeing the live charts dip and recover in perfect sync with the 3D robots navigating a "traffic jam" was amazing. Additionally, reaching 120+ FPS in WASM with 200+ active agents shows that high-fidelity research tools can be accessible and performant in a web browser.

What we learned

This project completely shifted my perspective from "Optimal" to "Robust." I learned that Chaos Engineering is dangerously underestimated in robotics. In a world where we ship software faster than ever, "Optimal" is a fragile goal; "Reliable" is what actually keeps the lights on. Building this from first principles as a newcomer taught me that curiosity and the right tools (like Rust and ECS) can bridge the gap between "having no background" and "contributing to the field."

What's next for MAFIS - Multi-Agent Fault Injection Simulator

  • Heterogeneous agents: Simulate robots of different sizes and with different specs.
  • Machine learning: Predict metrics under faults for each scenario, look for patterns, and maybe find a formal model that explains why they occur.
  • Rescue Robot: Add a rescue robot that picks up a dead robot and moves it to a safe zone, then compare whether this improves throughput.

Links

GitHub

Website

Credits / Acknowledgments

Thanks to these communities for their incredible work and contributions:

Rust

Bevy - Game engine in Rust

Astro Framework

Thanks to the researchers for their incredible work:

Special thanks to Professor Okumura, a pillar of the SOTA in the MAPF world, who says that my project looks cool :>

Built With

  • astro-framework
  • bevy
  • rust
  • webassembly

Updates

Website not up to date

The tool I developed is evolving way faster than I expected. I have so many ideas and reliability checks to add that my website is not up to date with the actual simulator. So if you see differences between the docs on the website and the tool, I'm aware, and I'll update them later! :>

Thanks to everyone who is using my tool. It would be motivating if you gave my repo a star!

Best regards, Teddy