Inspiration
Our project started at the hackathon itself. After watching a live demo of the Game of Life on an LED matrix, we were reminded of a video of a fluid simulator running on an ESP32 and driving an LED matrix. The effect was mesmerizing and seemed fun to play with. But we immediately started wondering whether the ESP32 was actually the right tool for the job, and decided to test a physics-based application on an FPGA. Fluid simulation is an inherently parallel problem: every cell in the grid wants to update simultaneously. With access to a DE1-SoC and limited access to a DE10-Lite FPGA board, both with onboard accelerometers, we chose to recast the problem in a form that maps onto an FPGA. That instinct to question the underlying compute model, and to match the architecture to the problem rather than the other way around, is what drove the whole project.
What it does
Flow models fluid dynamics using a flow-based height-field automaton, where each of the 64 cells in an 8×8 grid stores an unsigned 16-bit water height value. On every physics tick, all 112 inter-cell edges (56 horizontal and 56 vertical, since each of the 8 rows and 8 columns contains 7 edges) simultaneously compute a desired flow based on the height difference between adjacent cells plus a directional gravity bias derived from the onboard ADXL345 accelerometer. Each flow is clamped to at most 25% of the source cell's height. Because a cell has at most four neighbors, its total outflow can never exceed its own height, so water never goes negative, and because every flow leaves one cell and enters its neighbor, water is conserved across the grid. These flows are registered in the first pipeline stage, then applied to all 64 cell heights simultaneously in the second, with a third stage pulsing an update-done signal, giving a total physics latency of just 3 clock cycles per step. The entire update is implemented as a parallel dataflow graph in Verilog using generate blocks, meaning all 112 flows and all 64 height updates resolve concurrently in hardware rather than sequentially. The LED matrix is driven directly from a combinational framebuffer that lights each pixel whenever its cell's height exceeds a fixed threshold, producing a real-time view of the fluid surface.
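To make the update rule above concrete, here is a minimal software reference model in Python. It is only a sketch of the idea, not our Verilog: the flow gain of 1/4, the sign convention for the gravity bias (`gx`, `gy`), the `THRESHOLD` value, and all function names are assumptions chosen for illustration. It also does not model 16-bit overflow, so it assumes the total water in the grid fits in 16 bits.

```python
N = 8                         # grid is N x N cells
EDGE_COUNT = 2 * N * (N - 1)  # 56 horizontal + 56 vertical edges = 112
THRESHOLD = 64                # hypothetical LED on/off threshold


def edge_flow(a, b, bias):
    """Signed flow across one edge from cell a toward cell b.

    A negative result means water moves from b to a. The 1/4 gain on the
    desired flow is an assumed value; the clamp to 25% of the source
    cell's height follows the description in the text.
    """
    desired = (a - b) + bias  # height difference plus gravity bias
    if desired > 0:
        return min(desired // 4, a // 4)      # source is a
    if desired < 0:
        return -min((-desired) // 4, b // 4)  # source is b
    return 0


def step(h, gx=0, gy=0):
    """One physics tick on an N x N grid of non-negative integer heights."""
    # Stage 1: compute every edge flow from the OLD heights only, which
    # mirrors registering all flows before any height changes.
    flow_h = [[edge_flow(h[r][c], h[r][c + 1], gx) for c in range(N - 1)]
              for r in range(N)]
    flow_v = [[edge_flow(h[r][c], h[r + 1][c], gy) for c in range(N)]
              for r in range(N - 1)]

    # Stage 2: apply all flows. Each flow is subtracted from one cell and
    # added to its neighbor, so the grid total is unchanged.
    new = [row[:] for row in h]
    for r in range(N):
        for c in range(N - 1):
            new[r][c] -= flow_h[r][c]
            new[r][c + 1] += flow_h[r][c]
    for r in range(N - 1):
        for c in range(N):
            new[r][c] -= flow_v[r][c]
            new[r + 1][c] += flow_v[r][c]
    return new


def framebuffer(h):
    """A pixel is lit whenever its cell's height exceeds the threshold."""
    return [[v > THRESHOLD for v in row] for row in h]
```

In this model, a single cell holding 1000 units with empty neighbors sends 250 units across each of its four edges in one tick, which is exactly the worst case the 25% clamp is designed to allow.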
How we built it
Building Flow was an iterative process that required rethinking both the physics and the hardware stack at nearly every stage. On the physics side, we initially implemented the FLIP (Fluid-Implicit-Particle) algorithm using CogniChip, but it quickly became clear that FLIP was the wrong fit: the algorithm is inherently sequential, and its resource utilization was far beyond what our board could handle. We redesigned around a cellular automaton model where every cell updates simultaneously, which maps almost perfectly onto FPGA hardware. On the accelerometer side, we started with the DE1-SoC provided at the hackathon, but its accelerometer communicates over I2C through the ARM HPS processor before reaching the FPGA fabric, a layer of complexity that proved infeasible to implement under the time constraint. We switched to the DE10-Lite, whose ADXL345 connects directly to the FPGA over SPI, a much cleaner interface. As the deadline approached, we pursued two parallel paths: a working demo on the DE1-SoC using hardcoded acceleration values, and a full implementation on the DE10-Lite with live accelerometer input. Throughout the process we compiled regularly in Quartus, treating each compilation as a checkpoint to catch timing and synthesis issues early and keep the codebase stable as the architecture evolved.
Challenges we ran into
One of our biggest challenges was the time cost of working with the FLIP-based implementation early on. Each compilation in Quartus took a significant amount of time, which made iteration slow and debugging painful, and in the end we concluded that the algorithm wasn't suited to the hardware at all. In addition, our team is relatively new to FPGA development, so Quartus presented a steep learning curve. Error messages were often difficult to understand, and the gap between writing Verilog and understanding why the synthesizer was rejecting it took considerable time to bridge. Debugging on an FPGA is fundamentally different from debugging software, and signal visibility is limited without dedicated instrumentation. We also had to scale back our ambitions on the display side. We originally intended to drive a 32×8 LED matrix for a much wider simulation grid, but hardware and timing constraints forced us to downscale to an 8×8 matrix to keep the project feasible within the hackathon timeframe. Working through these constraints as beginners meant that a lot of our time was spent building intuition for the toolchain rather than just implementing features, but that struggle ultimately gave us a much deeper appreciation for how hardware design differs from conventional programming.
Accomplishments that we're proud of
We're proud of building a real-time physics simulation that runs entirely in hardware, with all 64 cells updating in parallel every 3 clock cycles. What started as an ambitious idea quickly ran into walls: the FLIP algorithm was too slow, the DE1-SoC accelerometer stack was too complex, and Quartus was unforgiving. Working through each of those setbacks and arriving at a clean, working design is something we're genuinely proud of. The final cellular automaton engine is not a compromise. It's actually the more elegant and hardware-appropriate solution, and recognizing that mid-hackathon and having the conviction to restart was one of our best decisions.
What we learned
This hackathon gave us our first real exposure to digital logic fundamentals: how combinational and sequential logic differ, how pipelining moves data through registered stages, and how parallelism in hardware is expressed structurally rather than algorithmically. We learned to read, write, and compile Verilog, and to interpret Quartus synthesis errors well enough to fix them. More broadly, we came away understanding that FPGA development requires thinking about computation differently, not as a sequence of instructions but as a physical flow of data through logic. These fundamentals are the stepping stones to far more complex and exciting projects, and we're eager to keep building on them.
What's next for Flow
The most immediate next step is completing the full DE1-SoC implementation with live accelerometer input over I2C. Beyond that, we'd like to explore extending the neighborhood model from 4 neighbors to 8 by incorporating diagonal flows, which would make wave propagation more isotropic and visually realistic. On the display side, we're interested in adding PWM brightness control to the LED matrix so that cell height maps to brightness rather than a binary on/off threshold, giving the fluid much richer visual depth. Further out, we'd like to scale the simulation beyond 8×8 and finish the 32×8 grid we originally planned, which would still be well within the parallel advantage of an FPGA while producing a far more impressive visual result. Finally, we're curious about exploring more sophisticated physics models now that we have the foundational hardware intuition. Implementing surface tension, multiple fluid types, or even a hybrid particle-grid approach would be natural extensions of what we built here.
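As a rough illustration of the PWM idea, the sketch below maps a 16-bit height to an 8-bit duty value and compares it against a free-running counter. This is a Python model of one possible approach, not something we have built: the 8-bit resolution and the names `duty_from_height` and `pwm_pixel` are assumptions.

```python
PWM_BITS = 8  # assumed PWM resolution


def duty_from_height(height, bits=PWM_BITS):
    """Keep the top `bits` bits of a 16-bit height as the duty value."""
    return min(height, 0xFFFF) >> (16 - bits)


def pwm_pixel(duty, counter):
    """Pixel is on while the free-running counter is below the duty value."""
    return counter < duty


# Count how many of the 256 counter steps a half-full cell stays lit.
on_steps = sum(pwm_pixel(duty_from_height(0x8000), c) for c in range(256))
```

With this comparison, a duty value of 0 never lights the pixel, and the maximum duty value of 255 keeps it on for 255 of every 256 counter steps.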
Built With
- cognichip
- fpga
- quartus
- verilog