About enel (https://github.com/akshayakula/enel)

enel is a rapid 3D scene digitization system built for critical operations. Our goal was to turn a small swarm of low-cost cameras, including drone-mounted Raspberry Pis, into the front end of a field-deployable pipeline that captures real environments and converts them into 3D Gaussian Splatting (3DGS) assets and digital twins for simulation, planning, and analysis.

What inspired us

In national security and critical operations, teams often need an accurate understanding of a space before they can safely act in it. That usually means sending people into uncertain environments, relying on incomplete maps, or using expensive specialized hardware.

We wanted to explore a different model: what if a distributed swarm of lightweight cameras could quickly capture a scene, stream it back in real time, and feed a reconstruction pipeline that turns the physical world into a digital twin?

That idea led us to enel: a system for using networked Raspberry Pis and drones to collect synchronized video from multiple viewpoints, monitor the feeds live, and prepare them for downstream 3D reconstruction.

How we built it

We built enel as a multi-stage capture and reconstruction pipeline.

1. Multi-camera capture on Raspberry Pis

We used Raspberry Pi Zero 2 W boards with CSI cameras as lightweight video nodes. Each Pi captures video-only H.264 streams at roughly 1280x720 @ 30 FPS and publishes them over the local network to a central ingest server.

On the Pi side, we used:

  • rpicam-vid for camera capture
  • ffmpeg for stream packaging and transport
  • a small supervised publisher stack managed with systemd
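The Pi-side stack above can be sketched as a small Python supervisor: rpicam-vid writes raw H.264 to stdout, and ffmpeg repackages it as RTSP toward the central server without re-encoding. The host name, port, and stream path below are illustrative assumptions, not the exact values we deployed.

```python
# Sketch of a Pi-side publisher: rpicam-vid pipes H.264 into ffmpeg,
# which publishes it to the central MediaMTX server over RTSP.
import subprocess

def publisher_cmds(cam_path: str, server: str = "ingest.local", port: int = 8554):
    """Build the capture and publish command lines for one camera node."""
    capture = [
        "rpicam-vid", "-t", "0",             # run until stopped
        "--width", "1280", "--height", "720",
        "--framerate", "30",
        "--codec", "h264", "--inline",       # repeat SPS/PPS for mid-stream joins
        "--nopreview", "-o", "-",            # write the stream to stdout
    ]
    publish = [
        "ffmpeg", "-f", "h264", "-i", "-",   # read H.264 from stdin
        "-c", "copy",                        # no re-encode on the Pi Zero 2 W
        "-f", "rtsp", f"rtsp://{server}:{port}/{cam_path}",
    ]
    return capture, publish

def run_publisher(cam_path: str) -> subprocess.Popen:
    """Wire capture stdout into publish stdin, like `rpicam-vid ... | ffmpeg ...`."""
    cap_cmd, pub_cmd = publisher_cmds(cam_path)
    cap = subprocess.Popen(cap_cmd, stdout=subprocess.PIPE)
    return subprocess.Popen(pub_cmd, stdin=cap.stdout)
```

In practice this is the process a systemd unit would supervise and restart, one unit per camera path.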

This gave us a practical, low-cost camera node that could be placed around a room or attached to a drone.

2. Central media ingest and live monitoring

On the central ingest machine (a laptop), we ran MediaMTX as the media server. It handled:

  • multi-stream ingest
  • WebRTC-compatible playback
  • recording to disk
  • path management for cam1 through cam4
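In MediaMTX terms, the cam1 through cam4 paths with recording enabled look roughly like this mediamtx.yml fragment. This is a sketch: exact key names and the recording path pattern depend on the MediaMTX version in use.

```yaml
# Sketch of a MediaMTX paths section with per-path recording
paths:
  cam1:
    record: yes
    recordPath: ./recordings/%path/%Y-%m-%d_%H-%M-%S-%f
  cam2:
    record: yes
  cam3:
    record: yes
  cam4:
    record: yes
```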

We then built a lightweight Node.js dashboard that displayed all feeds simultaneously in a 2x2 live viewer. The dashboard let us monitor the scene in real time while the system was capturing from multiple viewpoints.

In practice, the architecture was:

  • Raspberry Pis / drone-mounted nodes publish video to MediaMTX
  • the dashboard consumes the streams through WebRTC/WHEP-style playback
  • MediaMTX records the feeds for later processing into reconstruction datasets

3. Video-to-reconstruction preprocessing

We quickly learned that raw video is not automatically a good reconstruction dataset. The main bottleneck was not GPU training, but camera pose estimation and photogrammetry preprocessing.

To address that, we built a preprocessing stage that:

  • extracts frames from recorded videos
  • performs basic frame quality control
  • reduces redundancy
  • prepares a cleaner COLMAP-ready or reconstruction-ready dataset
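The quality-control and redundancy steps above can be sketched as a frame-selection pass: score each extracted frame for sharpness (e.g. variance of the Laplacian), then keep only sharp frames spaced far enough apart. The scoring metric, threshold, and gap below are illustrative assumptions, not the exact values we tuned.

```python
# Sketch of frame selection: drop blurry frames, then thin near-duplicates
# so the remaining set is cleaner input for COLMAP.
def select_frames(scores, min_sharpness=100.0, min_gap=15):
    """Return indices of frames passing quality control and spacing.

    scores: per-frame sharpness values (higher = sharper).
    min_sharpness: frames scoring below this are treated as blurry.
    min_gap: minimum index distance between two kept frames.
    """
    kept = []
    for i, score in enumerate(scores):
        if score < min_sharpness:
            continue                        # quality control: drop blurry frames
        if kept and i - kept[-1] < min_gap:
            continue                        # redundancy: drop near-duplicates
        kept.append(i)
    return kept
```

Thinning the frame set this way also shrinks the pairwise matching work COLMAP has to do, which is where most of the preprocessing time goes.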

This let us test both local and cloud workflows for downstream 3D generation.

4. Cloud reconstruction experiments

We explored two downstream paths:

  • a self-managed COLMAP + Gaussian Splatting workflow on cloud GPUs
  • a faster managed API workflow using KIRI Engine 3DGS

The self-managed approach gave us more control, but also highlighted how fragile classic reconstruction can be when the input is just multi-view video. The managed 3DGS API path showed a promising direction for faster operational use.
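The self-managed path follows the standard COLMAP sequence of feature extraction, matching, and sparse mapping before splat training. A minimal sketch of those stage commands, with the workspace and image paths as assumptions:

```python
# Sketch of the COLMAP stages run over the selected frames.
def colmap_stages(image_dir: str = "frames", workspace: str = "colmap_ws"):
    """Build the COLMAP command lines for sparse reconstruction."""
    db = f"{workspace}/database.db"
    return [
        # 1. detect and describe features in every frame
        ["colmap", "feature_extractor",
         "--database_path", db, "--image_path", image_dir],
        # 2. match features across all frame pairs
        ["colmap", "exhaustive_matcher", "--database_path", db],
        # 3. recover camera poses and a sparse point cloud
        ["colmap", "mapper", "--database_path", db,
         "--image_path", image_dir, "--output_path", f"{workspace}/sparse"],
    ]
```

If any stage fails to register enough frames, the whole reconstruction degrades, which is the fragility we kept running into with multi-view video input.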

What we learned

We learned that the hardest part of this problem is not simply “train a 3D model.” The real challenge is building a capture system that produces reconstruction-friendly data under real-world conditions.

A few key lessons stood out:

  • COLMAP is often the bottleneck, not GPU training
  • multi-camera video is useful, but frame selection and quality control matter a lot
  • networking for low-latency multi-stream ingest is nontrivial
  • MediaMTX + WebRTC is a strong pattern for live monitoring and recording
  • low-cost edge hardware like Raspberry Pis can be surprisingly capable for distributed capture
  • for rough, fast outputs, managed 3DGS APIs may be more operationally practical than fully custom pipelines

We also learned a lot about designing systems for environments where bandwidth, hardware cost, and deployment simplicity matter just as much as reconstruction quality.

Challenges we faced

The biggest challenges were:

  • getting reliable multi-camera streaming over a local network
  • handling WebRTC/media-server configuration cleanly
  • deciding how to move from live feeds to reconstruction-ready assets
  • dealing with the fragility of classical photogrammetry pipelines
  • testing cloud GPU workflows that were sometimes unstable or expensive for quick iteration

Another challenge was balancing two very different needs:

  • live situational awareness now
  • high-quality 3D reconstruction later

We designed enel so that the same capture system can support both.

Why this matters

enel is not just a camera wall. It is a prototype for a broader idea: using distributed, low-cost sensing nodes and autonomous platforms like drones to create rapid digital twins of real environments.

For critical operations, that means teams could potentially:

  • capture a structure before entry
  • build a 3D representation for planning or simulation
  • preserve a scene for analysis
  • use commodity hardware instead of specialized scanning rigs

What’s next

Our next steps are:

  • improve synchronization across camera nodes
  • capture higher-quality stills or bursts specifically for splatting
  • automate cloud-side reconstruction
  • support better drone-swarm coordination
  • generate more reliable 3DGS outputs directly from multi-camera missions

enel started as a way to stream multiple Raspberry Pi feeds at once. It evolved into a system for turning those feeds into the foundation of a mission-relevant 3D digital twin pipeline.

Built With

  • mediamtx