Growing up in Romania, you couldn't miss the bear stories. They were on the news constantly — a hiker mauled on the Transfăgărășan, a bear rummaging through bins outside an apartment block in Brașov, tourists on the Transalpina pulling over to feed one through a car window like it was a roadside attraction. It felt normalized. It shouldn't be. Romania holds the largest brown bear population in Europe, and over the past decade that has translated into 26 fatalities and 274 recorded injuries. These aren't statistics from a distant wilderness — they're from roadsides, backyards, and city edges.

That discomfort stuck. And when we started looking beyond Romania, we realized the scale of the problem was much larger than we'd assumed. In the Yellowstone ecosystem, Alaska, and Montana, grizzly attacks have claimed 21 lives over the past 10 to 15 years. In Japan, the situation escalated dramatically — 13 deaths and 88 reported incidents in a single month in 2023 alone, with bears walking into cities and tearing through residential neighborhoods in search of food. The thread connecting all of these cases is the same: by the time anyone knew an animal was there, it was already too late.

That's where the idea came from. Not from a textbook or a project brief, but from watching a problem that felt solvable go unsolved for years.

What we built

Animal IDS is a real-time perimeter monitoring system that detects, classifies, tracks, and alerts on animal intrusions the moment they happen. A camera watches a defined area. When an animal enters the frame, the system identifies what it is, assigns it a persistent tracking ID, estimates how far away it is, maps its position on the perimeter, and fires an alert, all within seconds.

On the detection side, we used YOLOv8 for inference, which handles both detection and classification in a single pass and runs fast enough on consumer-grade GPU hardware to be genuinely deployable at the edge. ByteTrack sits on top of that and keeps a persistent identity on each animal across frames, so the system doesn't just see a bear — it sees bear number three, who has been in the north fence zone for eleven seconds and is moving east. ZoeDepth handles monocular distance estimation, meaning we get a real-world metre reading per animal without needing a stereo camera setup.
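The per-track bookkeeping described above can be sketched in plain Python. This is an illustrative sketch, not the project's actual code: `TrackState`, `update_track`, and the zone names are made up for the example, standing in for the state the system keeps against each ByteTrack ID.

```python
from dataclasses import dataclass

@dataclass
class TrackState:
    # Hypothetical per-animal state keyed by the tracker's persistent ID.
    species: str
    zone: str
    first_seen: float
    last_pos: tuple = (0.0, 0.0)

    def dwell_seconds(self, now: float) -> float:
        return now - self.first_seen

def update_track(tracks: dict, track_id: int, species: str,
                 zone: str, pos: tuple, now: float) -> TrackState:
    """Create or refresh the state for one tracked detection."""
    state = tracks.get(track_id)
    if state is None or state.zone != zone:
        # New animal, or it crossed into a different zone: restart the clock.
        state = TrackState(species, zone, first_seen=now)
        tracks[track_id] = state
    state.last_pos = pos
    return state

# Example: "bear number three, in the north fence zone for eleven seconds".
tracks = {}
update_track(tracks, 3, "bear", "north_fence", (12.0, 4.0), now=100.0)
state = update_track(tracks, 3, "bear", "north_fence", (13.5, 4.0), now=111.0)
print(f"{state.species} #3 in {state.zone} for {state.dwell_seconds(111.0):.0f}s")
# → bear #3 in north_fence for 11s
```

Keeping this state outside the detector is what lets the alerts say something richer than "animal detected": the persistent ID turns a stream of per-frame boxes into a history.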

The backend is built on FastAPI with PostgreSQL for persistent event logging and Redis for live state — active tracks, positions, alert queues. Celery handles async dispatch so that when an alert fires, the email, SMS, and webhook all go out without blocking the inference pipeline. The frontend is React with Leaflet for the map layer, giving a live annotated feed, a scrollable event log, and a perimeter map with pins showing exactly where each animal is right now.
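The key property of that dispatch design is that the inference loop only ever pays the cost of an enqueue. The real system uses Celery workers for this; the stdlib sketch below illustrates the same producer/consumer pattern with a thread and a queue, with the alert payload and channel names invented for the example.

```python
import queue
import threading

alert_queue: "queue.Queue" = queue.Queue()
sent = []  # stand-in for the email/SMS/webhook side effects

def dispatch_worker():
    # Drain alerts off the queue; a None sentinel shuts the worker down.
    while True:
        alert = alert_queue.get()
        if alert is None:
            break
        # In the real system this is where Celery tasks fan out to
        # email, SMS, and webhook channels.
        sent.append(f"ALERT: {alert['species']} in {alert['zone']}")
        alert_queue.task_done()

worker = threading.Thread(target=dispatch_worker, daemon=True)
worker.start()

# The inference loop never blocks on a network send, only on an enqueue.
alert_queue.put({"species": "bear", "zone": "north_fence"})
alert_queue.put(None)
worker.join()
print(sent)  # → ['ALERT: bear in north_fence']
```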

Everything runs in Docker, which means the same system can run on a Jetson device bolted to a fence post in rural Romania or on a cloud instance monitoring a national park remotely.

What we ran into

Depth estimation was the hardest part to get right. ZoeDepth produces good metric depth on clean outdoor scenes, but the moment lighting drops, fog comes in, or an animal is partially occluded by vegetation, the readings get noisy. We ended up sampling a central crop of each bounding box and taking the median rather than a single centre pixel, which smoothed things out considerably — but it's still a weak point in low-visibility conditions.
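The median-of-central-crop smoothing reduces to a few lines. A minimal sketch, assuming the depth map arrives as a 2-D grid of metric values; the crop fraction and the toy depth map are illustrative, not the tuned values from the system.

```python
from statistics import median

def box_depth(depth_map, box, crop_frac=0.5):
    """Estimate one animal's distance from a per-pixel depth map.

    depth_map: 2-D list of metric depths (rows of floats).
    box: (x1, y1, x2, y2) bounding box in pixel coordinates.
    Samples only the central crop of the box and returns the median,
    so background pixels at the box edges don't skew the reading.
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    pad_x = int(w * (1 - crop_frac) / 2)
    pad_y = int(h * (1 - crop_frac) / 2)
    samples = [
        depth_map[y][x]
        for y in range(y1 + pad_y, y2 - pad_y)
        for x in range(x1 + pad_x, x2 - pad_x)
    ]
    return median(samples)

# Toy depth map: animal at ~12 m, surrounded by background at 40 m.
depth_map = [[40.0] * 10 for _ in range(10)]
for y in range(3, 7):
    for x in range(3, 7):
        depth_map[y][x] = 12.0
print(box_depth(depth_map, (2, 2, 8, 8)))  # → 12.0
```

The median is the important choice: a single-pixel read at the box centre can land on a leaf or a fence wire, while the median of the central crop shrugs off a handful of outlier pixels.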

Calibrating the homography for perimeter mapping also required more care than expected. The math is straightforward but getting the ground-truth reference points measured accurately on an actual site, and keeping the transform stable as the camera shifts slightly over time, is a real engineering problem rather than a software one.
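The software half of that problem is small: once calibrated, mapping a pixel to the ground plane is one matrix product plus a perspective divide, and drift can be watched by reprojecting the surveyed reference points. A pure-Python sketch; the matrix `H` and the reference correspondences below are invented (a flat 10-pixels-per-metre scale) purely for illustration.

```python
def apply_homography(H, pt):
    """Map a pixel (u, v) to ground-plane coordinates via a 3x3 matrix H."""
    u, v = pt
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return (x / w, y / w)  # perspective divide

def reprojection_error(H, correspondences):
    """Mean distance (in world units) between surveyed points and where H
    currently maps their pixels. A cheap drift check: if this creeps up
    over time, the camera has shifted and H needs recalibrating."""
    total = 0.0
    for pixel, world in correspondences:
        mx, my = apply_homography(H, pixel)
        total += ((mx - world[0]) ** 2 + (my - world[1]) ** 2) ** 0.5
    return total / len(correspondences)

# Illustrative calibration: 10 pixels per metre, no perspective tilt.
H = [[0.1, 0.0, 0.0],
     [0.0, 0.1, 0.0],
     [0.0, 0.0, 1.0]]
refs = [((0, 0), (0.0, 0.0)), ((100, 0), (10.0, 0.0)),
        ((100, 100), (10.0, 10.0)), ((0, 100), (0.0, 10.0))]

print(apply_homography(H, (50, 50)))  # → (5.0, 5.0)
print(reprojection_error(H, refs))    # → 0.0
```

Running the reprojection check periodically turns "the camera shifted slightly over time" from a silent accuracy loss into an alert of its own.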

The alert cooldown logic was a subtle design challenge too. The pipeline runs at 25+ frames per second. Without throttling, a single bear in frame for thirty seconds would generate hundreds of database writes and notification sends. Getting the balance right between "fast enough to matter" and "not flooding the operator" took several iterations.
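The core of that throttling is a per-track cooldown window. A minimal sketch, with the 60-second window chosen for the example rather than taken from the deployed configuration:

```python
class AlertThrottle:
    """Per-track cooldown: at 25 fps, a bear in frame for 30 seconds would
    otherwise trigger hundreds of alerts; this caps it to one per window."""

    def __init__(self, cooldown_s: float = 60.0):
        self.cooldown_s = cooldown_s
        self._last_fired = {}  # track_id -> timestamp of the last alert

    def should_alert(self, track_id: int, now: float) -> bool:
        last = self._last_fired.get(track_id)
        if last is not None and now - last < self.cooldown_s:
            return False  # still inside the cooldown window
        self._last_fired[track_id] = now
        return True

# One bear (track 3) detected every frame for two minutes at 25 fps:
throttle = AlertThrottle(cooldown_s=60.0)
fired = sum(throttle.should_alert(3, t / 25.0) for t in range(25 * 120))
print(fired)  # → 2  (one alert at t=0, one at t=60 s; ~3000 frames suppressed)
```

Keying the cooldown on the tracker's persistent ID rather than on the whole frame matters: two different bears arriving within the same minute should still produce two alerts.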

What's next

The system right now is reactive — it tells you something is there. The natural next step is making it predictive. We have the event log infrastructure in place, which means we're already collecting the data needed to build temporal heatmaps of where and when animals appear most frequently. With enough sightings, you can start modeling risk windows — flagging that a specific zone typically sees activity between dusk and midnight, or that a particular entry corridor has been used repeatedly over multiple weeks.
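A first cut at those risk windows needs nothing beyond the event log itself: bucket sightings by zone and hour of day, and flag the buckets that recur. The sketch below is a hypothetical starting point, with the function name, threshold, and sample events all invented for illustration.

```python
from collections import Counter
from datetime import datetime

def risk_windows(events, min_count=2):
    """Bucket logged sightings by (zone, hour-of-day) and flag the
    combinations seen at least `min_count` times.

    events: iterable of (iso_timestamp, zone) pairs from the event log.
    """
    buckets = Counter(
        (zone, datetime.fromisoformat(ts).hour) for ts, zone in events
    )
    return {bucket: n for bucket, n in buckets.items() if n >= min_count}

# Illustrative event log: the north fence sees repeated dusk activity.
events = [
    ("2025-06-01T21:10:00", "north_fence"),
    ("2025-06-03T21:45:00", "north_fence"),
    ("2025-06-08T21:05:00", "north_fence"),
    ("2025-06-02T14:30:00", "east_gate"),
]
print(risk_windows(events))  # → {('north_fence', 21): 3}
```

With more data, the same aggregation generalizes from hour-of-day buckets to the dusk-to-midnight windows and repeated entry corridors described above.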

Beyond that, the architecture is intentionally modular. The CV engine, backend, and frontend are fully decoupled. Swapping the animal detection model for a human intrusion model, a fire detection model, or a vehicle model requires changing exactly one component. The alerting and mapping infrastructure stays the same. That was a deliberate choice, and it means the system has a much broader surface area of deployment than wildlife monitoring alone — agricultural fencing, conservation reserves, remote infrastructure protection.

We started this because bears on Romanian roadsides felt like a problem that shouldn't still be a problem in 2025. It turns out the same gap exists on every continent, and the tooling to close it has only recently become accessible enough to build outside of a well-funded research lab. That's the window we're in.
