INSPIRATION
─────────────────────────────────────────────────────────────────────────────
This problem is not unknown. It is ignored.
Delhi-NCR records over 14,000 cases of crimes against women every year — the highest of any metropolitan region in India. That number has been reported, analysed, condemned, and published annually for over a decade. Every time an incident emerges, the response follows the same pattern: investigation after the fact, statements from officials, calls for better infrastructure. The cameras that recorded the incident become evidence. They were never intelligence.
I kept returning to one question while building SafeTrace: why does the reasoning layer not exist between the camera signal and the response? The cameras are already deployed across Delhi-NCR — industrial zones, isolated stretches, parking complexes, transit hubs, markets, residential corridors. The data is already flowing. Elasticsearch is already capable of detecting the pattern in that data, correlating it geospatially, comparing it against ninety days of zone-specific history, and producing a dispatch recommendation with a written justification — in the time it takes a control room operator to notice an alert on a screen.
The gap is not hardware. It is not data volume. It is not compute. The gap is that nobody built the reasoning layer. SafeTrace is that layer. The inspiration was not an idea — it was a recognition that the components to build this have existed for years, Elasticsearch made them composable for the first time, and the cost of not building it is measured in incidents that should not have happened.
WHAT IT DOES
─────────────────────────────────────────────────────────────────────────────
SafeTrace monitors fifty cameras across six Delhi-NCR zone types — industrial,
isolated, parking, transit, market, and residential — each feeding structured
events into an Elasticsearch safety_events index in real time. Every event
carries a camera ID, zone type, geographic coordinates, risk score from the
edge vision model, gender counts, an alone flag, an SOS gesture flag, and a
timestamp. These are not logs. They are timestamped intelligence signals
waiting to be reasoned about.
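A minimal sketch of one such document, using the fields named above (the exact field names beyond risk_score, gesture_sos, and zone_type are my assumptions, not confirmed from the codebase):

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class SafetyEvent:
    """One structured signal from the edge vision model, as indexed into safety_events."""
    camera_id: str
    zone_type: str      # industrial | isolated | parking | transit | market | residential
    location: dict      # geo_point: {"lat": ..., "lon": ...}
    risk_score: float   # 0-100, produced by the edge vision model
    male_count: int
    female_count: int
    is_alone: bool
    gesture_sos: bool
    timestamp: str      # ISO-8601, UTC

# Hypothetical example event from an isolated-zone camera
event = SafetyEvent(
    camera_id="CAM-ISO-07",
    zone_type="isolated",
    location={"lat": 28.6139, "lon": 77.2090},
    risk_score=74.5,
    male_count=3,
    female_count=1,
    is_alone=False,
    gesture_sos=False,
    timestamp=datetime.now(timezone.utc).isoformat(),
)
doc = asdict(event)  # shape ready for indexing into safety_events
```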
Elastic Watcher polls the safety_events index on a sub-minute schedule,
running a query that fires when risk_score > 70 or gesture_sos == true
within the last sixty seconds. When the condition is met, Watcher's Painless
transform extracts the highest-risk hit from that window and posts it as a
structured JSON payload to SafeTrace's backend webhook. This is the detection
layer. It does not reason. It watches, detects, and hands off — which is
exactly what it is designed to do.
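The trigger condition above can be sketched as a bool query of the kind a Watcher search input would carry (field names follow the event schema described earlier; the deployed watch JSON may differ):

```python
# Sketch of the Watcher condition: fire when, within the last 60 seconds,
# any event has risk_score > 70 OR gesture_sos == true.
watch_condition_query = {
    "bool": {
        "filter": [
            {"range": {"timestamp": {"gte": "now-60s"}}}  # sliding 60s window
        ],
        "should": [
            {"range": {"risk_score": {"gt": 70}}},
            {"term": {"gesture_sos": True}},
        ],
        "minimum_should_match": 1,  # either clause alone is enough to fire
    }
}
```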
The reasoning happens in Elastic Agent Builder. The agent receives the alert
context and begins a three-tool investigation. The first tool,
fetch-zone-historical-baseline, queries ninety days of historical data in the
alert_log index to compute the zone's escalation rate — how often past alerts
in this zone became confirmed incidents. The second tool,
correlate-adjacent-cameras, runs an ST_DISTANCE() geospatial query against
the camera_registry index to find every camera within five hundred metres,
placing them on elevated alert as corroborating sensors. The third tool,
calculate-composite-risk, runs an ES|QL EVAL chain that multiplies the raw
risk score by four factors: the zone's base risk multiplier, the surrounding
gender ratio signal, the historical amplifier derived from Tool 1's escalation
rate, and a night-hour penalty when the event occurs between 20:00 and 06:00.
The final score and threat level come out of Elasticsearch — not out of
application logic.
When gesture_sos is true, the agent skips all three tools entirely. A human
distress signal is not a datapoint to be weighed against a composite formula.
The system prompt encodes this as an architectural constraint: gesture_sos
produces threat_level = CRITICAL and final_score = 100 immediately. Every
other path through the agent ends with a written agent_reasoning field —
one to three sentences that name the specific escalation rate and composite
score that drove the dispatch decision. For a law enforcement application,
every patrol dispatch must be traceable. SafeTrace makes traceability a
property of the data layer, not an afterthought.
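The SOS bypass described above reduces to a short-circuit before any tool runs. A minimal sketch (the function and field names are illustrative, not SafeTrace's actual code):

```python
def decide(event: dict, run_investigation) -> dict:
    """SOS bypass: gesture_sos short-circuits the three-tool investigation
    entirely, exactly as the system prompt mandates. Every other path goes
    through the full investigation (Tools 1-3 plus composite scoring)."""
    if event.get("gesture_sos"):
        # Human distress signal: no weighing, no formula, immediate CRITICAL.
        return {"threat_level": "CRITICAL", "final_score": 100}
    return run_investigation(event)
```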
The Svelte 5 dashboard renders all fifty cameras as Google Maps Advanced Markers colour-coded by current threat level. When an alert fires, the agent trace panel replays each investigation step with staggered animation — tool calls, results, reasoning, dispatch — so a control room operator can see not just that a patrol was dispatched, but why. A patrol unit then animates along stored road waypoints from the nearest police station to the camera location. The alert card shows the agent's written justification, the adjacent cameras activated, the composite score breakdown, and outcome controls for the operator to acknowledge and resolve.
HOW I BUILT IT
─────────────────────────────────────────────────────────────────────────────
The system is built on three Elasticsearch indexes with distinct structural
roles. The safety_events index is a geo-point time-series store — every
document has a location field mapped as geo_point, enabling native
geospatial queries without a separate service. The alert_log index carries
both live alert records and ninety days of pre-seeded historical data flagged
with is_historical: true, giving the historical baseline query statistical
density from day one. The camera_registry index is a static geographic
registry storing each camera's coordinates, zone metadata, patrol waypoints,
and the coordinates of the nearest police station — a single geo lookup
replacing what would otherwise be a relational join across services.
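A sketch of the mapping that makes the geo-point role concrete (field list abbreviated; only `location: geo_point` is the structurally essential part):

```python
# Abbreviated safety_events mapping: the geo_point type on "location" is what
# lets ST_DISTANCE() run natively, with no separate geo-service.
safety_events_mapping = {
    "mappings": {
        "properties": {
            "camera_id":   {"type": "keyword"},
            "zone_type":   {"type": "keyword"},
            "location":    {"type": "geo_point"},  # enables native geospatial queries
            "risk_score":  {"type": "float"},
            "gesture_sos": {"type": "boolean"},
            "timestamp":   {"type": "date"},
        }
    }
}
```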
The Agent Builder tool chain is where Elasticsearch's query primitives do the
analytical work directly. The historical baseline tool queries alert_log
with a COUNT_IF aggregation to compute escalation_rate = incident_count /
total_alerts per zone — a genuine per-zone base rate measured across 180,000
seeded documents. The adjacent camera tool issues a single query:
ST_DISTANCE(location, TO_GEOPOINT(?wkt)) <= 500, sorted by distance,
limited to five. No Haversine library. No separate geo-service. One query
clause against a geo_point field. The composite risk tool runs an ES|QL
EVAL chain entirely inside Elasticsearch:
final_score = risk_score
    × zone_risk_multiplier        ← from camera_registry zone profile
    × surrounding_ratio_penalty   ← computed from male/female counts
    × historical_amplifier        ← derived from Tool 1 escalation_rate
    × night_hour_penalty          ← applied when hour_of_day ≥ 20 or < 6
| EVAL threat_level = CASE(
    final_score >= 88, "CRITICAL",
    final_score >= 70, "HIGH",
    final_score >= 50, "MEDIUM",
    "LOW"
  )
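The same arithmetic, reproduced in Python purely to check the numbers used later in this writeup (zone, ratio, and night multipliers are left neutral at 1.0 in the example; only the historical amplifier varies):

```python
def historical_amplifier(escalation_rate: float) -> float:
    # Formula from the challenges section: neutral at 0% history, 1.32x at 40%.
    return 1.0 + escalation_rate * 0.8

def composite_score(risk_score, zone_mult=1.0, ratio_penalty=1.0,
                    escalation_rate=0.0, night_penalty=1.0):
    # Mirrors the ES|QL EVAL chain; in SafeTrace this runs inside Elasticsearch.
    return (risk_score * zone_mult * ratio_penalty
            * historical_amplifier(escalation_rate) * night_penalty)

def threat_level(final_score: float) -> str:
    # Thresholds from the CASE expression.
    if final_score >= 88:
        return "CRITICAL"
    if final_score >= 70:
        return "HIGH"
    if final_score >= 50:
        return "MEDIUM"
    return "LOW"
```

With a raw score of 72, a zero-history zone stays at 72 (HIGH) while a 40%-escalation zone amplifies to roughly 95 (CRITICAL) — the worked example the writeup relies on.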
The intelligence is inside Elasticsearch. The Python backend receives a typed result, not a number it has to reason about.
Elastic Watcher is configured with a Painless transform that suppresses alert bursts: when multiple cameras in a window exceed the threshold, the transform extracts the single highest-scoring hit and posts one payload to the webhook. This is not a filter; it is a deliberate architectural choice that prevents Agent Builder from receiving simultaneous invocations for the same event window. Watcher's job is to detect and hand off exactly once per meaningful window.
The Python FastAPI backend is typed end-to-end with Pydantic models that match
Zod schemas on the Svelte 5 frontend. The response_parser.py module handles
Kibana's actual /converse response shape, which differs enough from the
documented shape to require a custom normalisation layer. The backend also
includes a synthetic event
generator with three modes — normal, anomaly, and SOS — that produce
calibrated safety_events documents for demo purposes, with score ranges
tuned to each mode to consistently trigger the correct agent paths.
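A sketch of the mode calibration idea (the score bands here are illustrative assumptions, not the generator's actual tuned ranges):

```python
import random

# Hypothetical score bands per mode, tuned so each mode reliably takes the
# intended agent path: "anomaly" always clears the risk_score > 70 trigger,
# "normal" never does, and only "sos" raises the distress flag.
MODE_BANDS = {"normal": (10, 55), "anomaly": (72, 95), "sos": (60, 90)}

def synth_event(mode: str, rng: random.Random) -> dict:
    lo, hi = MODE_BANDS[mode]
    return {
        "risk_score": rng.uniform(lo, hi),
        "gesture_sos": mode == "sos",
    }
```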
The agent setup is fully automated: setup_runner.py registers all three
ES|QL tools and the agent against Kibana's API idempotently, and
setup_watcher.py registers the Elastic Watcher definition. The entire
backend can be reproduced from python -m agent.setup_runner followed by
python -m scripts.setup_watcher.
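The idempotency pattern behind that setup can be sketched transport-agnostically (the actual Kibana routes and payloads are not shown here, so the HTTP layer is injected; this is the register-or-update shape, not SafeTrace's literal code):

```python
def ensure_registered(name: str, desired: dict, get_existing, put) -> str:
    """Idempotent upsert: write only when the stored definition differs,
    so re-running the setup script is always safe."""
    current = get_existing(name)
    if current == desired:
        return "unchanged"
    put(name, desired)
    return "updated" if current is not None else "created"
```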
CHALLENGES I RAN INTO
─────────────────────────────────────────────────────────────────────────────
The deepest integration challenge was Kibana's actual /converse response
shape. The documented expectation was a flat array of typed step objects.
The production response was structurally different: tool call steps embedded
their results as a nested results[] array inside the same object, rather
than appearing as separate tool_result entries in the step sequence. The
frontend's trace panel replay depended on receiving discrete, typed steps —
tool_call, tool_result, reasoning, decision, dispatch — in order. A flat
pass-through of the Kibana steps array would have produced a broken trace
animation.
The fix was a custom normalisation layer in response_parser.py:
map_kibana_steps_to_safetrace_trace_steps() iterates the raw Kibana steps,
detects tool_call objects with a populated results[] array, and emits
them as two sequential steps — a tool_call step followed by a synthesised
tool_result step extracted from results[0].data. The decision and dispatch
steps, which Kibana does not emit at all, are synthesised from the agent's
JSON summary. This was not a documentation gap I worked around — it was a
production API behaviour I had to understand, instrument, and solve. It taught
me to treat Elastic's APIs as production systems with real response complexity,
not demo endpoints.
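The splitting logic described above, as a minimal sketch (key names like `tool_id` and the non-tool pass-through are my assumptions about the shape, not verified against the real payload):

```python
def map_kibana_steps_to_safetrace_trace_steps(raw_steps: list) -> list:
    """Normalise Kibana's /converse steps: a tool_call step that embeds a
    results[] array is emitted as two sequential steps - the tool_call
    followed by a synthesised tool_result taken from results[0].data."""
    out = []
    for step in raw_steps:
        if step.get("type") == "tool_call":
            out.append({"type": "tool_call", "tool": step.get("tool_id")})
            results = step.get("results") or []
            if results:
                out.append({"type": "tool_result", "data": results[0].get("data")})
        else:
            out.append(step)  # reasoning and other steps pass through unchanged
    return out
```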
The second challenge was the historical_amplifier formula design. The
amplifier needed to produce meaningful score differentiation between zones with
high and low escalation histories without making low-baseline zones structurally
incapable of reaching dispatch threshold. The formula I settled on uses
1.0 + (escalation_rate × 0.8) as the amplifier multiplier, which means a
zone with zero historical escalation applies a neutral 1.0× multiplier while
a zone with 40% escalation history applies 1.32×. A raw score of 72 in a
low-baseline zone produces a composite of approximately 72. The same score in
a high-baseline zone produces approximately 95. The formula is not
stress-tested across every edge case at this stage — what is confirmed is that
it avoids the two failure modes: inflating low-signal zones to dispatch, and
suppressing valid signals in historically quiet zones.
The third challenge was the Watcher-to-Agent-Builder invocation chain. Getting
Watcher's webhook action to correctly authenticate against SafeTrace's backend
and having the backend correctly pass the Painless-rendered Mustache payload to
the Agent Builder /converse endpoint required coordinating three
authentication contexts simultaneously: the Elasticsearch API key for Watcher,
the backend's own X-API-Key middleware, and the Kibana API key for Agent
Builder. Each context is correct in isolation. Making them compose correctly
end-to-end — with the Watcher token exempt from the API key middleware while
the Kibana key is injected by the agent invoker — required explicit
architectural decisions about which authentication belongs at which layer.
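The layering decision can be sketched as a single routing check (path and header names here are illustrative placeholders, not SafeTrace's actual routes):

```python
WATCHER_WEBHOOK_PATH = "/webhook/watcher"  # illustrative path

def check_request_auth(path: str, headers: dict,
                       backend_key: str, watcher_token: str) -> bool:
    """Sketch of the auth layering: the Watcher webhook is exempt from the
    X-API-Key middleware and validates its own shared token instead; every
    other backend route requires the backend's API key. The Kibana key is
    held by the agent invoker and never appears in inbound auth at all."""
    if path == WATCHER_WEBHOOK_PATH:
        return headers.get("Authorization") == f"Bearer {watcher_token}"
    return headers.get("X-API-Key") == backend_key
```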
ACCOMPLISHMENTS I'M PROUD OF
─────────────────────────────────────────────────────────────────────────────
The accomplishment I am most confident in is the composite risk formula that
makes ninety days of historical zone data change real dispatch outcomes. A rule
engine fires on risk_score > 70. SafeTrace fires differently on risk_score =
72 depending on the zone's measured escalation history. These are not the same
system. One responds to a threshold. The other responds to context. The
distinction is provable with specific numbers: the same raw signal produces a
composite of 72 in a low-baseline zone and 95 in a high-baseline zone. The
dispatch decision is different. The written justification is different. The
zone history that produced the difference lives in Elasticsearch and is
queryable. That is not a feature — it is the entire architectural argument for
building this on Elasticsearch rather than a rules platform.
I am also proud of how ST_DISTANCE() resolved what would otherwise have been
a multi-service problem. Correlating adjacent cameras within five hundred metres
as corroborating evidence is a production-grade geospatial operation. Without
native geo support, this would require a separate service, a Haversine
computation layer, and a consistency boundary between them. The query is one
clause. The result is a typed list. The architectural surface area of the
entire SafeTrace system is smaller because Elasticsearch treats geo as a
first-class query primitive — and that simplification is only available because
I built on Elastic's stack.
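For concreteness, the one-clause query might look something like this as an ES|QL statement (a sketch assembled from the pieces named in this writeup — index name, ST_DISTANCE, the 500 m radius, the limit of five — not a verbatim copy of the deployed tool):

```python
# Adjacent-camera correlation as a single ES|QL statement (sketch).
ADJACENT_CAMERAS_ESQL = """
FROM camera_registry
| EVAL dist = ST_DISTANCE(location, TO_GEOPOINT(?wkt))
| WHERE dist <= 500
| SORT dist ASC
| LIMIT 5
"""
```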
The Watcher-as-orchestrator separation is an accomplishment I could not have articulated before this project. Watcher watches data. Agent Builder reasons about data. Making that separation explicit — encoding it as a system boundary rather than a convention — is what makes SafeTrace maintainable. Every patrol dispatch carries a written justification traceable to specific Elasticsearch queries. That is auditable by design, not by accident. For a law enforcement application, that distinction is the difference between a tool that can be deployed and one that cannot.
WHAT I LEARNED
─────────────────────────────────────────────────────────────────────────────
SafeTrace taught me things about Elasticsearch that I could not have learned from any other project, because the specific combination of real-time time-series data, geospatial correlation, and historical intelligence amplification does not appear in tutorials or documentation examples.
The most significant learning was ES|QL as a real-time intelligence layer. I
had used ES|QL for aggregations before. SafeTrace was the first time I used it
for multi-dimensional sliding window queries — gender ratio cross-referenced
with time-of-day, zone type, and location in a single statement. The EVAL
chain in Tool 3 — multiplying four factors inside Elasticsearch and receiving
a typed threat_level as output — changed how I think about where computation
belongs in a real-time system. Computation that lives inside Elasticsearch
travels with the data. It does not cross a network boundary. It does not depend
on application state. For a real-time safety system where latency between
signal and dispatch matters, keeping the scoring logic inside the data layer
is not an optimisation — it is a correctness requirement.
Elastic's native geo capabilities changed my architecture instincts. Before
SafeTrace, I would have reached for a separate geo-service for any
production-grade geospatial operation. ST_DISTANCE() inside an agent tool
replaced that service entirely. The lesson is not that ST_DISTANCE() exists —
it is that Elasticsearch's geo support is mature enough to be the sole
geospatial layer in a system that needs to make dispatch decisions under time
pressure. I now think about Elasticsearch's field types — geo_point,
semantic_text, date — as architectural decisions, not just storage choices.
Elastic Watcher taught me a mental model I will carry into every event-driven system I build. Watcher is not a monitoring tool. It is an event-driven trigger layer with temporal awareness built in. The correct mental model is: Watcher watches data, Agent reasons about data, and the boundary between them is a deliberate architectural constraint. Collapsing both into the agent — having the agent poll its own index on a timer — would make both functions harder to tune, debug, and audit independently. This hackathon taught me that separation of concerns in AI systems is not just a software engineering principle — it is what makes those systems explainable to the people who depend on them.
The Kibana /converse API depth was the most technically honest learning
moment. Production APIs have response shapes that differ from documented
expectations. steps[].results[] embedded inside tool_call objects was not
in any example I found before hitting it. Writing response_parser.py to
handle the actual response shape — not the expected one — taught me to build
against production API behaviour from the start, not against documented
examples. That habit will make every Elastic integration I build more robust.
WHAT'S NEXT FOR SAFETRACE
─────────────────────────────────────────────────────────────────────────────
SafeTrace is architecturally complete. The intelligence layer — ingestion, detection, historical amplification, geospatial correlation, composite scoring, Agent Builder reasoning, patrol dispatch, audit trail — is built, typed, tested, and deployable. What is not built is the vision layer: real-time gender classification, lone-person detection, and SOS gesture recognition from a live camera feed. That is a computer vision pipeline. It is not Elasticsearch's responsibility, and it is not SafeTrace's responsibility. SafeTrace is the intelligence layer that consumes structured signals from that pipeline. The boundary is correct. Swapping a synthetic event generator for a real vision inference output requires no changes to Elasticsearch, the agent, or the dispatch logic.
The next step is integration conversations. I intend to reach out to Delhi Police, NDMC, and state government smart city initiatives to understand where SafeTrace fits within their existing CCTV infrastructure and what an integration path looks like practically. This is not a pitch — it is a technical conversation about whether the vision layer can be connected, what data governance constraints apply, and what a pilot deployment would require.
Long-term, I want to expand coverage beyond Delhi-NCR to other Indian cities with existing smart city infrastructure, integrate real-time PCR van GPS tracking into Elasticsearch for precise patrol ETA calculations, and build a multilingual alert interface for control room operators who work in Hindi and regional languages rather than English.
I will maintain this project because the problem does not stop. Incidents emerge. Control rooms respond after the fact. The gap between signal and response is not a technical unsolved problem — the technical solution exists and is running. The remaining work is institutional. SafeTrace exists to close that gap, and I intend to continue closing it for as long as the gap remains.
Built With
- elasticsearch-ai-agent
- es|ql
- fastapi
- python
- svelte5
- typescript