FieldFix — Autonomous Site Ops Agent for Remote Field Teams

Inspiration

Remote infrastructure failures — in telecom sites, solar farms, agricultural sensors, and disaster-response equipment — cause costly delays because diagnosing and coordinating repairs requires specialized knowledge, parts lookup, and scheduling across disparate systems. We wanted to build an agent that acts like a seasoned field ops lead: it observes, reasons, runs safe diagnostics, and executes low-risk actions so humans can focus on high-value decisions.

What it does

FieldFix ingests photos, logs, and telemetry, then uses multimodal reasoning to diagnose likely root causes and produce a prioritized action plan. It can:

Run sandboxed diagnostic scripts against telemetry or simulated environments.

Reserve or order replacement parts from inventory/catalog APIs.

Propose and book technician time slots and notify stakeholders (SMS/email).

Persist incident memory (past fixes, manuals, configuration) for faster, contextual decisions.

Human confirmation gates exist for high-cost or safety-sensitive actions.
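A confirmation gate of the kind described above could look roughly like the following. This is a minimal sketch, not the project's actual code: `ProposedAction`, `needs_confirmation`, `execute`, and the 200 USD limit are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical cost limit; the writeup does not state a real threshold.
COST_LIMIT_USD = 200.0


@dataclass(frozen=True)
class ProposedAction:
    name: str
    estimated_cost_usd: float
    safety_sensitive: bool = False


def needs_confirmation(action: ProposedAction, cost_limit: float = COST_LIMIT_USD) -> bool:
    """An action is gated if it is safety-sensitive or exceeds the cost limit."""
    return action.safety_sensitive or action.estimated_cost_usd > cost_limit


def execute(action: ProposedAction, confirm: Callable[[ProposedAction], bool]) -> str:
    """Run low-risk actions directly; ask a human before gated ones."""
    if needs_confirmation(action) and not confirm(action):
        return "rejected"
    return "executed"


# Low-risk action runs without asking; a costly one waits on the operator.
cheap = ProposedAction("reserve_fuse", estimated_cost_usd=12.0)
costly = ProposedAction("order_inverter", estimated_cost_usd=1500.0)
print(execute(cheap, confirm=lambda a: False))   # never consults confirm
print(execute(costly, confirm=lambda a: False))  # operator declined
```

Keeping the gating rule in one small pure function makes it easy to audit and to test separately from the agent loop.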

How we built it

Models & agent runtime: Amazon Bedrock / SageMaker + AgentCore primitives (memory, gateway, code interpreter).

Integrations: Lambda & API Gateway wrappers for telemetry, inventory/order, scheduler, and notifications.

Storage & observability: DynamoDB for incidents/inventory, S3 for artifacts, CloudWatch for logs/metrics.

Frontend: lightweight web/mobile UI to upload images/logs and view decision traces.

We wired tools into the agent via a Gateway so the LLM can plan and execute composite actions end-to-end.
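To illustrate the idea of tools exposed behind a single gateway with a reviewable trace, here is a small in-process sketch. It does not use the AgentCore Gateway API; `ToolRegistry`, `run_plan`, and the two stub tools are hypothetical stand-ins for the Lambda-backed integrations.

```python
import json
from typing import Any, Callable, Dict, List, Tuple

ToolFn = Callable[..., Dict[str, Any]]


class ToolRegistry:
    """Maps tool names to callables and records every call for later review."""

    def __init__(self) -> None:
        self._tools: Dict[str, ToolFn] = {}
        self.trace: List[Dict[str, Any]] = []

    def register(self, name: str, fn: ToolFn) -> None:
        self._tools[name] = fn

    def call(self, name: str, **kwargs: Any) -> Dict[str, Any]:
        if name in self._tools:
            result = self._tools[name](**kwargs)
        else:
            result = {"ok": False, "error": f"unknown tool: {name}"}
        self.trace.append({"tool": name, "args": kwargs, "result": result})
        return result


def run_plan(registry: ToolRegistry, plan: List[Tuple[str, Dict[str, Any]]]) -> List[Dict[str, Any]]:
    """Execute steps in order and stop at the first failing step."""
    results: List[Dict[str, Any]] = []
    for name, args in plan:
        result = registry.call(name, **args)
        results.append(result)
        if not result.get("ok", False):
            break
    return results


# Hypothetical stub tools standing in for the inventory and scheduler APIs.
def check_inventory(part_id: str) -> Dict[str, Any]:
    stock = {"fan-120mm": 3}
    return {"ok": True, "part_id": part_id, "in_stock": stock.get(part_id, 0)}


def book_technician(site_id: str, slot: str) -> Dict[str, Any]:
    return {"ok": True, "site_id": site_id, "slot": slot}


registry = ToolRegistry()
registry.register("check_inventory", check_inventory)
registry.register("book_technician", book_technician)

results = run_plan(registry, [
    ("check_inventory", {"part_id": "fan-120mm"}),
    ("book_technician", {"site_id": "site-7", "slot": "2025-01-01T09:00"}),
])
print(json.dumps(registry.trace, indent=2))
```

In this sketch the trace is just a list of dictionaries; in a deployed system the same records would more plausibly go to a log store such as CloudWatch.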

Challenges we ran into

Multimodal noise: field photos vary widely, so we needed robust pre-processing and fallback prompts that ask the user for clarifying input.

Safe execution: running diagnostics required strict sandboxing and tightly scoped IAM policies to avoid unintended actions.

Tool orchestration: designing deterministic tool responses and clear API contracts so the agent could reliably chain actions.

Budget & latency: balancing demo realism with limited compute credits required careful simulation of some integrations.
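As a rough illustration of the safe-execution challenge above, the sketch below runs a diagnostic script in a separate Python process with a timeout and a throwaway working directory. This is only process isolation, not a real sandbox: it does not restrict network, filesystem, or memory access, which in practice would need containers, a managed code interpreter, or similar. `run_diagnostic` and `DiagnosticResult` are hypothetical names.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass


@dataclass(frozen=True)
class DiagnosticResult:
    stdout: str
    stderr: str
    returncode: int  # -1 when the run timed out
    timed_out: bool


def run_diagnostic(script: str, timeout_s: float = 5.0) -> DiagnosticResult:
    """Run `script` in a child interpreter that is killed after `timeout_s`."""
    with tempfile.TemporaryDirectory() as workdir:
        try:
            proc = subprocess.run(
                # -I: isolated mode, ignores PYTHON* env vars and user site-packages
                [sys.executable, "-I", "-c", script],
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return DiagnosticResult("", "timed out", -1, True)
    return DiagnosticResult(proc.stdout, proc.stderr, proc.returncode, False)


result = run_diagnostic("print(1 + 1)")
print(result.stdout.strip(), result.returncode)
```

Returning a structured result instead of raising lets the agent treat a timeout as one more observation to reason about.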

Accomplishments that we're proud of

End-to-end demo: autonomous flow from image + log → diagnosis → sandbox diagnostic → parts order → booking confirmation.

Contextual memory: in demos, the agent used past incident data to revise its diagnosis and avoid unnecessary replacements.

Transparent traces: decision reasoning and tool calls were auditable, making the agent’s actions explainable to judges and operators.

Minimal human prompts: agent completed multi-step incidents with just an initial upload and a single confirmation for sensitive steps.
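The contextual-memory behavior listed above can be sketched as a lookup that prefers a previously successful non-replacement fix for the same site and symptom. The data model and the matching rule here are assumptions for illustration; the writeup does not describe how the real memory is queried.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PastIncident:
    site_id: str
    symptom: str
    fix: str
    resolved: bool
    replaced_part: bool


def recommend_fix(site_id: str, symptom: str, default_fix: str,
                  memory: List[PastIncident]) -> str:
    """Prefer the most recent fix that resolved this symptom without a replacement."""
    for incident in reversed(memory):  # memory is ordered oldest to newest
        if (incident.site_id == site_id
                and incident.symptom == symptom
                and incident.resolved
                and not incident.replaced_part):
            return incident.fix
    return default_fix


# Hypothetical history: an earlier overheating alarm was fixed by cleaning a filter.
memory = [
    PastIncident("site-7", "overheat_alarm", "clean_intake_filter", True, False),
    PastIncident("site-9", "overheat_alarm", "replace_fan", True, True),
]
print(recommend_fix("site-7", "overheat_alarm", "replace_fan", memory))
print(recommend_fix("site-9", "overheat_alarm", "replace_fan", memory))
```

For site-7 the remembered filter cleaning wins over the default replacement, while site-9 has no cheaper precedent and falls back to the default.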

What we learned

Explicit tool contracts and deterministic outputs are crucial for reliable autonomy.

Multimodal LLMs are powerful but need structured prompts and fallback clarifying questions for real-world variability.

Human-in-the-loop confirmations are essential for safety and operator trust.

Building a useful agent is as much about integration and observability as it is about model reasoning.
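One way to picture an explicit tool contract with deterministic output is a validated response type that always serializes the same way. The field names and allowed statuses below are hypothetical, not the project's actual schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict

# Hypothetical status vocabulary for illustration.
ALLOWED_STATUS = {"ok", "error", "needs_confirmation"}


@dataclass(frozen=True)
class ToolResponse:
    tool: str
    status: str
    data: Dict[str, Any] = field(default_factory=dict)
    error: str = ""

    def __post_init__(self) -> None:
        # Reject responses the agent would not know how to interpret.
        if self.status not in ALLOWED_STATUS:
            raise ValueError(f"unknown status: {self.status}")
        if self.status == "error" and not self.error:
            raise ValueError("error responses must include a message")

    def to_json(self) -> str:
        # Sorted keys and fixed separators give byte-identical output
        # for equal payloads, regardless of dict insertion order.
        return json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))


a = ToolResponse("inventory", "ok", {"part": "fan-120mm", "in_stock": 3})
b = ToolResponse("inventory", "ok", {"in_stock": 3, "part": "fan-120mm"})
print(a.to_json() == b.to_json())
```

Validating at construction time means a malformed tool result fails at the integration boundary rather than surfacing later as a confusing agent decision.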

What's next for FieldFix — Autonomous Site Ops Agent for Remote Field Teams

Improve vision robustness with domain-specific augmentation and a lightweight on-device prefilter.

Add predictive maintenance: combine telemetry trends and inventory analytics to pre-order parts before failures.

Expand integrations: real vendor ordering APIs, spare-parts logistics tracking, and on-call rota sync.

Deploy a pilot with a small partner (telecom or solar operator) to collect real incident data and measure MTTR improvements in production.
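The predictive-maintenance item above could start from something as simple as a linear trend on a telemetry series. The sketch below fits a least-squares line to evenly spaced readings, estimates how many sampling intervals remain before a failure threshold is crossed, and compares that with a part's lead time. The function names, the threshold, and the sample readings are all hypothetical, and a real system would likely need a more robust model.

```python
from typing import List, Optional


def steps_until_threshold(readings: List[float], threshold: float) -> Optional[float]:
    """Estimate intervals until a rising linear trend reaches `threshold`.

    Returns None if there are too few readings or the trend is not rising.
    """
    n = len(readings)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(readings) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(readings))
    var = sum((x - mean_x) ** 2 for x in range(n))
    slope = cov / var
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    crossing_x = (threshold - intercept) / slope
    # Measure from the latest sample; 0.0 means the threshold is already reached.
    return max(0.0, crossing_x - (n - 1))


def should_preorder(readings: List[float], threshold: float, lead_time_steps: float) -> bool:
    """Pre-order when the predicted crossing falls within the part's lead time."""
    remaining = steps_until_threshold(readings, threshold)
    return remaining is not None and remaining <= lead_time_steps


# Hypothetical hourly temperature readings trending toward a limit of 20.
temps = [10.0, 12.0, 14.0, 16.0]
print(steps_until_threshold(temps, threshold=20.0))
print(should_preorder(temps, threshold=20.0, lead_time_steps=3))
```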

Built With

amazon
docker
fastapi
lambda
opencv
python
pytorch
react
s3
tailwind
transformers
typescript

Updates

Anurag Singh started this project — Sep 19, 2025 01:29 PM EDT
