RCP Anomaly Detection — Automated ILI Alignment, Matching, and Growth Analytics

Inspiration

Pipelines are lifelines. Yet, inspection data collected years apart rarely lines up: odometer drift, differing references, and evolving corrosion make apples‑to‑apples comparisons hard. Engineers still match defects by eye—slow, costly, and inconsistent. We were inspired to automate the way experts think: align runs, match the same physical defects, quantify growth, and surface new or missing anomalies so teams can act faster and safer.

What it does

  • Aligns two inspection runs (e.g., 2007→2015 and 2015→2022) to a common coordinate system, correcting distance drift.
  • Matches the same physical defects across runs with confidence bands (High/Med/Low).
  • Computes growth metrics (depth, length, width), normalized by time between inspections.
  • Flags new defects, missing defects, and uncertain matches for review.
  • Delivers an interactive, single‑page dashboard (no iframes) and an iframe version for stakeholder sharing.

How we built it

The end‑to‑end workflow is orchestrated in ili_eda.ipynb and produces web dashboards and a presentation.

Data preparation and alignment

  • Standardize units and feature types; normalize clock positions and handle missing fields.
  • Use anchors (girth welds, valves, fittings) to correct drift and map distances to a common axis.
  • Linear drift correction (conceptual): $$x_2^{*} = a\,x_2 + b$$ Parameters $(a,b)$ are estimated to minimize anchor residuals. For segments needing more flexibility, piecewise or non‑linear transforms can be applied.
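The linear correction above can be sketched with an ordinary least-squares fit over anchor pairs. The anchor distances below are made up for illustration; the project's real anchors come from the weld/fitting tables:

```python
import numpy as np

# Hypothetical anchor distances: the same girth welds observed in both runs.
# x1: anchor odometer readings in Run 1 (the reference axis)
# x2: the same anchors' readings in Run 2 (the drifted axis)
x1 = np.array([100.0, 550.0, 1020.0, 1490.0, 2010.0])
x2 = np.array([101.5, 556.0, 1031.0, 1505.0, 2030.5])

# Least-squares fit of x1 ≈ a*x2 + b minimizes the anchor residuals
a, b = np.polyfit(x2, x1, deg=1)

def correct(x):
    """Map a Run-2 distance onto the Run-1 axis."""
    return a * x + b

residuals = x1 - correct(x2)
print(f"a={a:.5f}, b={b:.3f}, max |residual|={np.abs(residuals).max():.3f}")
```

With clean anchors the residuals should be well under the matching distance window; large residuals signal that a piecewise or non-linear transform is needed for that segment.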

Anomaly matching and confidence

  • For each defect in Run 2, search candidate defects in Run 1 within a distance window; compare corrected distance, clock, type compatibility, and size.
  • Example similarity score (conceptual): $$S = w_d \exp\!\left(-\tfrac{|x_1 - x_2^{*}|}{\tau_d}\right) + w_c \cos(\Delta\theta) + w_s \exp\!\left(-\tfrac{\lVert s_1 - s_2 \rVert}{\tau_s}\right)$$
  • Assign best matches; flag uncertain/new/missing. Confidence bands are thresholded (e.g., High: $c\ge0.8$, Med: $c\ge0.5$, Low otherwise).
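A minimal scoring sketch of the similarity formula and the confidence bands above. The weights, decay scales, and sizes are illustrative placeholders, not the project's tuned values:

```python
import math

# Illustrative weights and decay scales (tau); not tuned project values
W_D, W_C, W_S = 0.5, 0.2, 0.3
TAU_D, TAU_S = 0.5, 0.25

def similarity(x1, x2_star, theta1, theta2, size1, size2):
    """Score a Run-1/Run-2 candidate pair.

    x1, x2_star : axial distances (Run 2 already drift-corrected)
    theta1/2    : clock positions in radians
    size1/2     : (length, width, depth) tuples
    """
    dist_term = W_D * math.exp(-abs(x1 - x2_star) / TAU_D)
    clock_term = W_C * math.cos(theta1 - theta2)
    size_gap = math.dist(size1, size2)  # Euclidean norm of the size difference
    size_term = W_S * math.exp(-size_gap / TAU_S)
    return dist_term + clock_term + size_term

def band(c):
    """Threshold a confidence score into High/Med/Low."""
    return "High" if c >= 0.8 else "Med" if c >= 0.5 else "Low"

# A perfectly coincident, identically sized pair scores W_D + W_C + W_S = 1.0
s = similarity(100.0, 100.0, 0.0, 0.0, (1.0, 0.5, 0.1), (1.0, 0.5, 0.1))
```

In practice the weights would be chosen so that distance dominates, with clock and size breaking ties between nearby candidates.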

Growth analytics

  • For matched defects, compute growth normalized by elapsed time $\Delta t$: $$g_{\text{depth}} = \frac{d_2 - d_1}{\Delta t},\quad g_{\text{length}} = \frac{\ell_2 - \ell_1}{\Delta t},\quad g_{\text{width}} = \frac{w_2 - w_1}{\Delta t}$$
  • Highlight fastest‑growing anomalies for prioritization.
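The growth computation is a straightforward vectorized pass over the matched-pairs table; column names and values below are illustrative:

```python
import pandas as pd

# Hypothetical matched-pairs table (suffix _1 = earlier run, _2 = later run)
matched = pd.DataFrame({
    "depth_1": [0.10, 0.20], "depth_2": [0.18, 0.22],
    "length_1": [1.0, 2.0],  "length_2": [1.4, 2.1],
    "width_1": [0.5, 0.8],   "width_2": [0.6, 0.8],
})
dt_years = 8.0  # elapsed time between inspections, e.g., 2007 -> 2015

# g = (later - earlier) / elapsed time, per the formulas above
for dim in ("depth", "length", "width"):
    matched[f"g_{dim}"] = (matched[f"{dim}_2"] - matched[f"{dim}_1"]) / dt_years

# Rank by depth growth to surface the fastest-growing anomalies first
prioritized = matched.sort_values("g_depth", ascending=False)
```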

Interactive dashboards and outputs

  • Interactive metrics (depth, length, width) with web‑side histogram bin toggles (20/30/50/80).
  • Separate color scales per scatter; neat, aligned layout.
  • Alignment UI shows full matched pairs (no sampling), confidence filters, and show/hide unmatched.
  • Outputs:
    • Inline single‑page dashboard: outputs/dashboard_all_in_one_inline.html
    • Iframe consolidated dashboard: outputs/dashboard_all_in_one.html
    • Alignment pages: outputs/ili_alignment_2007_to_2015.html, outputs/ili_alignment_2015_to_2022.html
    • Winter‑themed deck: outputs/presentation_company_overview_winter_v2.pptx

Challenges we ran into

  • Cross‑run drift and inconsistent reference reporting required robust alignment and careful anchor selection.
  • Varying column names and missing fields caused runtime issues; we added resilient column selection and defaults (e.g., confidence fallback).
  • Inline embedding initially failed due to notebook scope and a confidence mapping bug—fixed with local helper definition, safe Series mapping, and robust distance column heuristics.
  • Iframe path visibility concerns were avoided by shipping a self‑contained single‑page dashboard.
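The resilient column selection mentioned above can be sketched as a small fallback helper; the candidate names and defaults are hypothetical, not the project's exact heuristics:

```python
import pandas as pd

# Hypothetical candidate names for the distance column across vendor formats
DIST_CANDIDATES = ["log_dist_ft", "odometer", "distance", "abs_distance"]

def pick_column(df: pd.DataFrame, candidates, default=None):
    """Return the first matching column, else a constant fallback Series."""
    for name in candidates:
        if name in df.columns:
            return df[name]
    # Broadcast the default so downstream code keeps working unchanged
    return pd.Series(default, index=df.index)

df = pd.DataFrame({"odometer": [10.0, 20.0], "depth_pct": [12, 30]})
dist = pick_column(df, DIST_CANDIDATES)        # finds "odometer"
conf = pick_column(df, ["confidence"], "Low")  # fallback confidence band
```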

Accomplishments that we're proud of

  • Full matched‑pair alignment visualizations with confidence filters and unmatched toggles.
  • A polished, single‑page interactive dashboard that stakeholders can use without setup.
  • Clear, AI‑style summaries embedded into the experience to accelerate understanding.
  • A winter‑themed executive deck suitable for company demos.

What we learned

  • Alignment across inspections is as much data hygiene as math—unit/format normalization matters.
  • Modular separation (alignment → matching → growth → UI) keeps the system understandable and testable.
  • Small UI details (bin toggles, per‑plot color bars, tidy layout) dramatically improve analyst speed.
  • Build for resilience: default confidence bands, flexible column selection, and a self‑contained inline page prevent demo‑time surprises.

What's next for RCP Anomaly Detection

  • A validation dataset will let us strengthen the pipeline and benchmark matching accuracy.
  • Today’s approach is largely rule‑based due to limited data; with past trends and labels, we’ll evolve to an ML matching pipeline (learned similarity + confidence estimation).
  • Add unmatched counts in subtitles; consider tabbed layouts and richer filtering.
  • Publish internally with SSO and automate refresh with each new ILI run.
  • Scale to multiple segments and add health scoring for risk‑based prioritization.

Built with

  • Languages & Libraries: Python 3.13, Pandas, Plotly (graph_objects/express), python‑pptx
  • Environment & Tools: Jupyter Notebook (ili_eda.ipynb), VS Code, virtualenv on macOS
  • UI/Controls: Web‑side Plotly buttons (bin toggles), confidence/unmatched filters
  • Optional AI: OpenAI summaries (with local fallback)
  • Data: CSVs per run (final results, matches/new/missing/aligned)

Try it locally (optional)

# Open the single-file dashboard
open /Users/satvikaakati/Desktop/Hack/outputs/dashboard_all_in_one_inline.html

# Open the winter-themed presentation
open /Users/satvikaakati/Desktop/Hack/outputs/presentation_company_overview_winter_v2.pptx
