Inspiration

The inspiration for POLARIS came from the frustration of reading through unreliable online location reviews (ads, off-topic rants, or complaints from people who never visited the place). We wanted a transparent and privacy-preserving way to enforce clear policies locally, without relying on cloud moderation. By combining a small local LLM with strict formatting and CSV tooling, POLARIS shows how clear rules and practical design can restore trust in location reviews while remaining easy for anyone to run and extend.


What it does

  • Takes a location and a review as input and outputs:
    • Decision: Valid or Flagged
    • Primary Violation: No Advertisement, No Irrelevant Content, No Rant Without Visit (or blank if valid)
    • Explanation: 1–2 sentence rationale
  • Works in two modes:
    • Interactive: type a single review
    • Batch: process a CSV and output an *_evaluated.csv file
  • Runs locally using a small LLM via Ollama + LangChain.

How we built it

1) Local LLM + policy-aligned prompt

  • OllamaLLM(model="llama3.2") with a single prompt that describes the three policies and a fixed, 3-line output format.
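A minimal sketch of what such a policy prompt might look like (the exact wording here is illustrative, not the project's actual prompt; the fixed 3-line contract is what the downstream parser relies on):

```python
# Illustrative policy prompt (assumed wording). The three policies and the
# fixed 3-line output format mirror what the document describes.
POLICY_PROMPT = """You are a moderator for location reviews. Policies:
1. No Advertisement: promotional content or links.
2. No Irrelevant Content: text unrelated to the location.
3. No Rant Without Visit: complaints from someone who never visited.

Respond in EXACTLY three lines:
Decision: Valid or Flagged
Primary Violation: <one policy name, or blank if Valid>
Explanation: <1-2 sentence rationale>

Location: {location}
Review: {review}
"""

# Filling the template is a plain str.format before it is handed to the model.
filled = POLICY_PROMPT.format(location="Joe's Diner", review="Great pancakes!")
```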

2) A tiny, composable chain

  • chain = prompt | model keeps the runtime simple and makes the model/prompt swappable.
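The `|` operator builds a LangChain Runnable pipeline; the composability it buys can be mimicked in a few lines of plain Python, which shows why swapping the prompt or model is a one-line change. This is a conceptual stand-in, not LangChain's implementation:

```python
class Step:
    """Toy stand-in for a LangChain Runnable: wraps a function and supports
    `a | b` composition, where the output of a feeds the input of b."""
    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        return Step(lambda x: other.fn(self.fn(x)))

    def invoke(self, x):
        return self.fn(x)

# Hypothetical prompt/model stand-ins: format the inputs, then "generate".
prompt = Step(lambda d: f"Location: {d['location']}\nReview: {d['review']}")
model = Step(lambda text: "Decision: Valid\nPrimary Violation:\nExplanation: On-topic.")

chain = prompt | model  # same shape as the real `prompt | model`
result = chain.invoke({"location": "Cafe X", "review": "Nice espresso."})
```

Because each side only needs to be callable, replacing `model` with a different LLM (or `prompt` with a new policy text) leaves the rest of the pipeline untouched.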

3) CSV-friendly plumbing (auto column detection)

  • _normalize canonicalizes header names; _guess_columns tries common variants (e.g., place, text, comment).
  • Falls back to manual selection if auto-detection fails. Uses csv.DictReader with UTF-8-SIG handling.

4) Strict but forgiving output parsing

  • _parse_model_output extracts Decision, Primary Violation, Explanation.
  • Missing fields default to empty strings so the pipeline doesn’t crash.

5) Single-review wrapper

  • evaluate(location, review) invokes the chain and returns (decision, violation, explanation, raw_output).
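The wrapper can be sketched like this; here the chain is passed in as a parameter so a stub can stand in for the real `prompt | model` chain (the actual `evaluate` likely closes over a module-level chain instead, and delegates parsing to `_parse_model_output`):

```python
def evaluate(chain, location: str, review: str):
    """Invoke the chain and return (decision, violation, explanation, raw_output).
    `chain` is anything exposing .invoke(dict) -> str, so the real LangChain
    chain and a test stub plug in interchangeably."""
    raw = chain.invoke({"location": location, "review": review})
    fields = {}
    for line in raw.splitlines():  # minimal inline parse of the 3-line reply
        key, _, value = line.partition(":")
        fields[key.strip().lower()] = value.strip()
    return (fields.get("decision", ""),
            fields.get("primary violation", ""),
            fields.get("explanation", ""),
            raw)

class StubChain:
    """Hypothetical stand-in that returns a fixed, well-formed reply."""
    def invoke(self, inputs):
        return "Decision: Valid\nPrimary Violation:\nExplanation: On-topic review."
```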

6) Batch pipeline (CSV input and CSV output)

  • process_csv(path) loops rows, evaluates each review, and appends new columns to *_evaluated.csv.
  • Skips empty review cells but still writes the row.
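The batch loop might be shaped roughly like this (fixed column names and the `evaluate_fn` parameter are simplifications; the real script auto-detects columns and calls its own `evaluate`):

```python
import csv
import os

def process_csv(path, evaluate_fn, location_col="location", review_col="review"):
    """Read `path`, evaluate each non-empty review, and write every row plus
    three new columns to <name>_evaluated.csv. Empty review cells skip the
    model call but the row is still written, with the new columns blank."""
    base, ext = os.path.splitext(path)
    out_path = f"{base}_evaluated{ext}"
    with open(path, newline="", encoding="utf-8-sig") as f_in, \
         open(out_path, "w", newline="", encoding="utf-8") as f_out:
        reader = csv.DictReader(f_in)
        fieldnames = reader.fieldnames + ["Decision", "Primary Violation", "Explanation"]
        writer = csv.DictWriter(f_out, fieldnames=fieldnames)
        writer.writeheader()
        for row in reader:
            review = (row.get(review_col) or "").strip()
            if review:
                decision, violation, explanation = evaluate_fn(
                    row.get(location_col, ""), review)
                row.update({"Decision": decision,
                            "Primary Violation": violation,
                            "Explanation": explanation})
            writer.writerow(row)  # DictWriter fills missing columns with ""
    return out_path
```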

Challenges we ran into

  • LLM formatting drift: occasional deviations from the 3-line format; mitigated with a stricter parser.
  • Ambiguity in real reviews: borderline cases between “irrelevant” and “rant without visit” require clearer policy text.

Accomplishments that we’re proud of

  • A working local prototype with interactive and batch modes.
  • A single, readable prompt and a small amount of code that others can run.
  • A mismatch report workflow (in the evaluator script) to help iterate on the prompt/policy.

What we learned

  • Clear, concrete policy wording matters more than clever prompting.
  • A forgiving parser is essential when relying on LLM-formatted text.
  • Simple ergonomics (progress logs, column guessing) reduce friction during testing.

What’s next for POLARIS

  • Stricter outputs: switch the prompt to JSON-only and parse with json.loads (with a text fallback).
  • Confidence/abstain: add a low-confidence path that flags uncertain rows for human review.
  • Policy refinement: add short borderline examples to reduce confusion between violation types.
  • Lightweight UI: add a simple desktop/web interface with interactive review entry, drag-and-drop CSVs, progress/status, and one-click export of the evaluated CSV.
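The “JSON-only with text fallback” idea above could look roughly like this (the schema and key names are assumptions):

```python
import json

def parse_json_or_text(raw: str):
    """Prefer a JSON object like
    {"decision": ..., "violation": ..., "explanation": ...};
    fall back to the current line-based 3-line format if the model drifts."""
    try:
        data = json.loads(raw)
        return (str(data.get("decision", "")),
                str(data.get("violation", "")),
                str(data.get("explanation", "")))
    except (json.JSONDecodeError, AttributeError):
        # AttributeError covers valid JSON that is not an object (e.g. "42").
        fields = {"decision": "", "primary violation": "", "explanation": ""}
        for line in raw.splitlines():
            key, _, value = line.partition(":")
            key = key.strip().lower()
            if key in fields and not fields[key]:
                fields[key] = value.strip()
        return fields["decision"], fields["primary violation"], fields["explanation"]
```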

Built With

  • Python
  • LangChain
  • Ollama (llama3.2)