Inspiration

Strategic planning often stalls between symbolic intuition (vision, values, archetypes) and operational execution (OKRs, sprints, risk controls). YiFlow tests whether open-weight LLMs can compile culturally meaningful archetypes into structured, reviewable action plans while running entirely offline for privacy and accessibility.

What it does

Given a selected hexagram (e.g., 坎 Kǎn — Risk / Uncertainty / Trial & Error) and a user goal, YiFlow outputs four artifacts:

OKR — one Objective with measurable Key Results (KR1…KRn).

14-day sprint — day-by-day plan with owners and acceptance notes.

Risk register — Risk × Probability × Impact × Mitigation.

Stakeholders & cadence — roles, channels, and comms schedule.
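The four artifacts above can be pictured as one structured object. This is a minimal sketch; the field names and shapes are assumptions for illustration, not the project's exact output format:

```python
# Illustrative plan structure (field names are assumptions).
plan = {
    "okr": {"objective": "…", "key_results": ["KR1", "KR2"]},
    "sprint": [{"day": 1, "task": "…", "owner": "…", "acceptance": "…"}],
    "risks": [{"risk": "…", "probability": "M", "impact": "H", "mitigation": "…"}],
    "stakeholders": [{"role": "…", "channel": "…", "cadence": "weekly"}],
}
```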

Two delivery modes

Local agent (offline): Python + Hugging Face Transformers, loading gpt-oss-20b from disk (no network calls).

Web demo (frontend-only): static UI to preview the same artifact layouts for demos (no model inference).

Language/UI rule: hexagram names stay in Chinese (pinyin — English gloss); all other UI and outputs are in English.

How we built it

Model & loading (offline)

Local files only via Transformers:

from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained(
    MODEL_DIR, local_files_only=True, trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_DIR, subfolder="original", local_files_only=True, trust_remote_code=True
)

Offline runtime (no hub lookups):

HF_HUB_OFFLINE=1

TRANSFORMERS_OFFLINE=1

(Windows PowerShell) $env:HF_HUB_OFFLINE='1'; $env:TRANSFORMERS_OFFLINE='1'
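The same flags can also be set from Python, as long as this runs before transformers is imported:

```python
import os

# Set offline flags before importing transformers so no hub lookups occur.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"
```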

Repo layout expectations (for local weights)

MODEL_DIR/
    config.json               # includes: "model_type": "gpt_oss"
    original/                 # weight shards (e.g., *.safetensors or *.bin)
    tokenizer.json / vocab.   # tokenizer files used by the model

If your tokenizer relies on SentencePiece, include sentencepiece in requirements; otherwise not needed.
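Because missing files fail only at load time, a preflight check can save a slow restart. This is a hypothetical helper (not part of the project) that validates the layout above:

```python
from pathlib import Path

# Hypothetical preflight check: verify the expected local files exist
# before attempting an offline load. Returns a list of missing pieces;
# an empty list means the layout looks complete.
def check_model_dir(model_dir):
    root = Path(model_dir)
    missing = []
    if not (root / "config.json").is_file():
        missing.append("config.json")
    shards = list((root / "original").glob("*.safetensors")) + \
             list((root / "original").glob("*.bin"))
    if not shards:
        missing.append("original/ weight shards")
    return missing
```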

Prompt design

Instruction-first prompts that bind hexagram metadata to fixed schemas (OKR / Sprint / Risk / Stakeholders) to force consistently structured sections.

Lightweight controls (e.g., temperature, top-p) and a deterministic seed for reproducibility when desired.
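A schema-bound prompt of this kind might look like the sketch below. The section headers and template wording are assumptions for illustration, not the project's exact prompts:

```python
# Illustrative prompt template binding hexagram metadata to the four
# fixed output schemas (template text is an assumption).
PROMPT = """You are a strategic planner.
Hexagram: {name} ({pinyin} — {gloss})
Goal: {goal}

Produce exactly four sections:
## OKR
## 14-Day Sprint
## Risk Register
## Stakeholders & Cadence
"""

prompt = PROMPT.format(
    name="坎", pinyin="Kǎn", gloss="Risk / Uncertainty / Trial & Error",
    goal="Launch a pilot program",
)
```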

Runtime

CPU/GPU compatible via PyTorch + Transformers.

Environment variables guard against accidental HTTP calls.

Simple generation API (max tokens, stop sequences for section boundaries).
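One way to enforce section boundaries is to trim generated text at the first stop sequence. A minimal post-processing sketch, with illustrative stop strings rather than the project's actual markers:

```python
# Cut generated text at the earliest stop sequence so each artifact
# section stays within its boundary (stop strings are illustrative).
def truncate_at_stop(text, stops=("## END", "\n\n## ")):
    cut = len(text)
    for s in stops:
        i = text.find(s)
        if i != -1:
            cut = min(cut, i)
    return text[:cut]
```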

UI

Local agent: Gradio app (app.py) to enter goal + hexagram, preview artifacts, and export.

Web demo: React + Tailwind static page rendering the same four artifact blocks; no inference, purely for showcasing the format.

Challenges we ran into

Weight layout & configs: many users only place original/ weight shards; Transformers also expects a valid root config.json (with "model_type": "gpt_oss"). Setting subfolder="original" prevents “unrecognized model” errors.

Strict offline: once hub access is disabled, missing tokenizer files or hidden dependencies surface immediately. We documented required files and environment flags to keep runs fully local.

Output reliability: unconstrained models drift; schema-bound prompts plus basic decoding constraints improved consistency without external tooling.

Accomplishments we’re proud of

A repeatable compiler from symbolic archetypes to execution-ready artifacts (OKR, sprint, risks, comms) — not just inspirational copy.

A privacy-preserving workflow that runs entirely offline on commodity Windows machines.

A clean bilingual UX: Chinese hexagram names preserved with pinyin and English gloss for international reviewers.

What we learned

Open-weight LLMs (e.g., gpt-oss-20b) can be steered to structured planning when prompts encode both domain templates (OKR/sprint/risk) and cultural priors (hexagram attributes).

Robust offline operation is more about file integrity and loader args than raw compute — small path or config mistakes cause outsized failures.

What’s next for YiFlow

Domain packs: swap hexagram metadata for sector playbooks (health ops, disaster response, climate action).

Evidence-aware mode: optional local RAG (on-device docs); responses add source snippets and confidence notes.

Human-in-the-loop editing: interactive KR calibration and risk scoring; Markdown and ICS export are already supported.

Evaluation: user studies on plan quality vs baselines; ablations on prompt components.
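For the ICS export mentioned above, a sprint day maps naturally to a calendar event. A minimal sketch, assuming a hypothetical helper and illustrative field values (a complete file would also need VCALENDAR wrapping and a UID per RFC 5545):

```python
# Minimal sketch of a single sprint-day ICS event (values illustrative).
def ics_event(summary, date_yyyymmdd):
    return "\r\n".join([
        "BEGIN:VEVENT",
        f"DTSTART;VALUE=DATE:{date_yyyymmdd}",
        f"SUMMARY:{summary}",
        "END:VEVENT",
    ])
```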

Built with (current prototype)

Python, PyTorch, Transformers (Hugging Face)

Gradio (local agent UI)

React + Tailwind CSS (static demo UI)

(If tokenizer requires it) sentencepiece

(Optional) accelerate for device placement convenience

GPU use is optional; install the PyTorch build that matches your system if you plan to use CUDA.

Built With

  • accelerate
  • gradio (≥4.31)
  • huggingface-hub (offline mode)
  • ics
  • lucide-react (icons)
  • markdown export
  • python
  • react
  • tailwind-css
  • torch
  • transformers (≥4.46)
  • windows (cpu/gpu)