jac migrate — project description
Inspiration
Jac programs persist graphs in local SQLite storage as pickled anchor blobs. When you evolve a schema—adding new has fields to node or edge types—older rows still unpickle, but the restored archetype instances may not have those attributes, so code that reads the new fields can fail at runtime (for example AttributeError). That gap is familiar from SQL migrations, but here the “rows” are live archetype objects inside pickles, not flat columns. We wanted a first-party workflow so developers can discover drift, generate migration stubs, and apply upgrades without hand-editing the database or rewriting history.
What it does
The jac migrate command (registered under the CLI project group) supports three actions:
- status: Resolves the project's SQLite file from jac.toml and the filesystem (including <name>.db, an optional shelf_db_path, anchor_store.db, a single .jac/data/*.db, or the legacy ~/.jac/data/<name>.db), then prints whether it exists, lists .jac/migrations/*.py, and shows applied vs. pending migrations using .jac/migrate_applied.txt.
- generate: Parses the project's .jac sources with JacProgram, builds the current node/edge has map from the AST, primes the runtime so pickles that reference __main__ archetypes can load, scans the anchors table, unpickles the blobs, and diffs declared fields against fields seen on instances. For missing fields it proposes fills (literals, or null for "empty" defaults). It writes a numbered Python migration under .jac/migrations/ (for example 0001_auto_Item.py) containing MIGRATION_ID and AUTO_FILLS (JSON embedded in the file). There is no separate jacmigrate.toml spec file in this iteration.
- apply: Loads pending migration modules (skipping IDs already listed in migrate_applied.txt), backs up the .db to a timestamped .bak copy by default, then for each anchor row: unpickle, setattr missing fields from AUTO_FILLS, write the blob back. -d/--dry_run reports how many rows would change without writing; -B/--no_backup skips the file copy. Applied migration IDs are appended to migrate_applied.txt (not to a jac_migrations table or a spec hash).
Fields without a literal initializer are still filled with None in the generated stub so you can edit AUTO_FILLS before apply if you need a real default.
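For illustration, a generated stub might look roughly like this. This is a sketch, not actual generator output: the Item archetype and its price/tags fields are hypothetical, and the real layout produced by generate may differ.

```python
# .jac/migrations/0001_auto_Item.py  (illustrative sketch of a generated stub)
import json

MIGRATION_ID = "0001_auto_Item"

# Per-archetype fills for fields missing on old pickled instances.
# Literal defaults are proposed as values; non-literal defaults become
# null (None) and should be edited by hand before `jac migrate apply`.
AUTO_FILLS = json.loads("""
{
  "Item": {
    "price": 0.0,
    "tags": null
  }
}
""")
```

Because the fills are embedded as JSON, the stub stays trivially reviewable in a diff before it ever touches the database.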
How we built it
- Schema + DB diff: jaclang.migrate.engine uses JacProgram.parse_str and walks UniTree Archetype nodes to collect has names (with defaults where they are literals), merging schemas across files. scan_db_field_usage reads anchors.data, unpickles, and unions attribute names on NodeAnchor/EdgeAnchor archetypes.
- Unpickle context: prime_unpickle_context uses the project's entry point from jac.toml (then other *.jac candidates) with proc_file + Jac.jac_import(..., override_name="__main__") so __main__.Item-style pickles resolve during scan and apply.
- CLI glue: migrate.jac registers the command; migrate.impl.jac calls jaclang.migrate.engine.run_cli and returns the exit code.
- Supporting runtime tweak: JacRuntime.base_path_dir defaults to None so persistence is not accidentally anchored to an unrelated cwd when jac run is invoked from another directory, keeping the DB location aligned with what migrate resolves.
- Tests: The intended story is covered by a small demo project (v1 seed → v2 failure → generate/apply → v2 success); automated jac test integration tests are a natural follow-up.
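The apply step's read-modify-write loop can be sketched in plain Python. This is a simplified stand-in, not the engine's actual code: the anchors table schema, the Item class, and the apply_fills helper are assumptions standing in for the real NodeAnchor/EdgeAnchor handling.

```python
import pickle
import sqlite3

class Item:
    """Stand-in for an old pickled archetype that lacks a newly declared field."""
    def __init__(self, name):
        self.name = name

# Fills as they would be read from a migration module's AUTO_FILLS.
AUTO_FILLS = {"Item": {"price": 0.0}}

def apply_fills(conn, dry_run=False):
    """For each anchor blob: unpickle, setattr missing fields, write back."""
    changed = 0
    for rowid, blob in conn.execute("SELECT rowid, data FROM anchors").fetchall():
        obj = pickle.loads(blob)
        fills = AUTO_FILLS.get(type(obj).__name__, {})
        missing = {k: v for k, v in fills.items() if not hasattr(obj, k)}
        if not missing:
            continue  # row already has every declared field
        for field, value in missing.items():
            setattr(obj, field, value)
        changed += 1
        if not dry_run:  # dry-run counts rows without writing
            conn.execute("UPDATE anchors SET data = ? WHERE rowid = ?",
                         (pickle.dumps(obj), rowid))
    return changed

# Demo on an in-memory DB seeded with one "v1" Item that predates `price`.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE anchors (data BLOB)")
conn.execute("INSERT INTO anchors (data) VALUES (?)", (pickle.dumps(Item("apple")),))
n = apply_fills(conn)
restored = pickle.loads(conn.execute("SELECT data FROM anchors").fetchone()[0])
```

The real tool wraps this loop with the timestamped .bak backup and the migrate_applied.txt bookkeeping described above.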
Challenges we ran into
- Where is the DB? Vanilla Jac vs. plugins (e.g. jac-scale) may use migrate-demo-issue.db, anchor_store.db, or a configured path; migrate had to implement a clear resolution order with helpful error hints.
- Pickles need the right module context: without importing the project's Jac, pickle.loads could succeed but the archetypes would not match, or resolution would fail outright; priming __main__ from the entry file was essential.
- Defaults in the AST: only literal defaults are carried into AUTO_FILLS automatically; everything else becomes null in the JSON until the developer edits the generated script.
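That literal-or-null rule can be illustrated with Python's own ast module standing in for the Jac AST (the real engine walks UniTree nodes, and the field names here are made up):

```python
import ast
import json

def default_to_fill(default_src):
    """Return a JSON-safe fill for a default expression: its value when the
    expression is a literal, otherwise None (serialized as null) so the
    developer can supply a real default by hand."""
    try:
        return ast.literal_eval(default_src)
    except (ValueError, SyntaxError):
        return None

fills = {name: default_to_fill(src) for name, src in {
    "price": "0.0",       # literal -> carried over as-is
    "tags": "[]",         # literal -> carried over as-is
    "created": "now()",   # call expression -> null, needs manual edit
}.items()}
print(json.dumps(fills))  # -> {"price": 0.0, "tags": [], "created": null}
```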
Accomplishments that we're proud of
- Human-readable migration scripts (Python + embedded JSON) that you can review and edit before apply.
- A status action that shows which DB file is in use and which scripts are pending: quick answers before touching data.
- Automatic DB backup and dry-run for apply, in the spirit of safe SQL-style workflows.
- An end-to-end demo (v1 → v2 break → migrate → v2 works) showing the tool closes a real OSP / SQLite pickle gap.
What we learned
- Graph persistence in Jac is powerful but version-sensitive; a small CLI beats one-off pickle scripts once schemas churn.
- Compiler metadata (UniTree + archetype declarations) is a good source of truth for "what fields exist now," while runtime import must be used only to make unpickling faithful, not to accidentally run application with entry logic in the migrate path.
- SQLite is a fine store for read-modify-write over many BLOBs when updates are explicit, logged, and paired with backup and skip behavior for bad rows.
What's next
The core apply path should stay deterministic and reviewable. On top of that, LLM integration (including Jac’s by llm / Meaning-Typed Programming) is a natural extension:
- Assistive generate: given the AST diff and DB field usage, an LLM suggests richer AUTO_FILLS, rename notes, or inline comments in the migration file; humans review before apply.
- Explainer: after status or generate, an LLM summarizes what drifted, what the migration will do, and the risks in plain language.
- Agentic wrapper: a small Jac workflow that uses jac migrate as a tool, with multi-step flows such as detect drift → propose or refine migration → apply -d → then apply; a good fit for agentic demos while the engine remains rule-based.
Guardrail: LLM output should be treated as draft; never silently rewrite pickles without a checked-in migration artifact. apply stays the trusted, non-LLM step.
Other roadmap items (non-LLM):
- Richer operations than fill-missing (renames, removals, splits, custom hooks per archetype) with validation.
- A dedicated plan / preview action that summarizes impact without loading full migration modules, plus optional JSON output for CI.
- Optional SQL-style metadata (e.g. an applied-migrations table + checksums) for teams that outgrow a text file.
- Tighter integration with jac run, project profiles, and docs on when to generate vs. apply in team workflows.
- Automated tests in-tree and version-compatibility notes for long-lived databases.
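The applied-migrations-table idea could look something like the following. This is a roadmap sketch, not shipped behavior: the jac_migrations table layout and the record_applied helper are assumptions about one possible design.

```python
import hashlib
import sqlite3

def record_applied(conn, migration_id, script_bytes):
    """Roadmap sketch: log each applied migration with a content checksum so a
    later run can detect a script that was edited after it was applied."""
    conn.execute("""CREATE TABLE IF NOT EXISTS jac_migrations (
        id TEXT PRIMARY KEY,
        checksum TEXT NOT NULL,
        applied_at TEXT NOT NULL DEFAULT (datetime('now'))
    )""")
    digest = hashlib.sha256(script_bytes).hexdigest()
    conn.execute("INSERT OR IGNORE INTO jac_migrations (id, checksum) VALUES (?, ?)",
                 (migration_id, digest))
    return digest

conn = sqlite3.connect(":memory:")
digest = record_applied(conn, "0001_auto_Item", b"MIGRATION_ID = '0001_auto_Item'")
row = conn.execute("SELECT id, checksum FROM jac_migrations").fetchone()
```

Compared with migrate_applied.txt, this trades a bit of setup for tamper detection and per-migration timestamps.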
Project context: Jaseci · Jac language & tooling.
Built With
- claude
- cursor