
test: accept structural goldens in place with PSLUA_GOLDEN_ACCEPT #106

Merged: Unisay merged 1 commit into main from chore/golden-accept on Jun 21, 2026


Unisay (Collaborator) commented Jun 21, 2026

What this does

Adds an --accept-style escape hatch to the golden harness. When a codegen or
optimizer change legitimately moves the generated IR or Lua, you can rewrite the
affected goldens in place instead of deleting them by hand:

PSLUA_GOLDEN_ACCEPT=1 cabal test spec

With the variable set, a mismatching structural golden is overwritten with the
actual output and the test passes, the same way tasty-golden --accept does.
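
For readers unfamiliar with accept-style harnesses, the overwrite-on-mismatch behaviour can be sketched roughly as below. This is a minimal illustration, not the code in this PR: `Outcome`, `acceptRequested` and `checkGolden` are hypothetical names, and treating any set value of PSLUA_GOLDEN_ACCEPT as "on" is an assumption.

```haskell
import Control.Exception (evaluate)
import Data.Maybe (isJust)
import System.Environment (lookupEnv)

-- | Result of comparing actual output against a golden file (hypothetical type).
data Outcome = Matched | Accepted | Mismatched
  deriving (Eq, Show)

-- | Assumption: the variable counts as "on" whenever it is set at all.
acceptRequested :: IO Bool
acceptRequested = isJust <$> lookupEnv "PSLUA_GOLDEN_ACCEPT"

-- | Compare @actual@ with the golden stored at @path@.
-- On a mismatch, overwrite the file only when @accept@ is True.
checkGolden :: Bool -> FilePath -> String -> IO Outcome
checkGolden accept path actual = do
  expected <- readFile path
  -- Force the lazy read to the end so the handle is closed before any write.
  _ <- evaluate (length expected)
  if expected == actual
    then pure Matched
    else
      if accept
        then writeFile path actual >> pure Accepted
        else pure Mismatched
```

A caller would typically read the flag once, for example `accept <- acceptRequested`, and pass it to every check, so a single environment variable governs the whole run.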

The safety asymmetry

Acceptance is opt-in per golden. The harness gains acceptableGolden, a variant
of defaultGolden that carries an acceptable :: Bool flag:

  • golden.ir and golden.lua are derived: a pure function of the code under
    test. They use acceptableGolden, so PSLUA_GOLDEN_ACCEPT may rewrite them.
  • eval/golden.txt is the hand-verified semantic oracle. It stays on
    defaultGolden (acceptable = False), so it is never auto-accepted. A
    program-output change still fails the run until you fix the regression or
    update the oracle on purpose.
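
The asymmetry boils down to a small decision table, sketched here as a pure function. Only the `acceptable :: Bool` flag and the names `defaultGolden` / `acceptableGolden` come from this PR; the record shape, the constructor signatures and the `Verdict` type are assumptions made for illustration.

```haskell
-- | Hypothetical model of a golden test. Only the 'acceptable' flag is
-- taken from the PR description; 'goldenPath' is illustrative.
data Golden = Golden
  { goldenPath :: FilePath
  , acceptable :: Bool
  }

-- | Possible results of one golden comparison (hypothetical type).
data Verdict = Pass | PassAndRewrite | Fail
  deriving (Eq, Show)

-- | Stand-ins for the two harness constructors; the real signatures may differ.
defaultGolden, acceptableGolden :: FilePath -> Golden
defaultGolden p = Golden {goldenPath = p, acceptable = False}
acceptableGolden p = (defaultGolden p) {acceptable = True}

-- | Decide the outcome given whether accept mode is on, the golden's
-- configuration, the stored expectation and the actual output.
verdict :: Bool -> Golden -> String -> String -> Verdict
verdict acceptMode golden expected actual
  | expected == actual = Pass
  | acceptMode && acceptable golden = PassAndRewrite
  | otherwise = Fail
```

Under this model, `verdict True (acceptableGolden "golden.ir") old new` yields `PassAndRewrite`, while `verdict True (defaultGolden "eval/golden.txt") old new` still yields `Fail`: accept mode alone is never enough, the golden itself has to opt in.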

This is the guard scripts/golden_reset lacks: that script deletes every
golden.*, the eval oracle included, which is how a past regression slipped
through unnoticed. PSLUA_GOLDEN_ACCEPT only touches the structural goldens.

Docs

Adds docs/GOLDEN_TESTING.md: what each artifact is, why the eval golden is the
oracle, the accept / reset / full-diff knobs, and how to add a new golden test.
Also adds a CLAUDE.md note that the golden harness re-implements the IR
pipeline in compileCorefn, so any new IR pass must live inside
optimizedUberModule or the goldens silently bypass it.
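
The CLAUDE.md note is easier to see with a toy model. The sketch below is not the real pipeline: the `IR` type, the passes and every function except the borrowed name `optimizedUberModule` are invented for illustration, and `compileForGolden` merely stands in for what compileCorefn does in the harness.

```haskell
-- Toy IR: a list of instruction names. Purely illustrative.
type IR = [String]

-- Two made-up passes.
dropNops :: IR -> IR
dropNops = filter (/= "nop")

inlineIdCalls :: IR -> IR
inlineIdCalls = map (\i -> if i == "call id" then "move" else i)

-- | Shared home for IR passes (stand-in for the real optimizedUberModule).
optimizedUberModule :: IR -> IR
optimizedUberModule = inlineIdCalls . dropNops

-- | Golden-harness path: has its own surrounding steps, but reuses the
-- shared pass list.
compileForGolden :: IR -> String
compileForGolden = unlines . optimizedUberModule

-- | A pass added on the production side only, outside the shared function.
strayPass :: IR -> IR
strayPass = filter (/= "dead")

-- | Production path with the stray pass: the goldens never observe it.
compileWithStrayPass :: IR -> String
compileWithStrayPass = unlines . strayPass . optimizedUberModule
```

For the input `["nop", "call id", "dead", "ret"]` the two paths disagree, yet the goldens keep passing because they only ever exercise `compileForGolden`. Moving `strayPass` inside `optimizedUberModule` makes both paths agree again.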

Scope

Test infrastructure only, no compiler or runtime change. Independent of #105
(the magic-do fix for #46); this branch is cut from main.

Verification

  • cabal test spec: 233/0 on main plus this branch.
  • Checked the asymmetry directly: perturbed both golden.ir and
    eval/golden.txt for one module, then ran with PSLUA_GOLDEN_ACCEPT=1. The
    golden.ir was accepted and rewritten back to the regenerated output (clean
    git diff), while the eval/golden.txt test still failed and the file was
    left untouched.
  • fourmolu, hlint, nix fmt clean on the touched files.

Adds `acceptableGolden`, a variant of `defaultGolden` whose mismatch may be
accepted — the golden file is rewritten in place and the test passes — when
the env var PSLUA_GOLDEN_ACCEPT is set, à la tasty-golden's --accept.

Only the derived/structural goldens (golden.ir, golden.lua) are marked
acceptable. The hand-verified eval/golden.txt oracle stays on defaultGolden
(acceptable = False) and is therefore never auto-accepted: its program output
must change only by deliberate review. This is the safety asymmetry that
scripts/golden_reset lacks (it deletes the eval oracle too).

Replaces hand-deleting goldens for ordinary codegen/optimizer churn:
  PSLUA_GOLDEN_ACCEPT=1 cabal test spec

Also adds docs/GOLDEN_TESTING.md (workflow, artifacts, the eval oracle, and
the accept/reset/debug knobs) and a CLAUDE.md note that the golden harness
re-implements the IR pipeline (passes must live in optimizedUberModule).

Unisay merged commit 04658a2 into main on Jun 21, 2026
2 checks passed
Unisay deleted the chore/golden-accept branch on June 21, 2026 10:27