diff --git a/conductor/tracks/test_batching_refactor_20260606/spec.md b/conductor/tracks/test_batching_refactor_20260606/spec.md
index 6de1d317..c4dd25cd 100644
--- a/conductor/tracks/test_batching_refactor_20260606/spec.md
+++ b/conductor/tracks/test_batching_refactor_20260606/spec.md
@@ -217,7 +217,7 @@ The `plan` function is deterministic. The same `records` + same `options` produc
 Responsibilities (slim, delegates everything else):
 1. Parse CLI args (`--tiers`, `--include-opt-in`, `--plan`, `--audit`, `--no-xdist`).
 2. Call `categorize_all(tests_dir, registry_path)`.
-3. If `--audit`: print records where `source == "auto"`, exit non-zero if any have empty subsystem lists or other hard errors. Exit 0 if every record is well-formed even if some are auto-inferred.
+3. If `--audit`: print records where `source == "auto"`, and exit non-zero if any record has an empty subsystem list or another hard error. Exit 0 if every record is well-formed, even if some are auto-inferred. If `--audit --strict`: additionally exit non-zero if any auto-classified file has multiple subsystems (a heuristic for "probably cross-cutting; should be in the registry").
 4. If `--plan`: print the batch list (one row per batch with label, files, estimated seconds) and exit.
 5. Otherwise: call `plan()`, iterate batches, run each as `subprocess.run(uv + pytest + pytest_args + files)`, accumulate per-batch results, print the summary table.
 6. Return the worst per-batch exit code (0 only if all batches pass).