diff --git a/conductor/tracks/test_batching_refactor_20260606/spec.md b/conductor/tracks/test_batching_refactor_20260606/spec.md
index 6de1d317..c4dd25cd 100644
--- a/conductor/tracks/test_batching_refactor_20260606/spec.md
+++ b/conductor/tracks/test_batching_refactor_20260606/spec.md
@@ -217,7 +217,7 @@ The `plan` function is deterministic. The same `records` + same `options` produc
 Responsibilities (slim, delegates everything else):
 1. Parse CLI args (`--tiers`, `--include-opt-in`, `--plan`, `--audit`, `--no-xdist`).
 2. Call `categorize_all(tests_dir, registry_path)`.
-3. If `--audit`: print records where `source == "auto"`, exit non-zero if any have empty subsystem lists or other hard errors. Exit 0 if every record is well-formed even if some are auto-inferred.
+3. If `--audit`: print records where `source == "auto"`, exit non-zero if any have empty subsystem lists or other hard errors. Exit 0 if every record is well-formed even if some are auto-inferred. If `--audit --strict`: additionally exit non-zero if any auto-classified file has multiple subsystems (heuristic for "probably cross-cutting — should be in the registry").
 4. If `--plan`: print the batch list (one row per batch with label, files, estimated seconds) and exit.
 5. Otherwise: call `plan()`, iterate batches, run each as `subprocess.run(uv + pytest + pytest_args + files)`, accumulate per-batch results, print the summary table.
 6. Return the worst per-batch exit code (0 only if all batches pass).
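
Review note: the exit-code rule that step 3 now describes could be sketched roughly as below. This is only an illustration under stated assumptions, not the project's implementation. The `Record` shape and the `audit_exit_code` helper are hypothetical names; the spec only mentions a `source` field and a subsystem list. The sketch also assumes the empty-subsystem check applies to every record, and it does not model the "other hard errors" the spec mentions.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    # Hypothetical record shape: the spec names only `source` and a subsystem list.
    path: str
    source: str  # "auto" for inferred records; any other value is treated as registry-backed
    subsystems: list = field(default_factory=list)


def audit_exit_code(records, strict=False):
    """Return the exit code for an audit run (hypothetical helper)."""
    auto = [r for r in records if r.source == "auto"]

    # Print only the auto-classified records, as step 3 describes.
    for r in auto:
        print(f"{r.path}: {', '.join(r.subsystems) or '<none>'}")

    # Hard error: a record with an empty subsystem list (assumed to apply to all records).
    if any(not r.subsystems for r in records):
        return 1

    # Strict mode: an auto-classified file spanning several subsystems is
    # treated as probably cross-cutting and therefore fails the audit.
    if strict and any(len(r.subsystems) > 1 for r in auto):
        return 1

    # Well-formed records pass, even if some were auto-inferred.
    return 0
```

Under this reading, a multi-subsystem file that is already listed in the registry never trips `--strict`; only auto-classified files do. One thing to confirm in the spec: step 1's list of CLI args does not yet mention `--strict`.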