diff --git a/conductor/tracks/cruft_elimination_20260627/SPEC_CORRECTION_phase_2.md b/conductor/tracks/cruft_elimination_20260627/SPEC_CORRECTION_phase_2.md new file mode 100644 index 00000000..78debf18 --- /dev/null +++ b/conductor/tracks/cruft_elimination_20260627/SPEC_CORRECTION_phase_2.md @@ -0,0 +1,281 @@ +# SPEC CORRECTION: Phase 2 — ProjectContext Field Shape + +**Track:** `cruft_elimination_20260627` +**Phase:** 2 (Fix `flat_config` to return typed `ProjectContext`) +**Date:** 2026-06-27 +**Author:** Tier 1 (post-mortem of VC8 mismatch) +**Status:** Awaiting Tier 2 resumption + +--- + +## TL;DR + +The spec for Phase 2 says: "Add `ProjectContext` to `src/models.py` with all fields observed in `src/project_manager.py:flat_config`." This is underspecified. The actual `flat_config` returns a NESTED dict structure with 6 top-level fields, each with sub-fields. The spec doesn't enumerate which fields belong to `ProjectContext` (a flat dict) vs which are sub-objects. + +This correction specifies the exact schema. Tier 2 can resume Phase 2 directly. + +--- + +## Actual `flat_config` return shape (measured from `src/project_manager.py:268`) + +```python +def flat_config(proj: Metadata, disc_name: Optional[str] = None, track_id: Optional[str] = None) -> Metadata: + ... + return { + "project": proj.get("project", {}), + "output": proj.get("output", {}), + "files": proj.get("files", {}), + "screenshots": proj.get("screenshots", {}), + "context_presets": proj.get("context_presets", {}), + "discussion": { + "roles": disc_sec.get("roles", []), + "history": history, + }, + } +``` + +**Top-level keys** (the `Metadata` dict): `project`, `output`, `files`, `screenshots`, `context_presets`, `discussion` + +**Sub-keys observed in `aggregate.run()`** (`src/aggregate.py:484-525`): + +| Top-level key | Sub-key | Access pattern | +|---|---|---| +| `project` | `name` | `config.get("project", {}).get("name")` | +| `project` | `summary_only` | `config.get("project", {}).get("summary_only", False)` | +| `project` | `execution_mode` | `config.get("project", {}).get("execution_mode", "standard")` | +| `output` | `namespace` | `config.get("output", {}).get("namespace", "project")` | +| `output` | `output_dir` | `config["output"]["output_dir"]` (REQUIRED — direct subscript, not `.get`) | +| `files` | `base_dir` | `config["files"]["base_dir"]` (REQUIRED) | +| `files` | `paths` | `config["files"].get("paths", [])` | +| `screenshots` | `base_dir` | `config.get("screenshots", {}).get("base_dir", ".")` | +| `screenshots` | `paths` | `config.get("screenshots", {}).get("paths", [])` | +| `discussion` | `roles` | (passed through; not consumed by aggregate.run directly) | +| `discussion` | `history` | `config.get("discussion", {}).get("history", [])` | +| `context_presets` | (opaque dict) | (passed through to other consumers; not consumed by aggregate.run) | + +`output_dir` and `files.base_dir` are accessed via **direct subscript** (`config["output"]["output_dir"]`, `config["files"]["base_dir"]`). All other fields use `.get()` with defaults. **Both patterns must be supported** by the dataclass design. + +--- + +## Tier 2's design choice (recommended) + +Use **6 top-level sub-dataclasses**, one per top-level key. Each sub-dataclass has its own fields. This matches the actual nested structure of `flat_config`. + +```python +# src/models.py — add after existing dataclasses + +@dataclass(frozen=True, slots=True) +class ProjectMeta: + name: str = "" + summary_only: bool = False + execution_mode: str = "standard" + + +@dataclass(frozen=True, slots=True) +class ProjectOutput: + namespace: str = "project" + output_dir: str = "" # REQUIRED by aggregate.run + + +@dataclass(frozen=True, slots=True) +class ProjectFiles: + base_dir: str = "" # REQUIRED by aggregate.run + paths: tuple[str, ...] = () + + +@dataclass(frozen=True, slots=True) +class ProjectScreenshots: + base_dir: str = "." + paths: tuple[str, ...] = () + + +@dataclass(frozen=True, slots=True) +class ProjectDiscussion: + roles: tuple[str, ...] = () + history: tuple[str, ...] = () + + +@dataclass(frozen=True, slots=True) +class ProjectContext: + """Typed return type for project_manager.flat_config(). + Replaces the dict[str, Any] that flat_config() currently returns. + """ + project: ProjectMeta = field(default_factory=ProjectMeta) + output: ProjectOutput = field(default_factory=ProjectOutput) + files: ProjectFiles = field(default_factory=ProjectFiles) + screenshots: ProjectScreenshots = field(default_factory=ProjectScreenshots) + context_presets: Metadata = field(default_factory=dict) # opaque pass-through + discussion: ProjectDiscussion = field(default_factory=ProjectDiscussion) + + def to_dict(self) -> Metadata: + """Convert back to the dict shape for backward compat with consumers + that use .get() / [] (aggregate.run et al).""" + return { + "project": { + "name": self.project.name, + "summary_only": self.project.summary_only, + "execution_mode": self.project.execution_mode, + }, + "output": { + "namespace": self.output.namespace, + "output_dir": self.output.output_dir, + }, + "files": { + "base_dir": self.files.base_dir, + "paths": list(self.files.paths), + }, + "screenshots": { + "base_dir": self.screenshots.base_dir, + "paths": list(self.screenshots.paths), + }, + "context_presets": dict(self.context_presets), + "discussion": { + "roles": list(self.discussion.roles), + "history": list(self.discussion.history), + }, + } +``` + +Then `flat_config()` becomes: + +```python +def flat_config(proj: Metadata, disc_name: Optional[str] = None, track_id: Optional[str] = None) -> ProjectContext: + disc_sec = proj.get("discussion", {}) + if track_id: + history = load_track_history(track_id, proj.get("files", {}).get("base_dir", ".")) + else: + name = disc_name or disc_sec.get("active", "main") + disc_data = disc_sec.get("discussions", {}).get(name, {}) + history = disc_data.get("history", []) + return ProjectContext( + project=ProjectMeta( + name=proj.get("project", {}).get("name", ""), + summary_only=proj.get("project", {}).get("summary_only", False), + execution_mode=proj.get("project", {}).get("execution_mode", "standard"), + ), + output=ProjectOutput( + namespace=proj.get("output", {}).get("namespace", "project"), + output_dir=proj.get("output", {}).get("output_dir", ""), + ), + files=ProjectFiles( + base_dir=proj.get("files", {}).get("base_dir", ""), + paths=tuple(proj.get("files", {}).get("paths", [])), + ), + screenshots=ProjectScreenshots( + base_dir=proj.get("screenshots", {}).get("base_dir", "."), + paths=tuple(proj.get("screenshots", {}).get("paths", [])), + ), + context_presets=dict(proj.get("context_presets", {})), + discussion=ProjectDiscussion( + roles=tuple(disc_sec.get("roles", [])), + history=tuple(history), + ), + ) +``` + +--- + +## Migration strategy (consumer side) + +There are 8 consumer call sites of `flat_config()`: +- `src/aggregate.py:536` +- `src/api_hooks.py:173` +- `src/app_controller.py:4023, 4583, 4691, 4704, 4805` +- `src/gui_2.py:4456` +- `src/orchestrator_pm.py:133` + +Plus 2 test mocks: +- `tests/test_context_composition_decoupled.py:34` +- `tests/test_context_preview_button.py:65` + +**Two migration options** (Tier 2's choice): + +### Option A (incremental, recommended): Add `to_dict()` to ProjectContext, leave consumers unchanged + +The consumers use `.get()` and `[]` patterns on the dict. The dataclass's `to_dict()` produces the same shape. So: + +```python +# Before: +flat = project_manager.flat_config(proj) +namespace = flat.get("project", {}).get("name") or flat.get("output", {}).get("namespace", "project") + +# After (incremental): +flat = project_manager.flat_config(proj) +flat_dict = flat.to_dict() # unchanged consumer code uses flat_dict +namespace = flat_dict.get("project", {}).get("name") or flat_dict.get("output", {}).get("namespace", "project") +``` + +Then per-consumer migration: `flat = flat.to_dict()` → `flat = flat` (consumer directly uses the dataclass's `__getitem__`/`get` dict-compat methods — which already exist on the Metadata fat struct!) + +Wait — `ProjectContext` is NOT a Metadata. The dataclass does NOT have `__getitem__`/`get`. So consumers that do `flat.get(...)` would FAIL on the bare dataclass. + +**Fix:** give `ProjectContext` dict-compat methods too (or make it inherit from Metadata's pattern). But Metadata's `__getitem__` raises KeyError, and consumers use `.get()` with defaults. So `ProjectContext` needs `get()` and `__getitem__()`. + +```python +@dataclass(frozen=True, slots=True) +class ProjectContext: + # ... fields ... + + def __getitem__(self, key: str) -> Any: + return self.to_dict()[key] # always returns the dict + + def get(self, key: str, default: Any = None) -> Any: + return self.to_dict().get(key, default) + + def to_dict(self) -> Metadata: + # ... (as above) +``` + +This makes `flat.get(...)` work directly without `to_dict()` calls. Consumers migrate minimally: just remove the `.get(...)` → `flat_dict.get(...)` indirection. + +### Option B (full migration): Migrate all 10 consumer sites to use `flat.project.name`, `flat.output.output_dir`, etc. + +This is more thorough but touches 10 sites. Each consumer needs: +- Replace `flat.get("project", {}).get("name")` with `flat.project.name` +- Replace `flat["output"]["output_dir"]` with `flat.output.output_dir` +- Etc. + +Each migration is mechanical. Total work: ~40 lines across 10 files. Plus regression-guard tests. + +--- + +## Recommendation + +**Option A** (incremental, dict-compat) is faster and lower-risk. Phase 2 just adds the dataclasses + dict-compat methods + changes `flat_config` return type. Consumer migration is deferred to a follow-up. + +**Option B** is the "proper" fix (per the spec's spirit) but takes longer. Consumer migration touches the same files that the spec's other VCs touch (`aggregate.py`, `app_controller.py`, etc.). + +**Tier 2 should pick one and document the choice in the next track commit.** + +--- + +## Acceptance criteria (corrected Phase 2) + +After this correction is applied: + +| VC | Description | Verification | +|---|---|---| +| VC8 (corrected) | `flat_config` returns typed `ProjectContext` | `from src.models import ProjectContext; from src.project_manager import flat_config; from src.models import Metadata; proj = Metadata(); ctx = flat_config(proj); assert isinstance(ctx, ProjectContext)` | +| VC8 (corrected) | All 6 sub-dataclasses exist | `from src.models import ProjectMeta, ProjectOutput, ProjectFiles, ProjectScreenshots, ProjectDiscussion, ProjectContext; assert all 6 importable` | +| VC8 (corrected) | Consumers unchanged (Option A) | `tests/test_project_manager_*.py` all pass without modification | +| VC8 (corrected) | Dict-compat works | `ctx = flat_config(Metadata()); assert ctx.get("project") == {} # default empty; or matches proj.get("project"))` | +| VC8 (corrected) | `output_dir` REQUIRED field works | `flat_config(Metadata())` returns `ProjectContext` with `output.output_dir = ""` (the empty default); aggregate.run would fail with clear error when output_dir is empty (existing behavior, not a regression) | + +--- + +## File locations + +- `src/models.py` — add 6 new dataclasses (after existing dataclasses in the file) +- `src/project_manager.py` — change `flat_config` return type from `Metadata` to `ProjectContext` +- `src/aggregate.py` — NO CHANGE (Option A) or migrate to use sub-dataclass access (Option B) +- `tests/test_project_context_20260627.py` — NEW regression-guard test file with 8+ tests covering the dataclass + dict-compat methods + +--- + +## See also + +- `conductor/tracks/cruft_elimination_20260627/spec.md` — the original spec (Phase 2 section, lines ~95-120) +- `src/project_manager.py:268` — `flat_config()` actual definition +- `src/aggregate.py:484-525` — `aggregate.run()` consumer (the key reference for which fields are REQUIRED) +- `src/type_aliases.py` — the wire-format `Metadata` dataclass (similar pattern for dict-compat) +- `conductor/code_styleguides/data_oriented_design.md` — the "Prefer Fewer Types" principle