conductor(track): cruft_elimination_20260627 spec (final type-promotion track)
# Track Specification: c11_python_20260628

## Overview

**Goal:** Make Python behave as closely to C11/Odin/Jai as possible within Python's runtime constraints. Eliminate all polymorphic dicts (`dict[str, Any]`), runtime type checks (`hasattr`, `isinstance` for entity dispatch), `Optional[T]` returns, `Any` type hints, and `.get('key', default)` access on known fields from internal code.

**Scope:** Promote every polymorphic dict to a typed dataclass (either a fat struct at the wire boundary OR a componentized dataclass at the specific path). Convert function signatures to declare typed parameters. Remove every `hasattr()` / `isinstance()` / `.get()` defensive check. Replace `Optional[T]` with `Result[T]` + `NIL_T` sentinels.

**After this track:**

- One literal boundary layer (the result of `tomllib.load()` + `json.loads()`) uses `Metadata` (a typed fat struct).
- Everywhere else: typed componentized dataclasses (already exist from `metadata_promotion_20260624`).
- No `dict[str, Any]` outside the boundary layer.
- No `hasattr()` for entity type dispatch.
- No `Optional[T]` returns.
- No `Any` type hints.
- The effective-codepaths metric (4.014e+22 at baseline) drops because dispatcher functions lose their polymorphic branches.

## The C11/Odin/Jai Semantics in Python

| C11/Odin/Jai concept | Python equivalent | What it forbids |
|---|---|---|
| Value type (`struct`) | `@dataclass(frozen=True, slots=True)` | Mutation, dynamic field addition |
| Static type (`int`, `string`) | type hint + mypy | `Any`, `dict[str, Any]` outside the boundary |
| No null | `Result[T]` + `NIL_T` sentinel | `Optional[T]`, `None` returns |
| Direct field access (`s.field`) | `s.field` | `.get('field', default)` on known fields |
| No dynamic dispatch (`if hasfield`) | Compile-time-typed function params | `hasattr(x, 'field')` for entity type dispatch |
| Explicit conversion at boundary | `from_dict()` at the wire entry | Scattered `from_dict()` in consumers |

## Current State Audit (after `type_alias_unfuck_20260626` ships)

| Cruft source | Current count | Where / how measured |
|---|---:|---|
| `Metadata: TypeAlias = dict[str, Any]` (the lazy-typing escape hatch) | 1 | `src/type_aliases.py:6` |
| `.get('key', default)` sites on known aggregates | ~15 (post-unfuck) | `git grep -cE "\.get\('[a-z_]+'," -- 'src/*.py'` |
| `hasattr(f, 'path')` defensive checks | ~10 | `git grep -E "hasattr\(f, 'path'\)" -- 'src/*.py'` |
| `hasattr(self, 'attr')` lazy-init checks | ~20 | `git grep -E "hasattr\(self," -- 'src/*.py'` |
| Function signatures with `Metadata` parameter | ~30+ | `git grep -cE "def .+\(.*: Metadata" -- 'src/*.py'` |
| Function signatures with `Any` parameter | ~15+ | `git grep -cE "def .+\(.*: Any" -- 'src/*.py'` |
| Function signatures with `dict[str, Any]` parameter | ~20+ | `git grep -cE "def .+\(.*: dict\[str, Any\]" -- 'src/*.py'` |
| `Optional[T]` return types | ~25+ | `git grep -cE -e "-> Optional\[" -- 'src/*.py'` (`-e` is required because the pattern starts with `-`) |
| `Any` return types | ~10+ | `git grep -cE -e "-> Any" -- 'src/*.py'` |
| Effective codepaths | 4.014e+22 | baseline |

## Goals

| ID | Goal | Acceptance |
|---|---|---|
| G1 | `Metadata` becomes `@dataclass(frozen=True, slots=True)` (typed fat struct) | `src/type_aliases.py` shows `Metadata` as a dataclass, NOT `TypeAlias = dict[str, Any]` |
| G2 | Zero `Metadata: TypeAlias = dict[str, Any]` | The TypeAlias is removed; only the dataclass remains |
| G3 | Zero `dict[str, Any]` parameter types in internal code | `git grep -cE "def .+\(.*: dict\[str, Any\]" -- 'src/app_controller.py' 'src/gui_2.py' 'src/aggregate.py' 'src/multi_agent_conductor.py' 'src/mcp_client.py' 'src/ai_client.py' 'src/rag_engine.py' 'src/models.py'` returns 0 |
| G4 | Zero `Any` parameter types in internal code | Same grep with `: Any` returns 0 |
| G5 | Zero `Optional[T]` return types | `git grep -cE -e "-> Optional\[" -- 'src/*.py'` returns 0 |
| G6 | Zero `hasattr(f, ...)` entity dispatch checks | `git grep -cE "hasattr\(f, '(path\|source_tier\|content\|role\|model\|id\|status)'\)" -- 'src/*.py'` returns 0 |
| G7 | `self.files` is ALWAYS `List[FileItem]` (no dicts in the list) | The append paths convert dicts via `models.FileItem.from_dict(p)`; the `hasattr(f, 'path')` checks are removed |
| G8 | `flat_config` returns `ProjectContext` (typed), not `dict` | New `ProjectContext` dataclass; `project_manager.flat_config()` returns it |
| G9 | `rag_engine.search()` returns `List[RAGChunk]` (typed), not `List[Dict]` | Return type changed; 3 consumers updated |
| G10 | `_do_generate` returns `list[FileItem]` (typed), not `list[Metadata]` | Return type annotation fixed |
| G11 | All 7 audit gates pass `--strict` | All exit 0 |
| G12 | All existing tests pass | `scripts/run_tests_batched.py` → 10/11 |
| G13 | Effective codepaths drops by ≥ 4 orders of magnitude | `< 1e+18` (was 4.014e+22) |
| G14 | The boundary layer is documented as exactly 2 places: TOML load + JSON parse | `docs/reports/boundary_layer_20260628.md` enumerates every `Metadata` usage with justification |

## Non-Goals

- Modifying the existing 12 per-aggregate dataclass definitions (their fields are correct; they just need to be USED)
- Adding new `src/<thing>.py` files
- Creating further follow-up tracks (this is the FINAL track; no more layers)
- Changing the runtime semantics of Python (we're working within Python's constraints)

## Functional Requirements

### FR1: The Boundary Layer is EXACTLY 2 places

**Place 1: TOML config loaders** in `src/project_manager.py`, `src/preset*.py`, `src/personas.py`, `src/tool_presets.py`, `src/context_presets.py`, `src/workspace_manager.py`.

The TOML loader returns `Metadata` (the typed fat struct) for the 100 ns between `tomllib.load()` and the caller's `from_dict()` conversion. Every consumer of the TOML loader immediately does `ProjectContext.from_dict(loaded)`, `Persona.from_dict(loaded)`, etc.

**Place 2: JSON wire parsers** in `src/api_hooks.py` (HTTP entry points) and `src/mcp_client.py` (MCP wire protocol).

The JSON parser returns `Metadata` for the 100 ns between `json.loads()` and the caller's `from_dict()` conversion. Every consumer immediately does `ChatMessage.from_dict(payload)`, `MMAUsageStats.from_dict(payload)`, etc.

**No other code uses `Metadata`.** Every other function takes a typed componentized dataclass.

### FR2: `Metadata` becomes a typed fat struct

```python
# In src/type_aliases.py:
from dataclasses import dataclass, field, fields
from typing import Any

# Fields that to_dict() always emits, even when empty. The spec does not
# enumerate them, so this placeholder is empty.
_NON_NULL_FIELDS: frozenset[str] = frozenset()

@dataclass(frozen=True, slots=True)
class Metadata:
    """The wire-format boundary type. ONLY used in TOML loaders and JSON parsers.
    Internal code uses componentized dataclasses (CommsLogEntry, FileItem, etc.)."""
    # Nested values are annotated as plain dicts/lists: the name `Metadata`
    # is not bound yet while the class body executes.
    # TOML keys
    paths: dict[str, Any] = field(default_factory=dict)  # nested dict for path config
    project: dict[str, Any] = field(default_factory=dict)
    discussion: dict[str, Any] = field(default_factory=dict)
    # JSON wire keys (per-vendor chat message)
    role: str = ""
    content: Any = None
    tool_calls: list[dict[str, Any]] = field(default_factory=list)
    tool_call_id: str = ""
    name: str = ""
    # Session log keys
    ts: str = ""
    kind: str = ""
    direction: str = ""
    model: str = "unknown"
    source_tier: str = "main"
    error: str = ""
    # MMA ticket keys
    id: str = ""
    description: str = ""
    status: str = "todo"
    depends_on: tuple[str, ...] = ()
    manual_block: bool = False
    # RAG result keys
    document: str = ""
    score: float = 0.0
    # Tool keys
    function: dict[str, Any] = field(default_factory=dict)
    args: dict[str, Any] = field(default_factory=dict)
    script: str = ""
    output: str = ""
    type: str = ""
    # Tool definition keys (`description` is shared with the MMA ticket keys above)
    parameters: dict[str, Any] = field(default_factory=dict)
    auto_start: bool = False
    # File item keys
    path: str = ""
    view_mode: str = "full"
    custom_slices: list[dict[str, Any]] = field(default_factory=list)
    # Token usage keys
    input_tokens: int = 0
    output_tokens: int = 0
    cache_read_input_tokens: int = 0
    cache_creation_input_tokens: int = 0
    # Generic pass-through
    metadata: dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> dict[str, Any]:
        # Emit only non-empty values, plus any field named in _NON_NULL_FIELDS.
        return {
            f.name: v
            for f in fields(self)
            for v in [getattr(self, f.name)]
            if v not in (None, "", [], {}, 0, 0.0, False) or f.name in _NON_NULL_FIELDS
        }

    @classmethod
    def from_dict(cls, raw: dict[str, Any]) -> "Metadata":
        # Unknown wire keys are dropped rather than raising.
        valid = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in raw.items() if k in valid})
```

**Why a fat struct here is OK:** the wire format (TOML/JSON) is polymorphic at the boundary. The boundary function receives arbitrary keys. After the boundary, internal code uses componentized types. The fat struct is the WIRE schema, not a lazy-typing escape hatch.

### FR3: Componentize the specific paths (already exist)

The 12 dataclasses already exist from `metadata_promotion_20260624`:

| Dataclass | Used at | Replaces |
|---|---|---|
| `CommsLogEntry` | session log entries, MMA telemetry | `entry_obj = {...}` dict literals |
| `HistoryMessage` | UI discussion history | `msg.get('role', 'unknown')` etc. |
| `FileItem` | context composition | `flat.get('files', {}).get('paths', [])` |
| `ToolCall` | tool loop | `tc.get('id')` / `tc['function']['name']` |
| `ChatMessage` | provider-side history | `msg.get('role')` in send paths |
| `UsageStats` | token usage | `u.get('input_tokens', 0)` |
| `RAGChunk` | RAG results | `chunk.get('document', '')` |
| `Ticket` | MMA tickets | `t.get('id', '')` / `t['depends_on']` |
| `SessionInsights` | session stats | `insights.get('total_tokens', 0)` |
| `DiscussionSettings` | per-turn settings | `entry.get('temperature', 0.7)` |
| `CustomSlice` | visual slices | `slc.get('tag', '')` / `slc['start_line']` |
| `MMAUsageStats` | per-tier usage | `stats.get('model', 'unknown')` |
| `ProviderPayload` | script execution | `payload.get('script')` |
| `UIPanelConfig` | panel state | `gui_cfg.get('separate_message_panel', False)` |
| `PathInfo` | path config | `proj_paths['logs_dir']` |
| `ToolDefinition` | tool schemas | `tinfo.get('description', '')` |

**Usage rule:** at each specific path, the variable is declared as the typed dataclass. Direct attribute access. No `.get()`.

### FR4: Fix the central path bugs

These bugs are the source of the defensive checks:

| File:line | Bug | Fix |
|---|---|---|
| `src/app_controller.py:1101` | `self.files: List[models.FileItem] = []` (declared) but `app_controller.py:1999-2003` appends dicts | At the append site, convert dicts via `models.FileItem.from_dict(p)`; the list is truly `List[FileItem]` |
| `src/app_controller.py:4006` | `_do_generate(self) -> tuple[str, Path, list[Metadata], ...]` (return type wrong; actual is `list[FileItem]`) | Change return type to `list[FileItem]`; update `gui_2.py` callers |
| `src/project_manager.py:flat_config` | returns `dict[str, Any]` | Return `ProjectContext` (new dataclass) |
| `src/aggregate.py:96` | `f.path if hasattr(f, 'path') else str(f)` (defensive: `f` might be a dict) | `f` is now `FileItem`; `f.path` direct |
| `src/aggregate.py:193` | `elif hasattr(entry_raw, "path")` (defensive: `entry_raw` might be a dict) | `entry_raw` is `FileItem`; `entry_raw.path` direct |
| `src/aggregate.py:3259` | `chunk.get('document', '')` (RAG chunk is a dict) | `chunk` is `RAGChunk`; `chunk.document` direct |
| `src/rag_engine.py:367` | `search() -> List[Dict[str, Any]]` (return type wrong) | Return `List[RAGChunk]` |
| `src/app_controller.py:263` | `[f.path if hasattr(f, "path") else f.get("path") ...]` | `f` is `FileItem`; `f.path` direct |
| `src/app_controller.py:1767` | same | same |
| `src/app_controller.py:1771` | same | same |
| `src/app_controller.py:2536` | same | same |
| `src/app_controller.py:3129` | same | same |
| `src/app_controller.py:3182` | same | same |
| `src/app_controller.py:2274` | `payload.get('script') or json.dumps(payload.get('args', {}), indent=1)` | `payload` is `ProviderPayload`; `payload.script or json.dumps(payload.args, indent=1)` |

After these fixes, `git grep -cE "hasattr\(f," -- 'src/*.py'` returns 0.

### FR5: Eliminate `Optional[T]` returns

Per `conductor/code_styleguides/error_handling.md`:

```python
# BAD:
def find_ticket(id: str) -> Optional[Ticket]:
    ...

# GOOD (Result pattern):
def find_ticket(id: str) -> Result[Ticket]:
    # `found` and `ticket` come from the lookup, elided here
    return Result(data=NIL_TICKET) if not found else Result(data=ticket)

# BETTER (NIL sentinel):
def find_ticket(id: str) -> Ticket:
    ...
    return NIL_TICKET  # zero-initialized frozen dataclass; safe to read fields
```

`NIL_TICKET` is a module-level singleton: `NIL_TICKET = Ticket(id="", description="", status="missing", manual_block=False)`. Consumers can read `ticket.id`, `ticket.status`, etc. safely, with no `None` check needed.

### FR6: Eliminate `Any` and `dict[str, Any]` from internal function signatures

```python
# BAD:
def _to_typed_tool_call(tc: Any) -> ToolCall:
    return ToolCall(id=getattr(tc, "id", "") or "", ...)

# GOOD (boundary function):
def _parse_wire_tool_call(wire: dict[str, Any]) -> ToolCall:
    """Boundary: parse MCP wire-format dict to typed ToolCall. ONLY called from src/openai_compatible.py."""
    return ToolCall.from_dict(wire)

# INTERNAL function (already typed):
def process_tool_call(tc: ToolCall) -> None:
    tool_id = tc.id  # no getattr; the type is guaranteed
```

After this, every function signature in `src/app_controller.py`, `src/gui_2.py`, `src/aggregate.py`, `src/multi_agent_conductor.py`, `src/mcp_client.py` (internal functions only), `src/ai_client.py` (send methods only; these are boundary), `src/rag_engine.py`, `src/models.py` declares typed dataclasses (no `Any`, no `dict[str, Any]`).

### FR7: The lazy-init `hasattr(self, ...)` pattern is allowed

The `hasattr(self, 'perf_monitor')` checks in `src/app_controller.py` are NOT entity dispatch; they are lazy initialization. These stay (they're internal state management, not external type dispatch).

Document the exception: per `conductor/code_styleguides/python.md`, lazy init is acceptable. The DOD rule is "no runtime type dispatch for entity types"; lazy init concerns initialization state, not entity type.
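
For contrast with the banned `hasattr(f, 'path')` dispatch, a lazy-init check looks roughly like this. `PerfMonitor`, its `samples` field, and the `record` method are hypothetical stand-ins; only the `hasattr(self, 'perf_monitor')` shape comes from this spec.

```python
class PerfMonitor:  # hypothetical stand-in for the real monitor
    def __init__(self) -> None:
        self.samples: list[float] = []


class AppController:  # reduced sketch, not the real controller
    def record(self, ms: float) -> None:
        # Lazy init: the check asks "has this been created yet?",
        # not "which kind of entity is this?".
        if not hasattr(self, "perf_monitor"):
            self.perf_monitor = PerfMonitor()
        self.perf_monitor.samples.append(ms)


ctrl = AppController()
ctrl.record(1.5)
ctrl.record(2.5)
```

The attribute has exactly one possible type once it exists, so no downstream code branches on what `perf_monitor` is.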

## Per-Phase Task List

### Phase 0: Promote `Metadata` to typed fat struct (FR2)

```bash
# Read src/type_aliases.py current state
# Write the new Metadata dataclass with all 30+ fields
# Remove the TypeAlias
# Verify: from src.type_aliases import Metadata; Metadata(role='user', content='hi')
# Verify: Metadata.from_dict({'role': 'user'}) works
```

### Phase 1: Add new typed `ProjectContext` dataclass

```bash
# Add ProjectContext to src/models.py with all fields observed in src/project_manager.py:flat_config
# Convert flat_config to return ProjectContext
# Update consumers (src/app_controller.py:_do_generate, src/gui_2.py)
```

### Phase 2: Fix `self.files` in `src/app_controller.py` (FR4 row 1)

```bash
# At src/app_controller.py:1996-2003, replace the 3-line append with:
#   for p in paths:
#     if isinstance(p, dict):
#       self.files.append(models.FileItem.from_dict(p))
#     elif isinstance(p, str):
#       self.files.append(models.FileItem(path=p))
#     elif isinstance(p, models.FileItem):
#       self.files.append(p)
#     else:
#       raise TypeError(f"unexpected file item type: {type(p)}")
# Remove all hasattr(f, 'path') checks at: 263, 1767, 1771, 2536, 3129, 3182
```

### Phase 3: Fix `_do_generate` return type (FR4 row 2)

```bash
# Change src/app_controller.py:4006 from `list[Metadata]` to `list[FileItem]`
# Update src/gui_2.py callers (search for `_do_generate(` and verify the receiver is typed as list[FileItem])
```

### Phase 4: Fix `rag_engine.search()` return type (FR4 row 7)

```bash
# Change src/rag_engine.py:367 from `List[Dict[str, Any]]` to `List[RAGChunk]`
# Update src/aggregate.py:3259, src/app_controller.py:251, src/app_controller.py:4162 to use chunk.document directly
# Handle the wire format mismatch (RAGChunk expects path top-level; wire has metadata.path)
```

### Phase 5: Fix all `entry_obj = {...}` dict literals in `src/app_controller.py` (FR4 row 14)

```bash
# At src/app_controller.py:2274, replace `payload.get('script') or json.dumps(payload.get('args', {}), indent=1)` with `pp = ProviderPayload.from_dict(payload); pp.script or json.dumps(pp.args, indent=1)`
# Same for lines 2277, 2287, 2305-2308 (already partly done)
# Same for line 3508 (`f['path'] for f in file_items` → `f.path for f in file_items` since f is now FileItem)
```

### Phase 6: Fix `src/aggregate.py` defensive checks (FR4 rows 4-6)

```bash
# At src/aggregate.py:96, replace `f.path if hasattr(f, 'path') else str(f)` with `f.path` (f is FileItem)
# At src/aggregate.py:193, replace `elif hasattr(entry_raw, "path")` with `elif isinstance(entry_raw, FileItem): entry_raw.path`
# At src/aggregate.py:3259, replace `chunk.get('document', '')` with `chunk.document` (chunk is RAGChunk)
```

### Phase 7: Eliminate `Optional[T]` returns (FR5)

```bash
# For each `Optional[T]` return in src/, replace with `Result[T]` or `NIL_T` sentinel
# Define NIL_TICKET, NIL_COMMS_LOG_ENTRY, etc. in src/type_aliases.py
# Update consumers to handle NIL_T (read fields directly; NIL_T is zero-initialized)
```

### Phase 8: Eliminate `Any` and `dict[str, Any]` from internal signatures (FR6)

```bash
# For each function signature with `Any` or `dict[str, Any]` parameter in internal files, change to the typed dataclass
# For boundary functions (TOML/JSON parsers), keep `dict[str, Any]` but document with a comment that it's a boundary
```

### Phase 9: Re-measure + verification

```bash
# Cruft counts (expect 0 unless noted)
git grep -cE "\.get\('[a-z_]+'," -- 'src/*.py'   # expect: < 15 (only collapsed-codepath)
git grep -cE "hasattr\(f, '(path|source_tier|content|role|model|id|status)'\)" -- 'src/*.py'   # expect: 0
git grep -cE "def .+\(.*: (Metadata|Any|dict\[str, Any\])" -- 'src/app_controller.py' 'src/gui_2.py' 'src/aggregate.py' 'src/multi_agent_conductor.py' 'src/mcp_client.py' 'src/ai_client.py' 'src/rag_engine.py' 'src/models.py'   # expect: 0
# -e is required below because the pattern starts with '-'
git grep -cE -e "-> Optional\[" -- 'src/*.py'   # expect: 0
git grep -cE -e "-> Any" -- 'src/*.py'   # expect: 0

# Effective codepaths
uv run python -c "..."   # expect: < 1e+18

# 7 audit gates
uv run python scripts/audit_weak_types.py --strict
uv run python scripts/generate_type_registry.py --check
# etc.

# Batched tests
uv run python scripts/run_tests_batched.py   # expect: 10/11 PASS
```

### Phase 10: Boundary layer audit + documentation

```bash
# Document every Metadata usage with justification
git grep -nE "Metadata" -- 'src/*.py' > /tmp/metadata_usages.txt

# Write docs/reports/boundary_layer_20260628.md
# Enumerate every Metadata usage; classify as boundary (kept) or internal (must fix)
# Expect: only the TOML loaders + JSON parsers retain Metadata
```

## Acceptance Criteria (Definition of Done)

| # | Criterion | Verification |
|---|---|---|
| VC1 | `Metadata` is a `@dataclass(frozen=True, slots=True)` with explicit fields | `git grep -B 1 "^class Metadata" src/type_aliases.py` shows `@dataclass(frozen=True, slots=True)` on the line before the class |
| VC2 | No `TypeAlias = dict[str, Any]` for Metadata | `git grep "^Metadata: TypeAlias" src/type_aliases.py` returns nothing |
| VC3 | Zero `dict[str, Any]` parameter types in internal files | grep returns 0 |
| VC4 | Zero `Any` parameter types in internal files | grep returns 0 |
| VC5 | Zero `Optional[T]` return types | grep returns 0 |
| VC6 | Zero `hasattr(f, ...)` entity dispatch checks | grep returns 0 |
| VC7 | `self.files` is always `List[FileItem]` | `git grep -E "self\.files\.append\(" -- 'src/app_controller.py'` shows ONLY FileItem appends |
| VC8 | `flat_config` returns typed `ProjectContext` | New dataclass exists; return type fixed |
| VC9 | `rag_engine.search()` returns `List[RAGChunk]` | Return type fixed; 3 consumers updated |
| VC10 | All 7 audit gates pass | All exit 0 |
| VC11 | 10/11 batched test tiers PASS | `scripts/run_tests_batched.py` → 10/11 |
| VC12 | Effective codepaths < 1e+18 | 4+ orders of magnitude drop |
| VC13 | Boundary layer audit written | `docs/reports/boundary_layer_20260628.md` exists |
| VC14 | The 12 per-aggregate dataclasses used at their specific paths | grep shows direct attribute access everywhere |

## Why this is the FINAL track (no more followups)

After this track:

1. **`Metadata` is a typed fat struct**, used ONLY at the literal TOML/JSON boundary (2 places in the entire codebase).
2. **Every internal function takes a typed dataclass**: no `Any`, no `dict[str, Any]`.
3. **No runtime type dispatch**: no `hasattr()` for entity type checks, no `isinstance()` for entity dispatch.
4. **No null**: `Result[T]` + `NIL_T` sentinels per `error_handling.md`.
5. **No `.get()` on known fields**: direct attribute access.
6. **The metric drops by 4+ orders of magnitude** because dispatcher functions lose their polymorphic branches.

The conventions are ENFORCED:

- Every new function signature MUST declare typed parameters (no `Any`).
- Every new dataclass goes in `src/type_aliases.py` (type-system) or the appropriate parent module (in-module).
- Every wire boundary (TOML/JSON parse) is the ONLY place `Metadata` (the typed fat struct) appears.
- Every consumer of a wire boundary IMMEDIATELY converts to a componentized dataclass via `from_dict()`.

Future code that wants to receive raw data MUST:

- Add a `from_dict()` classmethod to the appropriate dataclass (or create a new one)
- Convert at the wire boundary
- Expose only the typed dataclass to internal code

This is C11/Odin/Jai semantics in Python. As fast as Python can be.

## See also

- `conductor/code_styleguides/data_oriented_design.md`: the canonical DOD reference (Mike Acton, Ryan Fleury, Casey Muratori)
- `conductor/code_styleguides/error_handling.md`: `Result[T]` + `NIL_T` convention
- `conductor/code_styleguides/type_aliases.md` §2.5: the per-aggregate dataclass rule
- `docs/reports/FOLLOWUP_metadata_promotion_20260624.md`: the prior Tier 1 review (the root cause analysis)
- `conductor/tracks/metadata_promotion_20260624/spec.md`: the track that added the 12 componentized dataclasses
- `conductor/tracks/type_alias_unfuck_20260626/spec.md`: the track that migrated the consumer sites (with the `isinstance` cruft this track removes)
- `src/type_aliases.py`: the boundary type (`Metadata`) and the 12 componentized dataclasses
- `src/models.py:533`: `FileItem` (canonical in-module dataclass)
- `src/models.py:302`: `Ticket` (canonical in-module dataclass)
- `src/openai_schemas.py`: `ToolCall`, `ChatMessage`, `UsageStats` (canonical provider-side dataclasses)
- `conductor/AGENTS.md`: hard bans (NEVER use `git restore`, `git checkout --`, `git reset`, `git revert`)