Track: Data Structure Strengthening (Type Aliases + NamedTuples)

Status: Active (spec approved 2026-06-06)
Initialized: 2026-06-06
Owner: Tier 2 Tech Lead
Priority: Medium (developer + AI-readability; not a regression blocker)


1. Overview

This track introduces a small, focused set of TypeAlias definitions in a new src/type_aliases.py module and replaces 370+ anonymous dict[str, Any] / list[dict[...]] usages across 6 high-traffic files (src/ai_client.py, src/app_controller.py, src/models.py, src/api_hook_client.py, src/project_manager.py, src/aggregate.py). It also converts 2-3 tuple returns to NamedTuples for self-documenting struct semantics.

In addition, the track introduces a new docs/type_registry/ directory that contains auto-generated documentation describing the fields of every TypeAlias, NamedTuple, @dataclass, and TypedDict in src/. A new script scripts/generate_type_registry.py reads src/ via AST and writes the docs. The coding agent runs this script as part of track completion; its --check mode detects drift and is meant to run in CI (the CI wiring itself is the follow-up track described in §12.1).

The track is data-grounded: a new AST-based audit script (scripts/audit_weak_types.py, committed in 84fd9ac9) found 430 weak type sites across 29 of 61 files. After whitespace normalization, only 26 unique type strings exist; the top 4 (list[dict[str, Any]], dict[str, Any], Dict[str, Any], List[Dict[str, Any]]) account for 86% of findings. A small set of well-named aliases eliminates the vast majority.
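
To make the audit idea concrete, here is a minimal sketch of an AST-based scan for weak annotations. It is not the committed scripts/audit_weak_types.py; the names WEAK_TYPES and find_weak_annotations are hypothetical, and the sketch only matches annotations that are exactly one of the top-4 strings (it does not look inside Optional[...] or other wrappers).

```python
import ast
from collections import Counter

# Hypothetical set of "weak" annotation strings: the top-4 patterns named above.
WEAK_TYPES = {
    "dict[str, Any]",
    "Dict[str, Any]",
    "list[dict[str, Any]]",
    "List[Dict[str, Any]]",
}


def find_weak_annotations(source: str) -> Counter[str]:
    """Count weak annotations in one module's source text."""
    counts: Counter[str] = Counter()
    for node in ast.walk(ast.parse(source)):
        annotations: list[ast.expr] = []
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = node.args
            all_args = args.posonlyargs + args.args + args.kwonlyargs
            annotations += [a.annotation for a in all_args if a.annotation]
            if node.returns:
                annotations.append(node.returns)
        elif isinstance(node, ast.AnnAssign):
            annotations.append(node.annotation)
        for ann in annotations:
            # ast.unparse emits canonical spacing, which normalizes whitespace.
            text = ast.unparse(ann)
            if text in WEAK_TYPES:
                counts[text] += 1
    return counts


sample = '''
from typing import Any, Dict

def get_comms_log() -> list[dict[str,   Any]]:
    return []

def send(payload: Dict[str, Any], retries: int = 3) -> None:
    cache: dict[str, Any] = {}
'''
print(find_weak_annotations(sample))
```

Because the comparison runs on the unparsed annotation rather than the raw source text, spacing differences collapse into a single type string, which is the whitespace normalization the audit relies on.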

The current codebase has ZERO strong type aliases (no TypeAlias, no NamedTuple, no pydantic.BaseModel for these shapes). This is the worst case for AI readability — an LLM reading the code has zero schema hints and must guess the shape from usage at every call site.

Scope is deliberately bounded. The track adds 6 type aliases, converts 2-3 tuple returns to NamedTuples, and introduces the type registry generator + initial generated docs. It does NOT migrate to TypedDict or @dataclass schemas (the registry generator captures the field information in docs form, with much lower upfront cost). It does NOT touch the 23 lower-impact files; they remain as dict[str, Any] until a future track migrates them.

1.1 Why docs over TypedDict

The original draft of this spec proposed a follow-up track "TypedDict / dataclass Migration" that would convert every Metadata alias into a TypedDict with explicit fields. After user feedback, this was replaced with the type-registry approach for three reasons:

  1. Lower upfront cost. TypedDict requires designing the schema for every type. The registry generator reads what already exists in code and writes it to docs. No schema design needed.
  2. Better fit for AI workflow. An LLM that needs to know the fields of CommsLogEntry can cat docs/type_registry/ai_client.md once, then use the field info. The cost is a few hundred tokens of context, paid only when the LLM needs the schema.
  3. Auto-maintained. The script runs as part of track completion, and its --check mode exits 1 when the committed registry no longer matches the code. Drift is therefore caught as soon as the check runs; the agent fixes it by regenerating the docs.

The "cost we eat" is the LLM reading the docs at query time. This is bounded (a few hundred tokens per query) and proportional to the actual information need.

2. Goals (Priority Order)

| Priority | Goal | Rationale |
|---|---|---|
| A (primary value) | Add 6 TypeAlias definitions to src/type_aliases.py: Metadata, CommsLogEntry, CommsLog, FileItem, FileItems, HistoryMessage. | Each alias names a concept that currently appears as dict[str, Any] or list[dict[str, Any]] in 30+ sites. The name is self-documenting; the underlying type is the same. |
| A (primary value) | Mechanical replacement of 370+ weak sites in 6 files: src/ai_client.py, src/app_controller.py, src/models.py, src/api_hook_client.py, src/project_manager.py, src/aggregate.py. | The audit shows 86% of findings are in these 6 files. A focused refactor here eliminates the bulk of the noise. |
| B (architectural) | The new aliases are the canonical names going forward. New code MUST use the aliases. Old code is migrated opportunistically (this track + future tracks). | One source of truth. The audit script (scripts/audit_weak_types.py) becomes a permanent CI gate that fails when new weak types are introduced. |
| B (architectural) | Audit script exits 0 with significantly fewer findings after the refactor. Re-running --json should show the count drop from 430 to ~60 (only the 23 lower-impact files remain). | Measurable success criterion. The audit script is the ground truth. |
| C (optimization) | Convert 2-3 tuple returns to NamedTuples. Specifically: the Tuple[refreshed, changed] return of _reread_file_items() becomes a FileItemsDiff NamedTuple. Other 1-occurrence tuples (screen coords, etc.) are converted opportunistically. | The tuple return pattern is rarer than the dict pattern (4 sites vs 430), but each conversion is high-value for self-documentation. |
| C (documentation) | Add a short "Data Structure Conventions" section to conductor/product-guidelines.md and a new conductor/code_styleguides/type_aliases.md reference. | The convention is visible in the project-level guidance. Future plans reference it. |
| C (innovation) | New docs/type_registry/ directory with auto-generated documentation describing the fields of every TypeAlias, NamedTuple, @dataclass, and TypedDict in src/. New script scripts/generate_type_registry.py reads src/ via AST and writes the docs. The script has a --check mode for CI: exits 1 if the registry would change. The coding agent runs the script as part of track completion. | The "docs over TypedDict" tradeoff: pay a small token cost at AI-query time (the LLM cats the docs) instead of a large upfront cost (designing TypedDict schemas for every type). See §1.1. |
| D (forward-looking) | Plan a future "Registry Maintenance" track that promotes the type-registry generation to a CI gate (fail if --check reports drift). The registry becomes part of every track's commit workflow. | NOT in this track; documented in §12.1. The track ships the registry; the future track wires it into CI / track-completion workflows. |

2.1 Non-Goals (this track)

  • Not converting dict[str, Any] to TypedDict or @dataclass directly in code. The type registry (added in Phase 2) captures the field information in docs form; a future track may convert the most-used aliases to TypedDict (giving schema hints via type hints instead of via docs), but that is a separate decision.
  • Not touching the 23 lower-impact files. They stay as dict[str, Any] until a future incremental track migrates them. The audit script makes their weakness VISIBLE so the cost of ignoring them is documented.
  • Not changing the Result[T] pattern from the data_oriented_error_handling_20260606 track. The aliases complement Result; they don't replace it. (ErrorInfo is a @dataclass, not a TypeAlias; it's already structured.)
  • Not adding pydantic models. The project doesn't currently use pydantic for these shapes; introducing it would be a much larger architectural decision.
  • Not modifying the data_oriented_error_handling_20260606 track's src/result_types.py. The aliases live in a new file (src/type_aliases.py); they coexist with Result/ErrorInfo.
  • Not changing the public API of any function. The aliases are TYPE-LEVEL ONLY; runtime behavior is identical.

3. Architecture

3.1 The Aliases

src/type_aliases.py (NEW, ~80 lines):

from typing import Any, Callable, TypeAlias

# A single key-value record. The shape is intentionally open (Any value type)
# because different concepts use different value types (str for paths, int for
# counts, dict for nested structures, etc.). The name documents the SEMANTIC
# ROLE, not the structural shape.
Metadata: TypeAlias = dict[str, Any]

# A single entry in the AI comms log (the in-memory ring buffer of API
# requests/responses/timestamps/kind/direction). Used by _comms_log,
# _append_comms, get_comms_log, comms_log_callback, etc.
CommsLogEntry: TypeAlias = Metadata

# A list of comms log entries.
CommsLog: TypeAlias = list[CommsLogEntry]

# A single entry in the AI provider's conversation history (the messages
# list passed to/from OpenAI/Anthropic/Gemini). Used by _anthropic_history,
# _deepseek_history, _minimax_history, _grok_history, _llama_history, etc.
HistoryMessage: TypeAlias = Metadata

# A list of history messages.
History: TypeAlias = list[HistoryMessage]

# A single file item in the context (path, content, is_image flag, base64
# data, mtime). Used by file_items parameter (the most-threaded list in
# the codebase), _reread_file_items, _build_file_context_text, etc.
FileItem: TypeAlias = Metadata

# A list of file items. The most common weak pattern in the codebase.
FileItems: TypeAlias = list[FileItem]

# A single tool definition (function name, description, parameters schema).
# Used by _build_anthropic_tools, _CACHED_ANTHROPIC_TOOLS, _get_anthropic_tools,
# and the corresponding openai-compatible / gemini / deepseek builders.
ToolDefinition: TypeAlias = Metadata

# A single tool call from the model (id, type, function: {name, arguments}).
# Used by response.tool_calls parsing across all providers.
ToolCall: TypeAlias = Metadata

# A callback that receives a comms log entry. Used by comms_log_callback,
# confirm_and_run_callback, etc.
CommsLogCallback: TypeAlias = Callable[[CommsLogEntry], None]

3.2 The NamedTuples (Phase 2)

src/type_aliases.py (continued):

from typing import NamedTuple

# Return type of _reread_file_items. The two lists are conceptually distinct:
# refreshed = items whose mtime was checked and the content re-read; changed =
# items whose content actually changed (subset of refreshed).
class FileItemsDiff(NamedTuple):
    refreshed: FileItems
    changed: FileItems

(Optional, if 1-2 more tuple returns warrant conversion — e.g., Optional[Tuple[int, int, int, int]] for screen coords, etc. — add them as separate NamedTuples with semantic names.)

3.3 Why These Specific Aliases

The 6 aliases were chosen to be concept-distinct: each names a different semantic role that the code uses. Using the same name (Metadata) for all of them would collapse the semantic distinction; using 30 names would exceed the AI's vocabulary budget. 6 is the sweet spot:

| Alias | Semantic role | Distinct from |
|---|---|---|
| Metadata | generic key-value record | (root) |
| CommsLogEntry | a single comms log entry | HistoryMessage (different lifecycle) |
| HistoryMessage | a single AI provider history message | CommsLogEntry (different lifecycle) |
| FileItem | a single file in the context | ToolDefinition (different shape: paths vs function specs) |
| ToolDefinition | a single tool definition | FileItem, ToolCall |
| ToolCall | a single tool call from the model | ToolDefinition (definition vs invocation) |

Some of these are aliased to Metadata (e.g., CommsLogEntry: TypeAlias = Metadata). This is intentional: a future track can convert Metadata to a TypedDict (or split it into per-concept TypedDicts) while the alias names at every call site stay the same. The aliases are STABLE NAMES; the underlying type can evolve.

3.4 Module Layout

src/
  type_aliases.py              # NEW: the TypeAliases from §3.1 + 1-3 NamedTuples
  ai_client.py                 # MODIFIED: import aliases; replace ~139 weak sites
  app_controller.py            # MODIFIED: import aliases; replace ~86 weak sites
  models.py                    # MODIFIED: import aliases; replace ~51 weak sites
  api_hook_client.py           # MODIFIED: import aliases; replace ~32 weak sites
  project_manager.py           # MODIFIED: import aliases; replace ~20 weak sites
  aggregate.py                 # MODIFIED: import aliases; replace ~17 weak sites
  mcp_client.py                # UNCHANGED (only 9 weak sites; below the threshold)

docs/
  type_registry/
    index.md                   # NEW (generated): top-level TOCs
    type_aliases.md            # NEW (generated): the 10 TypeAliases + 1 NamedTuple
    ai_client.md               # NEW (generated): per-source-file reference
    app_controller.md          # NEW (generated)
    models.md                  # NEW (generated)
    api_hook_client.md         # NEW (generated)
    project_manager.md         # NEW (generated)
    aggregate.md               # NEW (generated)
    result_types.md            # NEW (generated): from data_oriented_error_handling_20260606

conductor/
  product-guidelines.md        # MODIFIED: new "Data Structure Conventions" section
  code_styleguides/
    type_aliases.md            # NEW: the canonical reference

scripts/
  audit_weak_types.py          # already committed in 84fd9ac9; runs as CI gate
  generate_type_registry.py    # NEW: AST-based registry generator

tests/
  test_type_aliases.py         # NEW: verify the aliases import and resolve to the right types
  test_generate_type_registry.py # NEW: verify the generator's regex/AST patterns and output format
  (existing test files):       # MODIFIED: update the 6 files; existing tests should pass unchanged

3.5 Coexistence with Result[T] and ErrorInfo

The new Metadata family aliases are VALUE-LEVEL types (what's in a dict). The Result[T] from data_oriented_error_handling_20260606 is a CONTROL-LEVEL wrapper (a data struct that includes errors). They compose:

# Data-oriented error handling returns:
Result[CommsLogEntry]   # a Result wrapping a single comms log entry
Result[History]         # a Result wrapping a list of history messages
Result[FileItems]       # a Result wrapping a list of file items

# The aliases name the "T" in Result[T], not the Result itself.

This is consistent: Result is a generic that wraps any data type. Naming the data types (via TypeAlias) makes the generic concrete without changing the Result pattern.

3.6 Type Registry (Auto-Generated Docs)

scripts/generate_type_registry.py is a new AST-based tool that reads src/ and writes docs/type_registry/. It runs as part of track completion (manually by the coding agent); its --check mode is designed for automated use in CI (the CI wiring is the follow-up track in §12.1).

Output structure:

docs/type_registry/
  index.md              # top-level: full table of contents + summary
  type_aliases.md       # the 10 TypeAliases from src/type_aliases.py
  ai_client.md          # per-source-file: all dataclasses, NamedTuples, TypeAliases defined or used here
  app_controller.md
  models.md
  api_hook_client.md
  project_manager.md
  aggregate.md
  ...
  (one .md per source file that has structs)

Script behavior:

# Generate / regenerate the registry (default mode)
python scripts/generate_type_registry.py

# Verify the registry is up-to-date (CI mode; exits 1 if drift)
python scripts/generate_type_registry.py --check

# Dry run: print what would change without writing
python scripts/generate_type_registry.py --diff
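
One way the --check contract could be implemented is sketched below, assuming the generator first builds a mapping of file name to markdown text in memory. find_drift and check_exit_code are hypothetical names, and this sketch does not flag leftover files on disk that the generator no longer produces.

```python
import tempfile
from pathlib import Path


def find_drift(generated: dict[str, str], docs_dir: Path) -> list[str]:
    """Return names of registry files whose on-disk text differs from `generated`."""
    stale = []
    for name, text in sorted(generated.items()):
        path = docs_dir / name
        if not path.exists() or path.read_text(encoding="utf-8") != text:
            stale.append(name)
    return stale


def check_exit_code(generated: dict[str, str], docs_dir: Path) -> int:
    """Mirror the --check contract: 0 when up to date, 1 on drift."""
    return 1 if find_drift(generated, docs_dir) else 0


with tempfile.TemporaryDirectory() as tmp:
    docs = Path(tmp)
    (docs / "models.md").write_text("# models\n", encoding="utf-8")
    fresh = {"models.md": "# models\n"}
    changed = {"models.md": "# models\n", "aggregate.md": "# aggregate\n"}
    print(check_exit_code(fresh, docs), find_drift(changed, docs))
```

The same list of stale names could back the --diff mode, which reports what would change without writing.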

For each @dataclass in src/, the script writes a section like:

## `src/models.py::Ticket`

**Kind:** `@dataclass`
**Fields:**
- `id: str` — unique ticket identifier
- `title: str` — human-readable title
- `status: str = "todo"` — current status
- `priority: int = 0` — priority for queue ordering
- `created_at: datetime.datetime` — when created
- `dependencies: list[str] = field(default_factory=list)` — ticket IDs this depends on
- `metadata: Metadata` — opaque key-value metadata (see type_aliases.md)

(Note: docstrings on fields are extracted from the source to provide the "—" descriptions. Fields without docstrings are documented with their name only.)

For each TypeAlias, the script writes a section like:

## `src/type_aliases.py::CommsLogEntry`

**Kind:** `TypeAlias`
**Resolves to:** `Metadata`
**Used by:** `_comms_log`, `_append_comms`, `get_comms_log`, `comms_log_callback`, ...

**Note:** `CommsLogEntry` is a semantic alias for `Metadata`. For the canonical field semantics, see [`Metadata`](#metadata) (which is itself a generic `dict[str, Any]` until a future track converts it to a `TypedDict`).

For each NamedTuple, the script writes a section like:

## `src/type_aliases.py::FileItemsDiff`

**Kind:** `NamedTuple`
**Fields:**
- `refreshed: FileItems` — items whose mtime was checked and content re-read
- `changed: FileItems` — items whose content actually changed (subset of refreshed)

For each function that returns a structured type, the script documents the return type signature (using ast.unparse on the return annotation).
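
The @dataclass case can be sketched in a few lines of ast code. This is an illustration under assumptions, not the planned generator: is_dataclass_def and render_dataclasses are hypothetical names, only top-level classes are scanned, and the per-field descriptions are omitted because the extraction of field docstrings is not specified here. Note that ast.unparse re-renders default values, so string quoting may differ from the source.

```python
import ast


def is_dataclass_def(node: ast.ClassDef) -> bool:
    """True for @dataclass, @dataclass(...), and @dataclasses.dataclass."""
    for dec in node.decorator_list:
        target = dec.func if isinstance(dec, ast.Call) else dec
        if isinstance(target, ast.Attribute):
            name = target.attr
        else:
            name = getattr(target, "id", "")
        if name == "dataclass":
            return True
    return False


def render_dataclasses(source: str, module_path: str) -> str:
    """Render one markdown section per top-level dataclass found in `source`."""
    lines: list[str] = []
    for node in ast.parse(source).body:
        if not (isinstance(node, ast.ClassDef) and is_dataclass_def(node)):
            continue
        lines += [f"## `{module_path}::{node.name}`", "", "**Kind:** `@dataclass`", "**Fields:**"]
        for stmt in node.body:
            if isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name):
                field_text = f"{stmt.target.id}: {ast.unparse(stmt.annotation)}"
                if stmt.value is not None:
                    field_text += f" = {ast.unparse(stmt.value)}"
                lines.append(f"- `{field_text}`")
        lines.append("")
    return "\n".join(lines)


sample = '''
from dataclasses import dataclass, field

@dataclass
class Ticket:
    id: str
    status: str = "todo"
    dependencies: list[str] = field(default_factory=list)
'''
print(render_dataclasses(sample, "src/models.py"))
```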

3.7 Why Per-Source-File Docs (not one giant file)

A per-source-file layout matches the project's per-source-file guide structure (docs/guide_ai_client.md, docs/guide_mcp_client.md, etc.). The coding agent reads docs/type_registry/ai_client.md when working in src/ai_client.py — locality of reference. The index.md provides the cross-cutting view.

The "token cost we eat" per LLM query is bounded: a typical source file's registry is 200-500 lines of markdown. The LLM reads it once and caches the schema in context. Subsequent references to the same types don't re-fetch.

4. Per-File Refactor Plan

4.1 src/ai_client.py (139 sites — largest offender)

Pattern: _anthropic_history: list[dict[str, Any]] (and 5 sibling histories), _comms_log: deque[dict[str, Any]], get_comms_log -> list[dict[str, Any]], _build_anthropic_tools -> list[dict[str, Any]], _reread_file_items -> tuple[list[...], list[...]], etc.

Refactor strategy:

  • Replace all 79 dict[str, Any] / Dict[str, Any] with Metadata or the more specific alias.
  • Replace all 56 list[dict[...]] with CommsLog / History / FileItems / ToolDefinitions based on the SEMANTIC ROLE of the list. (ToolDefinitions means list[ToolDefinition]; §3.1 currently defines only the singular alias.)
  • Replace the 2 Optional[List[Dict[...]]] sites with the matching optional alias: Optional[FileItems], or Optional[ToolDefinitions] for _CACHED_ANTHROPIC_TOOLS.
  • Replace the 2 tuple-literal returns (the cast(...) patterns in _dispatch_tool) with ToolCall extraction.

Naming heuristic: for each list of dicts, look at the variable name + the function name to determine the semantic role. E.g., _comms_log → CommsLog; _anthropic_history → History; _build_anthropic_tools → ToolDefinitions; _reread_file_items(file_items: list[...]) → FileItems.
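
The heuristic can be expressed as a small keyword lookup, which may be useful as a first pass before manual review. This is only a sketch: suggest_list_alias and its keyword rules are hypothetical, and real sites will still need a human or agent to confirm the semantic role.

```python
def suggest_list_alias(name: str) -> str:
    """Map a variable or function name to a suggested alias for its list-of-dicts type."""
    # Order matters: the first matching keyword wins. These rules are
    # illustrative, not the project's actual mapping.
    rules = [
        ("comms", "CommsLog"),
        ("history", "History"),
        ("file_items", "FileItems"),
        ("tools", "ToolDefinitions"),
    ]
    lowered = name.lower()
    for keyword, alias in rules:
        if keyword in lowered:
            return alias
    # No semantic role detected: fall back to the generic alias.
    return "list[Metadata]"


names = [
    "_comms_log",
    "_anthropic_history",
    "_build_anthropic_tools",
    "_reread_file_items",
    "payload_rows",
]
for name in names:
    print(f"{name} -> {suggest_list_alias(name)}")
```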

4.2 src/app_controller.py (86 sites)

Pattern: Some fields are already strong or close to it: _pending_dialog: Optional[ConfirmDialog] = None stays as-is (this is a STRONG type already), and last_error: Optional[Dict[str, str]] = None could become Optional[ErrorInfo] from the data_oriented_error_handling_20260606 track. Most weak sites are in the Hook API request/response payloads and the pre_tool_callback family.

Refactor strategy:

  • The 62 dict_str_any sites: replace with Metadata or CommsLogEntry based on context.
  • The 20 list_of_dict sites: replace with the appropriate alias.
  • The 4 optional_dict sites: replace with Optional[Metadata] (or Optional[CommsLogEntry] if the context is the hook request payload).

4.3 src/models.py (51 sites)

Pattern: Dataclass fields. Fields such as script: Optional[str] = None and target_file: Optional[str] = None are already STRONG and stay as-is; the weak sites are the many dataclass fields typed Optional[Dict[str, Any]].

Refactor strategy: Replace the 48 dict_str_any sites with Metadata (so Optional[Dict[str, Any]] becomes Optional[Metadata]); replace the 3 list_of_dict sites with the appropriate alias.

4.4 src/api_hook_client.py (32 sites)

Pattern: HTTP request/response payloads. E.g., payload: Dict[str, Any], data: dict[str, Any].

Refactor strategy: 30 dict_str_any → Metadata; 2 list_of_dict → list[Metadata].

4.5 src/project_manager.py (20 sites)

Pattern: TOML config dicts. E.g., proj: dict[str, Any], data: dict[str, Any].

Refactor strategy: 16 dict_str_any → Metadata; 3 list_of_dict → list[Metadata]; 1 optional_dict → Optional[Metadata].

4.6 src/aggregate.py (17 sites)

Pattern: Aggregation result dicts. E.g., result: dict[str, list[dict[str, Any]]].

Refactor strategy: 10 dict_str_any → Metadata; 7 list_of_dict → appropriate alias.

4.7 Phase 2 NamedTuple conversions

  • _reread_file_items in src/ai_client.py (returns Tuple[List[FileItem], List[FileItem]]) → returns FileItemsDiff. Affects ~3-4 call sites.
  • 1-2 screen-coord tuples (1-occurrence each) — opportunistic. If the call site is clear and the names are obvious, convert; otherwise leave.

5. The Audit Script as a Permanent CI Gate

After this track, the audit script becomes a permanent CI gate. In its default mode, scripts/audit_weak_types.py exits 0 even when findings exist (it is informational), so the CI gate uses a stricter mode:

# New mode: --strict, exits 1 if any new weak site is added in a PR
python scripts/audit_weak_types.py --strict

The --strict mode compares the current count to a baseline (stored in scripts/audit_weak_types.baseline.json). If the current count is HIGHER than the baseline, exit 1. The baseline is regenerated after this track to the post-refactor count (~60 findings, only the 23 lower-impact files remain).
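
The baseline comparison itself is small. The sketch below assumes the baseline file holds a single total under a "total" key; the real format of scripts/audit_weak_types.baseline.json is not specified in this spec, and strict_exit_code is a hypothetical name.

```python
import json


def strict_exit_code(current_count: int, baseline_json: str) -> int:
    """Return 1 when the current finding count exceeds the baseline, else 0."""
    # Assumed baseline shape: {"total": <int>}; the real file format is not specified.
    baseline = json.loads(baseline_json)["total"]
    return 1 if current_count > baseline else 0


baseline_text = json.dumps({"total": 60})
print(
    strict_exit_code(60, baseline_text),
    strict_exit_code(58, baseline_text),
    strict_exit_code(61, baseline_text),
)
```

A total-count gate like this passes when a PR adds weak sites in one place and removes as many elsewhere, which is one motivation for the per-file variant raised in the open questions.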

The --strict mode does not exist yet; it is implemented as part of this track (Phase 1 final task). Once the gate is active, PRs that push the count above the baseline by introducing new dict[str, Any] or anonymous tuples will fail CI.

6. Configuration

No new dependencies. No new environment variables. No new config files.

The aliases live in src/type_aliases.py (pure stdlib typing.TypeAlias).

7. Testing Strategy

| Test File | Purpose | Coverage Target |
|---|---|---|
| tests/test_type_aliases.py | Verify the aliases import; verify they resolve to the expected types; verify they compose with Result[T] (e.g., Result[FileItems] is a valid generic). | 100% |
| tests/test_audit_weak_types.py | Verify the audit script's regex patterns are correct; verify the Finding dataclass is populated correctly; verify the report matches expectations. | 90% |
| tests/test_ai_client.py (existing) | Verify no regressions after the 139-site replacement. | 100% (regression) |
| tests/test_app_controller.py (existing) | Verify no regressions after the 86-site replacement. | 100% (regression) |
| tests/test_models.py (existing) | Verify no regressions after the 51-site replacement. | 100% (regression) |
| tests/test_api_hook_client.py (existing) | Verify no regressions after the 32-site replacement. | 100% (regression) |
| tests/test_project_manager.py (existing) | Verify no regressions after the 20-site replacement. | 100% (regression) |
| tests/test_aggregate.py (existing) | Verify no regressions after the 17-site replacement. | 100% (regression) |
| tests/test_mcp_client.py (existing) | Verify no regressions. (mcp_client is unchanged but the aliases may be adopted opportunistically in Phase 1.5 if convenient.) | 100% (regression) |

Mocking strategy: Existing tests use unittest.mock.patch; no changes needed.

Audit baseline check: After Phase 1, the audit script should report 0 NEW findings. The count may not drop as far as expected if a few sites were missed, but the trend must be DOWN. After Phase 2, the count should be at or below the pre-track baseline minus 50 (the targeted reductions).

8. Migration / Rollout

| Phase | What | Risk |
|---|---|---|
| Phase 1: Aliases + 6-file replacement + audit baseline | Add src/type_aliases.py. Add tests/test_type_aliases.py. Mechanical replacement in 6 files. Add --strict mode to the audit script. Generate the new baseline. | Medium. ~345 sites of mechanical replacement. Mitigated by existing test coverage. |
| Phase 2: NamedTuples + type registry generator + initial docs + archive | Convert 2-3 tuple returns to NamedTuples. Add scripts/generate_type_registry.py + the initial generated registry in docs/type_registry/. Add tests for the generator. Add conductor/code_styleguides/type_aliases.md and update product-guidelines.md. Manual smoke test. Archive the track. | Low. ~3-4 sites of tuple conversion. Generator is a self-contained AST tool. Docs-only changes. |

Each phase has its own checkpoint commit and git note.

9. Risks & Mitigations

| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Mechanical replacement misses a few sites; the count doesn't drop as expected. | Medium | Low | The audit script is the source of truth. Re-run after Phase 1; investigate any anomalies. |
| Renaming dict[str, Any] to Metadata (or another alias) changes how some tests introspect types (e.g., isinstance(x, dict)). | Low | Medium | The aliases are TYPE-LEVEL ONLY; at runtime, Metadata IS dict[str, Any] IS dict. isinstance(x, dict) continues to work. Test cases that use get_type_hints() may need updating; documented in the test plan. |
| A future contributor adds a new dict[str, Any] and the audit script doesn't catch it. | Low | Low | The audit script's regex patterns are exhaustive for the current 430 findings. New patterns (e.g., a new Mapping[str, Any]) would be missed. The track documents the patterns the script knows; future contributions of new patterns warrant extending the script. |
| The aliases conflict with the Result[T] and ErrorInfo from the data_oriented_error_handling track. | Low | Low | The aliases are VALUE-LEVEL (data types); Result and ErrorInfo are CONTROL-LEVEL (wrappers). They compose: Result[FileItems] is valid. No conflict. |
| The 6-file mechanical replacement is too large to review in one PR. | Medium | Low | Phase 1 is split into 6 sub-tasks (one per file) in the plan, each with its own commit. Reviewers can review file-by-file. |
| The 23 lower-impact files are NEVER migrated. | High | Low (acceptable) | The audit script stays in the codebase as a permanent CI gate. The cost of ignoring the 23 files is now VISIBLE. Future tracks can pick them up opportunistically. |
| The docs/type_registry/ docs drift from the actual code. | Medium | Medium (LLM reads stale info) | The --check mode of the generator exits 1 if the registry would change. The coding agent runs the generator before each track's commit. A follow-up track (type_registry_ci_20260606) will wire --check into CI. |

10. Out of Scope (Explicit)

  • TypedDict / @dataclass migration of the Metadata family. The type registry (added in Phase 2) captures the field information in docs form, with much lower upfront cost than TypedDict migration. A future track MAY convert the most-used aliases to TypedDict (giving the AI schema hints via type hints instead of via docs); this is a separate decision.
  • The 23 lower-impact files (those with 1-9 weak sites each). Deferred; will be addressed opportunistically or in a future incremental track.
  • Adding pydantic models. Not requested; would be a much larger architectural decision.
  • Changing function signatures at the runtime level. The aliases are TYPE-LEVEL; runtime behavior is identical.
  • Modifying scripts/audit_weak_types.py's regex patterns. The patterns are correct for the current findings. If new patterns emerge, a future track can extend the script.
  • Migrating the data_oriented_error_handling_20260606 track's src/result_types.py aliases. The 2 type-aliases modules are SEPARATE: result_types.py has ErrorInfo / Result / ErrorKind; type_aliases.py has Metadata / CommsLog / FileItem / etc. They don't overlap.

11. Open Questions

  1. How many aliases: 10, 6, or 4? §3.1 actually defines 10 names: Metadata, CommsLogEntry, CommsLog, HistoryMessage, History, FileItem, FileItems, ToolDefinition, ToolCall, CommsLogCallback. Should we cut to 4-6 to minimize the AI vocabulary? (Proposal: keep all 10; each is named for a distinct concept, and the names are self-explanatory. The "vocabulary cost" is the same as adding 10 new function names to a module, which is well within normal Python codebase scale.)
  2. Should FileItem and ToolDefinition be TypedDict from the start? A TypedDict gives the AI field-level hints, not just a name. But introducing TypedDict requires knowing the FIELDS, which is a deeper semantic task. (Proposal: Phase 1 uses TypeAlias = dict[str, Any]; Phase 2 of a future track converts to TypedDict. Keeps the current track scope tight.)
  3. Should the audit script enforce a count threshold (e.g., "no more than 100 weak sites total") or a per-file threshold (e.g., "no file may have more than 50 weak sites")? (Proposal: per-file threshold is more actionable. A future PR that introduces 20 new dict[str, Any] in foo.py would fail even if the total count didn't increase.)
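
To illustrate why a per-file check is more actionable than a total, here is a sketch of one possible reading of question 3: compare each file's count against its own baseline instead of a fixed per-file cap. files_over_baseline and the counts below are hypothetical (foo.py is the placeholder file from the question).

```python
def files_over_baseline(current: dict[str, int], baseline: dict[str, int]) -> dict[str, int]:
    """Return {path: increase} for every file whose weak-site count grew."""
    regressions = {}
    for path, count in current.items():
        allowed = baseline.get(path, 0)  # New files start with an allowance of zero.
        if count > allowed:
            regressions[path] = count - allowed
    return regressions


baseline = {"src/mcp_client.py": 9, "src/foo.py": 0}
current = {"src/mcp_client.py": 4, "src/foo.py": 5}

# The total stays at 9, so a total-count gate would pass...
print(sum(current.values()) <= sum(baseline.values()))
# ...but the per-file comparison still flags the regression in foo.py.
print(files_over_baseline(current, baseline))
```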

12. See Also

12.1 Follow-up Track (planned; not in this spec)

"Registry Maintenance & CI Integration" (type_registry_ci_20260606 or similar) — promotes the type-registry generator from a manual track-completion step to a CI gate. The track:

  • Wires python scripts/generate_type_registry.py --check into CI; the PR fails if the registry is stale.
  • Adds the registry to the per-track commit workflow: the coding agent runs the generator before marking a track complete, and includes the registry diff in the commit.
  • Optionally adds a pre-commit hook that runs the generator and stages the diff.
  • The "Type Registry Maintenance" track is the natural follow-up. Prerequisites: this track (so the generator exists and is tested).

12.2 Project References

  • scripts/audit_weak_types.py (already committed; 84fd9ac9) — the audit that found 430 weak sites.
  • docs/guide_testing.md — test conventions.
  • conductor/code_styleguides/error_handling.md (created in the data_oriented_error_handling_20260606 track) — the convention for Result types; the new type-aliases convention lives alongside.
  • conductor/product-guidelines.md "Data-Oriented Error Handling" — the convention this track extends (Data Structure Strengthening is a new top-level convention in the same family).
  • conductor/tracks/data_oriented_error_handling_20260606/ — the previous track that established the convention format; this track uses the same pattern.

12.3 External References

  • Python typing.TypeAlias — the canonical mechanism for type aliases (PEP 613, Python 3.10+).
  • Python typing.NamedTuple — for tuple-with-fields.
  • Python typing.TypedDict: for a possible future TypedDict migration (not in this track).
  • Mike Acton on data-oriented design — the "data is the API" framing that motivates NAMING data structures clearly.
  • Casey Muratori on module layer boundaries — the convention that each module owns its data and exposes a clear interface.