Private
Public Access
0
0
Files
manual_slop/docs/reports/FOLLOWUP_module_taxonomy_20260627.md
T
ed b1ee947b32 docs(reports): FOLLOWUP_module_taxonomy_20260627 v2.1 - AGENT_TOOL_NAMES is redundant
User: 'isn't AGENT_TOOL_NAMES a redundant thing thats directly associated
with the mcp_client.py?' - YES, confirmed.

The existing test test_tool_names_subset_of_models_agent_tool_names
literally asserts: tool_names() ⊆ AGENT_TOOL_NAMES. So AGENT_TOOL_NAMES
is just a hardcoded snapshot of mcp_tool_specs.tool_names().

Action: DELETE AGENT_TOOL_NAMES from models.py (not just move it).
Derive at consumer sites: list(mcp_tool_specs.tool_names()).

8 consumer sites to update:
- 3 in src/app_controller.py:2110, 2972, 3273
- 5 in tests/test_arch_boundary_phase2.py:23, 29, 31, 32, 33

The cross-check test becomes either redundant or converts to a
positive assertion (e.g., assert that the derived list has at
least the canonical tool count).

models.py reduces further: from ~60 to ~30 lines after deletion.

This further reduces the models.py footprint. Combined with the
previous audit (move vendor files to ai_client.py, split out mma.py
+ project.py + project_files.py), models.py becomes essentially
empty - just the Pydantic proxy code that may also move to api_hooks.py.

Net effect: models.py could be ELIMINATED entirely (becomes ~0 lines
or just an __init__.py marker). The followup should consider whether
to delete models.py completely.
2026-06-26 06:14:40 -04:00

14 KiB

Module Taxonomy Audit + Refactor Plan

Date: 2026-06-27 Reviewer: Tier 1 Trigger: User directive: "if anything I want more unification. I only want splitifcation if there is a good reason such as import load times. If there isn't an import issue or definition pollution issue just keep it in the same file."


Decision rule (the user's principle)

Split a file only if ONE of:

  • Import load time: the file has heavy imports (vendored SDKs, ML models) that some code paths don't need
  • Definition pollution: the file mixes 3+ unrelated domains with 30+ classes/functions

Otherwise: keep in a single file. Move imports around, but don't fragment.

No sub-directories. All files at src/ flat with prefix naming.


TL;DR

Only TWO clear refactors are justified:

  1. MERGE 5 ImGui LEAKS into gui_2.py (clear violation of the GUI boundary)
  2. SPLIT models.py into mma.py + project.py + project_files.py (clear definition pollution; 36 classes, 5+ unrelated domains, 1044 lines)
  3. MERGE 2 vendor files into ai_client.py (per user's explicit directive)

Everything else: KEEP AS-IS. No unnecessary fragmentation.


Full audit: 65 files in src/

Group A: MERGE (5 ImGui LEAKS into gui_2.py)

User directive: "all ImGui rendering should be in gui_2.py. Only exception: imgui_scopes.py"

File Lines LEAK content Destination
src/bg_shader.py 66 ImGui background shader code gui_2.py
src/shaders.py 33 ImGui shader code gui_2.py
src/command_palette.py 165 ImGui command palette UI gui_2.py
src/diff_viewer.py 164 ImGui diff viewer UI gui_2.py
src/patch_modal.py 102 ImGui patch modal UI gui_2.py

Verification: git grep -l "imgui\\." -- 'src/*.py' should return ONLY gui_2.py + imgui_scopes.py.

Group B: MERGE (2 vendor files into ai_client.py)

User directive: "vendor_capabilities.py and vendor_state.py are related to ai_client.py... they're the ai vendoring layer."

File Lines Destination
src/vendor_capabilities.py 85 ai_client.py (add as section "Vendor Capabilities")
src/vendor_state.py 78 ai_client.py (add as section "Vendor State")

ai_client.py grows from 3147 → ~3310 lines. Justified: these ARE the vendor layer per user; keeping them split is fragmenting a single domain.

Group C: SPLIT (models.py is the only clear definition pollution)

models.py = 1044 lines, 36 classes, 5+ unrelated domains. Justified split.

The new taxonomy:

New file What it gets Lines (est.)
src/mma.py MMA Core + TrackState: ThinkingSegment, Ticket, Track, WorkerContext, TrackState ~250
src/project.py ProjectContext + 5 sub-dataclasses + config I/O (_clean_nones, load_config_from_disk, save_config_to_disk, parse_history_entries) ~200
src/project_files.py FileItem, ContextPreset, ContextFileEntry, NamedViewPreset, Preset ~150

Classes that merge into EXISTING sub-system files (not new files):

Class from models.py Destination (existing file)
Persona src/personas.py (93 lines, exists)
Tool, ToolPreset src/tool_presets.py (123 lines, exists)
BiasProfile src/tool_bias.py (63 lines, exists)
TextEditorConfig, ExternalEditorConfig src/external_editor.py (129 lines, exists)
MCPServerConfig, MCPConfiguration, VectorStoreConfig, RAGConfig, load_mcp_config src/mcp_client.py (1803 lines, exists)
WorkspaceProfile src/workspace_manager.py (73 lines, exists)

src/models.py reduced to:

  • _create_generate_request, _create_confirm_request, __getattr__ (Pydantic lazy proxies for the API; could also move to api_hooks.py if they're truly API-specific)
  • Top-level docstring updated to reflect the new scope

AGENT_TOOL_NAMES is REDUNDANT — DELETE it (not just move). It's a hardcoded snapshot of mcp_tool_specs.tool_names(). The existing test test_tool_names_subset_of_models_agent_tool_names literally asserts tool_names() ⊆ AGENT_TOOL_NAMES. Derive the list at consumer sites: list(mcp_tool_specs.tool_names()). Update 8 consumer sites (3 in app_controller.py + 5 in tests/test_arch_boundary_phase2.py). The cross-check test becomes either redundant or converts to a positive assertion that the set is derived correctly.

Estimated: ~30 lines (down from 1044, down from 60 if you keep the redundant constant).

Group D: KEEP AS-IS (the rest)

All remaining files have clear single responsibilities. No reason to split:

Category Files Total lines
Core types paths.py, result_types.py, type_aliases.py 523
AI vendor (unified) ai_client.py (with vendor_*.py merged) 3310
MMA (mostly) multi_agent_conductor.py, dag_engine.py, conductor_tech_lead.py, orchestrator_pm.py, mma_prompts.py, events.py 1369
MCP mcp_client.py (with config merged), mcp_tool_specs.py, beads_client.py 1978
Project (unified) project_manager.py (main), presets.py, context_presets.py, project.py (NEW), project_files.py (NEW) ~900
GUI (unified) gui_2.py (with ImGui LEAKS merged), imgui_scopes.py (EXCEPTION per user) ~8300
Theme theme_2.py, theme_models.py, theme_nerv_fx.py, theme_nerv.py 728
Tool/persona/editor/mcp config (merged) tool_presets.py, tool_bias.py, personas.py, external_editor.py, workspace_manager.py ~500
API hook api_hooks.py, api_hook_client.py, api_hooks_helpers.py 1480
Infra log_registry.py, log_pruner.py, session_logger.py, history.py, warmup.py, startup_profiler.py, performance_monitor.py, io_pool.py, module_loader.py, shell_runner.py, hot_reloader.py, summary_cache.py, summarize.py, synthesis_formatter.py, fuzzy_anchor.py, outline_tool.py, file_cache.py, aggregate.py ~3700

Why this taxonomy (per the user's principle)

MERGE actions (3 files moved, 5 deleted):

Action Files deleted Justification
ImGui LEAKS → gui_2.py 5 deleted Clear violation of GUI boundary (user directive)
Vendor files → ai_client.py 2 deleted User explicit directive; unified vendor layer

SPLIT actions (1 file split into 3):

Action New files Justification
models.py split mma.py + project.py + project_files.py Definition pollution (5+ domains, 36 classes, 1044 lines)
Other models.py classes merged into existing files (none new) Persona/Tool/Editor/MCP/Workspace already have their own files; just merge in the dataclass

KEEP actions (52 files unchanged):

No reason to split. They're either:

  • Single-domain files (e.g., log_registry.py is just session log registration)
  • Already have natural boundaries (e.g., theme_*.py files are theme-specific, not polluted)
  • Don't have import load time issues (e.g., multi_agent_conductor.py is MMA-specific but doesn't pull in heavy SDKs at import time)

Refactor Plan (5 phases, atomic commits per group)

Phase 1: Move ImGui LEAKS into gui_2.py (5 commits)

For each of bg_shader.py, shaders.py, command_palette.py, diff_viewer.py, patch_modal.py:

  1. Read source file
  2. Add content to gui_2.py (in a clearly-marked section)
  3. Update imports across the codebase (replace from src.bg_shader import X with from src.gui_2 import X)
  4. Delete the original file via git rm
  5. Verify all affected tests pass

Phase 2: Merge vendor files into ai_client.py (2 commits)

For each of vendor_capabilities.py, vendor_state.py:

  1. Read source file
  2. Add content to ai_client.py (in a clearly-marked section "Vendor Capabilities" / "Vendor State")
  3. Update imports across the codebase
  4. Delete the original file via git rm
  5. Verify all affected tests pass

Phase 3: Split models.py into mma.py + project.py + project_files.py (3 commits + 6 merges)

  1. Create src/mma.py with MMA Core + TrackState (from models.py)
  2. Create src/project.py with ProjectContext + sub + config I/O (from models.py)
  3. Create src/project_files.py with file-related dataclasses (from models.py)
  4. Merge Persona into personas.py
  5. Merge Tool, ToolPreset into tool_presets.py
  6. Merge BiasProfile into tool_bias.py
  7. Merge TextEditorConfig, ExternalEditorConfig into external_editor.py
  8. Merge MCP config dataclasses into mcp_client.py
  9. Merge WorkspaceProfile into workspace_manager.py
  10. Reduce models.py to ~60 lines (Pydantic proxies + AGENT_TOOL_NAMES only)
  11. Update all 136 import sites for the moved classes

Phase 4: Verify all 7 audit gates pass --strict (1 commit, no code changes)

Phase 5: End-of-track (2 commits: report + state)


Risks

# Risk Likelihood Mitigation
R1 ImGui LEAKS move breaks existing tests low Run full affected test set after each move; revert + fix on regression
R2 Vendor merge into ai_client.py creates circular imports medium Vendor code uses vendor client holder from ai_client.py; both should already be in the same module hierarchy; if circular, the vendor_capabilities.py lazy import pattern (PROVIDERS) is the workaround
R3 models.py split breaks 136 import sites high The split is mechanical but invasive; per-file move with regression-guard tests after each
R4 The ProviderPayload / UIPanelConfig / PathInfo classes from metadata_promotion_20260624 are in models.py per that track high These were added AFTER my taxonomy audit. Need to also move them to the right home (probably project.py or split into separate files)

Acceptance Criteria (10 VCs)

# Criterion Verification
VC1 ImGui imports limited to gui_2.py + imgui_scopes.py git grep -l "imgui_bundle|from imgui\." HEAD -- 'src/*.py' returns 2 files
VC2 src/bg_shader.py, src/shaders.py, src/command_palette.py, src/diff_viewer.py, src/patch_modal.py deleted ls src/{bg_shader,shaders,command_palette,diff_viewer,patch_modal}.py returns not-found
VC3 src/vendor_capabilities.py, src/vendor_state.py deleted ls src/{vendor_capabilities,vendor_state}.py returns not-found
VC4 Vendor symbols importable from src.ai_client python -c "from src.ai_client import PROVIDER_CAPABILITIES, get_vendor_state"
VC5 src/mma.py exists with MMA Core + TrackState python -c "from src.mma import ThinkingSegment, Ticket, Track, WorkerContext, TrackState"
VC6 src/project.py exists with ProjectContext + sub + config I/O python -c "from src.project import ProjectContext, ProjectMeta, ProjectOutput, ProjectFiles, ProjectScreenshots, ProjectDiscussion, _clean_nones, load_config_from_disk, save_config_to_disk, parse_history_entries"
VC7 src/project_files.py exists with file-related dataclasses python -c "from src.project_files import FileItem, ContextPreset, ContextFileEntry, NamedViewPreset, Preset"
VC8 Persona/Tool/Editor/MCP/Workspace dataclasses in their proper sub-system files python -c "from src.personas import Persona; from src.tool_presets import Tool, ToolPreset; from src.tool_bias import BiasProfile; from src.external_editor import TextEditorConfig, ExternalEditorConfig; from src.mcp_client import MCPServerConfig, MCPConfiguration, VectorStoreConfig, RAGConfig, load_mcp_config; from src.workspace_manager import WorkspaceProfile"
VC9 src/models.py reduced to <100 lines (only Pydantic proxies + AGENT_TOOL_NAMES) wc -l src/models.py returns < 100
VC10 All 7 audit gates pass --strict same as current baseline

Scope summary

Operation Files affected Net change
DELETE 7 (5 ImGui + 2 vendor) -7 files
CREATE 3 (mma.py, project.py, project_files.py) +3 files
MODIFY 7 (ai_client.py, gui_2.py, personas.py, tool_presets.py, tool_bias.py, external_editor.py, mcp_client.py, workspace_manager.py) + reduce models.py 8 files modified
TOTAL 17 file changes; net -4 files -4 files

Before: 65 files in src/ After: 61 files in src/ (with cleaner taxonomy)


Open question: rename existing files for prefix consistency?

The user said "top-level prefix for modules that cannot have their definitions in the single file". Renames are NOT required (the user wants minimal splitting). But for naming consistency, some renames MIGHT be considered:

Current name Suggested rename Reason
mma_prompts.py (keep) Already prefixed
multi_agent_conductor.py mma_conductor.py For consistency with mma_prompts.py
dag_engine.py mma_dag.py Same
conductor_tech_lead.py mma_tech_lead.py Same
orchestrator_pm.py mma_pm.py Same
events.py (keep) Generic, not MMA-specific
gemini_cli_adapter.py (keep) Per user: don't split ai_client.py; the adapter is its own concern
qwen_adapter.py (keep) Same
mcp_tool_specs.py (keep) Already prefixed
beads_client.py (keep) Beads is its own concern (separate from MCP client)

Recommendation: do the renames as a SEPARATE phase if desired. They improve clarity but are not strictly necessary. The user's main complaint is the dumping-ground problem (models.py), not naming convention.


Recommendation

Execute the 5-phase refactor. The 3 substantive phases (Phase 1 ImGui merge, Phase 2 vendor merge, Phase 3 models split) are all justified. The renaming in the Open Question section is OPTIONAL — defer to a follow-up if the user wants.

The user should approve this plan before Tier 2/3 starts executing. The plan is conservative: only moves that have a clear "good reason" per the user's principle. Everything else stays put.


See also

  • docs/reports/FOLLOWUP_module_taxonomy_20260627.md — previous taxonomy discussion (this document is the revised version)
  • AGENTS.md — "File Size and Naming Convention" HARD RULE
  • conductor/code_styleguides/data_oriented_design.md — "Prefer Fewer Types" principle
  • src/models.py — current 1044-line dumping ground
  • conductor/tracks/cruft_elimination_20260627/SPEC_CORRECTION_phase_2.md — Phase 2 spec correction (related to project.py refactor)