User: 'isn't AGENT_TOOL_NAMES a redundant thing thats directly associated with the mcp_client.py?' - YES, confirmed. The existing test test_tool_names_subset_of_models_agent_tool_names literally asserts: tool_names() ⊆ AGENT_TOOL_NAMES. So AGENT_TOOL_NAMES is just a hardcoded snapshot of mcp_tool_specs.tool_names(). Action: DELETE AGENT_TOOL_NAMES from models.py (not just move it). Derive at consumer sites: list(mcp_tool_specs.tool_names()). 8 consumer sites to update: - 3 in src/app_controller.py:2110, 2972, 3273 - 5 in tests/test_arch_boundary_phase2.py:23, 29, 31, 32, 33 The cross-check test becomes either redundant or converts to a positive assertion (e.g., assert that the derived list has at least the canonical tool count). models.py reduces further: from ~60 to ~30 lines after deletion. This further reduces the models.py footprint. Combined with the previous audit (move vendor files to ai_client.py, split out mma.py + project.py + project_files.py), models.py becomes essentially empty - just the Pydantic proxy code that may also move to api_hooks.py. Net effect: models.py could be ELIMINATED entirely (becomes ~0 lines or just an __init__.py marker). The followup should consider whether to delete models.py completely.
14 KiB
Module Taxonomy Audit + Refactor Plan
Date: 2026-06-27 Reviewer: Tier 1 Trigger: User directive: "if anything I want more unification. I only want splitifcation if there is a good reason such as import load times. If there isn't an import issue or definition pollution issue just keep it in the same file."
Decision rule (the user's principle)
Split a file only if ONE of:
- Import load time: the file has heavy imports (vendored SDKs, ML models) that some code paths don't need
- Definition pollution: the file mixes 3+ unrelated domains with 30+ classes/functions
Otherwise: keep in a single file. Move imports around, but don't fragment.
No sub-directories. All files at src/ flat with prefix naming.
TL;DR
Only TWO clear refactors are justified:
- MERGE 5 ImGui LEAKS into
gui_2.py(clear violation of the GUI boundary) - SPLIT
models.pyintomma.py+project.py+project_files.py(clear definition pollution; 36 classes, 5+ unrelated domains, 1044 lines) - MERGE 2 vendor files into
ai_client.py(per user's explicit directive)
Everything else: KEEP AS-IS. No unnecessary fragmentation.
Full audit: 65 files in src/
Group A: MERGE (5 ImGui LEAKS into gui_2.py)
User directive: "all ImGui rendering should be in gui_2.py. Only exception: imgui_scopes.py"
| File | Lines | LEAK content | Destination |
|---|---|---|---|
src/bg_shader.py |
66 | ImGui background shader code | → gui_2.py |
src/shaders.py |
33 | ImGui shader code | → gui_2.py |
src/command_palette.py |
165 | ImGui command palette UI | → gui_2.py |
src/diff_viewer.py |
164 | ImGui diff viewer UI | → gui_2.py |
src/patch_modal.py |
102 | ImGui patch modal UI | → gui_2.py |
Verification: git grep -l "imgui\\." -- 'src/*.py' should return ONLY gui_2.py + imgui_scopes.py.
Group B: MERGE (2 vendor files into ai_client.py)
User directive: "vendor_capabilities.py and vendor_state.py are related to ai_client.py... they're the ai vendoring layer."
| File | Lines | Destination |
|---|---|---|
src/vendor_capabilities.py |
85 | → ai_client.py (add as section "Vendor Capabilities") |
src/vendor_state.py |
78 | → ai_client.py (add as section "Vendor State") |
ai_client.py grows from 3147 → ~3310 lines. Justified: these ARE the vendor layer per user; keeping them split is fragmenting a single domain.
Group C: SPLIT (models.py is the only clear definition pollution)
models.py = 1044 lines, 36 classes, 5+ unrelated domains. Justified split.
The new taxonomy:
| New file | What it gets | Lines (est.) |
|---|---|---|
src/mma.py |
MMA Core + TrackState: ThinkingSegment, Ticket, Track, WorkerContext, TrackState | ~250 |
src/project.py |
ProjectContext + 5 sub-dataclasses + config I/O (_clean_nones, load_config_from_disk, save_config_to_disk, parse_history_entries) |
~200 |
src/project_files.py |
FileItem, ContextPreset, ContextFileEntry, NamedViewPreset, Preset | ~150 |
Classes that merge into EXISTING sub-system files (not new files):
Class from models.py |
Destination (existing file) |
|---|---|
Persona |
src/personas.py (93 lines, exists) |
Tool, ToolPreset |
src/tool_presets.py (123 lines, exists) |
BiasProfile |
src/tool_bias.py (63 lines, exists) |
TextEditorConfig, ExternalEditorConfig |
src/external_editor.py (129 lines, exists) |
MCPServerConfig, MCPConfiguration, VectorStoreConfig, RAGConfig, load_mcp_config |
src/mcp_client.py (1803 lines, exists) |
WorkspaceProfile |
src/workspace_manager.py (73 lines, exists) |
src/models.py reduced to:
_create_generate_request,_create_confirm_request,__getattr__(Pydantic lazy proxies for the API; could also move toapi_hooks.pyif they're truly API-specific)- Top-level docstring updated to reflect the new scope
AGENT_TOOL_NAMES is REDUNDANT — DELETE it (not just move). It's a hardcoded snapshot of mcp_tool_specs.tool_names(). The existing test test_tool_names_subset_of_models_agent_tool_names literally asserts tool_names() ⊆ AGENT_TOOL_NAMES. Derive the list at consumer sites: list(mcp_tool_specs.tool_names()). Update 8 consumer sites (3 in app_controller.py + 5 in tests/test_arch_boundary_phase2.py). The cross-check test becomes either redundant or converts to a positive assertion that the set is derived correctly.
Estimated: ~30 lines (down from 1044, down from 60 if you keep the redundant constant).
Group D: KEEP AS-IS (the rest)
All remaining files have clear single responsibilities. No reason to split:
| Category | Files | Total lines |
|---|---|---|
| Core types | paths.py, result_types.py, type_aliases.py |
523 |
| AI vendor (unified) | ai_client.py (with vendor_*.py merged) |
3310 |
| MMA (mostly) | multi_agent_conductor.py, dag_engine.py, conductor_tech_lead.py, orchestrator_pm.py, mma_prompts.py, events.py |
1369 |
| MCP | mcp_client.py (with config merged), mcp_tool_specs.py, beads_client.py |
1978 |
| Project (unified) | project_manager.py (main), presets.py, context_presets.py, project.py (NEW), project_files.py (NEW) |
~900 |
| GUI (unified) | gui_2.py (with ImGui LEAKS merged), imgui_scopes.py (EXCEPTION per user) |
~8300 |
| Theme | theme_2.py, theme_models.py, theme_nerv_fx.py, theme_nerv.py |
728 |
| Tool/persona/editor/mcp config (merged) | tool_presets.py, tool_bias.py, personas.py, external_editor.py, workspace_manager.py |
~500 |
| API hook | api_hooks.py, api_hook_client.py, api_hooks_helpers.py |
1480 |
| Infra | log_registry.py, log_pruner.py, session_logger.py, history.py, warmup.py, startup_profiler.py, performance_monitor.py, io_pool.py, module_loader.py, shell_runner.py, hot_reloader.py, summary_cache.py, summarize.py, synthesis_formatter.py, fuzzy_anchor.py, outline_tool.py, file_cache.py, aggregate.py |
~3700 |
Why this taxonomy (per the user's principle)
MERGE actions (3 files moved, 5 deleted):
| Action | Files deleted | Justification |
|---|---|---|
ImGui LEAKS → gui_2.py |
5 deleted | Clear violation of GUI boundary (user directive) |
Vendor files → ai_client.py |
2 deleted | User explicit directive; unified vendor layer |
SPLIT actions (1 file split into 3):
| Action | New files | Justification |
|---|---|---|
models.py split |
mma.py + project.py + project_files.py |
Definition pollution (5+ domains, 36 classes, 1044 lines) |
| Other models.py classes merged into existing files | (none new) | Persona/Tool/Editor/MCP/Workspace already have their own files; just merge in the dataclass |
KEEP actions (52 files unchanged):
No reason to split. They're either:
- Single-domain files (e.g.,
log_registry.pyis just session log registration) - Already have natural boundaries (e.g.,
theme_*.pyfiles are theme-specific, not polluted) - Don't have import load time issues (e.g.,
multi_agent_conductor.pyis MMA-specific but doesn't pull in heavy SDKs at import time)
Refactor Plan (5 phases, atomic commits per group)
Phase 1: Move ImGui LEAKS into gui_2.py (5 commits)
For each of bg_shader.py, shaders.py, command_palette.py, diff_viewer.py, patch_modal.py:
- Read source file
- Add content to
gui_2.py(in a clearly-marked section) - Update imports across the codebase (replace
from src.bg_shader import Xwithfrom src.gui_2 import X) - Delete the original file via
git rm - Verify all affected tests pass
Phase 2: Merge vendor files into ai_client.py (2 commits)
For each of vendor_capabilities.py, vendor_state.py:
- Read source file
- Add content to
ai_client.py(in a clearly-marked section "Vendor Capabilities" / "Vendor State") - Update imports across the codebase
- Delete the original file via
git rm - Verify all affected tests pass
Phase 3: Split models.py into mma.py + project.py + project_files.py (3 commits + 6 merges)
- Create
src/mma.pywith MMA Core + TrackState (from models.py) - Create
src/project.pywith ProjectContext + sub + config I/O (from models.py) - Create
src/project_files.pywith file-related dataclasses (from models.py) - Merge
Personaintopersonas.py - Merge
Tool,ToolPresetintotool_presets.py - Merge
BiasProfileintotool_bias.py - Merge
TextEditorConfig,ExternalEditorConfigintoexternal_editor.py - Merge MCP config dataclasses into
mcp_client.py - Merge
WorkspaceProfileintoworkspace_manager.py - Reduce
models.pyto ~60 lines (Pydantic proxies + AGENT_TOOL_NAMES only) - Update all 136 import sites for the moved classes
Phase 4: Verify all 7 audit gates pass --strict (1 commit, no code changes)
Phase 5: End-of-track (2 commits: report + state)
Risks
| # | Risk | Likelihood | Mitigation |
|---|---|---|---|
| R1 | ImGui LEAKS move breaks existing tests | low | Run full affected test set after each move; revert + fix on regression |
| R2 | Vendor merge into ai_client.py creates circular imports |
medium | Vendor code uses vendor client holder from ai_client.py; both should already be in the same module hierarchy; if circular, the vendor_capabilities.py lazy import pattern (PROVIDERS) is the workaround |
| R3 | models.py split breaks 136 import sites |
high | The split is mechanical but invasive; per-file move with regression-guard tests after each |
| R4 | The ProviderPayload / UIPanelConfig / PathInfo classes from metadata_promotion_20260624 are in models.py per that track |
high | These were added AFTER my taxonomy audit. Need to also move them to the right home (probably project.py or split into separate files) |
Acceptance Criteria (10 VCs)
| # | Criterion | Verification |
|---|---|---|
| VC1 | ImGui imports limited to gui_2.py + imgui_scopes.py |
git grep -l "imgui_bundle|from imgui\." HEAD -- 'src/*.py' returns 2 files |
| VC2 | src/bg_shader.py, src/shaders.py, src/command_palette.py, src/diff_viewer.py, src/patch_modal.py deleted |
ls src/{bg_shader,shaders,command_palette,diff_viewer,patch_modal}.py returns not-found |
| VC3 | src/vendor_capabilities.py, src/vendor_state.py deleted |
ls src/{vendor_capabilities,vendor_state}.py returns not-found |
| VC4 | Vendor symbols importable from src.ai_client |
python -c "from src.ai_client import PROVIDER_CAPABILITIES, get_vendor_state" |
| VC5 | src/mma.py exists with MMA Core + TrackState |
python -c "from src.mma import ThinkingSegment, Ticket, Track, WorkerContext, TrackState" |
| VC6 | src/project.py exists with ProjectContext + sub + config I/O |
python -c "from src.project import ProjectContext, ProjectMeta, ProjectOutput, ProjectFiles, ProjectScreenshots, ProjectDiscussion, _clean_nones, load_config_from_disk, save_config_to_disk, parse_history_entries" |
| VC7 | src/project_files.py exists with file-related dataclasses |
python -c "from src.project_files import FileItem, ContextPreset, ContextFileEntry, NamedViewPreset, Preset" |
| VC8 | Persona/Tool/Editor/MCP/Workspace dataclasses in their proper sub-system files | python -c "from src.personas import Persona; from src.tool_presets import Tool, ToolPreset; from src.tool_bias import BiasProfile; from src.external_editor import TextEditorConfig, ExternalEditorConfig; from src.mcp_client import MCPServerConfig, MCPConfiguration, VectorStoreConfig, RAGConfig, load_mcp_config; from src.workspace_manager import WorkspaceProfile" |
| VC9 | src/models.py reduced to <100 lines (only Pydantic proxies + AGENT_TOOL_NAMES) |
wc -l src/models.py returns < 100 |
| VC10 | All 7 audit gates pass --strict |
same as current baseline |
Scope summary
| Operation | Files affected | Net change |
|---|---|---|
| DELETE | 7 (5 ImGui + 2 vendor) | -7 files |
| CREATE | 3 (mma.py, project.py, project_files.py) | +3 files |
| MODIFY | 7 (ai_client.py, gui_2.py, personas.py, tool_presets.py, tool_bias.py, external_editor.py, mcp_client.py, workspace_manager.py) + reduce models.py | 8 files modified |
| TOTAL | 17 file changes; net -4 files | -4 files |
Before: 65 files in src/
After: 61 files in src/ (with cleaner taxonomy)
Open question: rename existing files for prefix consistency?
The user said "top-level prefix for modules that cannot have their definitions in the single file". Renames are NOT required (the user wants minimal splitting). But for naming consistency, some renames MIGHT be considered:
| Current name | Suggested rename | Reason |
|---|---|---|
mma_prompts.py |
(keep) | Already prefixed |
multi_agent_conductor.py |
mma_conductor.py |
For consistency with mma_prompts.py |
dag_engine.py |
mma_dag.py |
Same |
conductor_tech_lead.py |
mma_tech_lead.py |
Same |
orchestrator_pm.py |
mma_pm.py |
Same |
events.py |
(keep) | Generic, not MMA-specific |
gemini_cli_adapter.py |
(keep) | Per user: don't split ai_client.py; the adapter is its own concern |
qwen_adapter.py |
(keep) | Same |
mcp_tool_specs.py |
(keep) | Already prefixed |
beads_client.py |
(keep) | Beads is its own concern (separate from MCP client) |
Recommendation: do the renames as a SEPARATE phase if desired. They improve clarity but are not strictly necessary. The user's main complaint is the dumping-ground problem (models.py), not naming convention.
Recommendation
Execute the 5-phase refactor. The 3 substantive phases (Phase 1 ImGui merge, Phase 2 vendor merge, Phase 3 models split) are all justified. The renaming in the Open Question section is OPTIONAL — defer to a follow-up if the user wants.
The user should approve this plan before Tier 2/3 starts executing. The plan is conservative: only moves that have a clear "good reason" per the user's principle. Everything else stays put.
See also
docs/reports/FOLLOWUP_module_taxonomy_20260627.md— previous taxonomy discussion (this document is the revised version)AGENTS.md— "File Size and Naming Convention" HARD RULEconductor/code_styleguides/data_oriented_design.md— "Prefer Fewer Types" principlesrc/models.py— current 1044-line dumping groundconductor/tracks/cruft_elimination_20260627/SPEC_CORRECTION_phase_2.md— Phase 2 spec correction (related to project.py refactor)