docs(track): Add continuation guide for entropy audit

2026-05-06 20:20:56 -04:00
parent 2b5185a78f
commit f55b5d8fbc
@@ -0,0 +1,232 @@
# Entropy Audit Continuation Guide
**Session:** 2026-05-06
**Track:** data_oriented_optimization_20260312
**Context Used:** ~77%
**Commit:** 2b5185a
---
## Executive Summary
Phase 5 (Entropy Audit & Reduction) was partially completed. We focused on **actual bugs and performance issues** (Muratori-style) rather than style preferences. Long functions are OK if they're linear and single-purpose.
### Fixed This Session:
1. **GUI crash bug** - indentation error in `_render_mma_dashboard` (f6feab9)
2. **Duplicate line bug** - `rag_emb_provider.setter` had two identical lines (54afbb9, per the commit table below)
3. **Nested imports in hot paths** - hoisted to module level for performance (2b5185a)
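To illustrate item 3, here is a minimal sketch of the hoisting change. The function names are hypothetical and do not come from the codebase; only the use of `traceback` matches the files modified this session. A nested `import` is cheap after the first load, but it still performs a `sys.modules` lookup and a local name binding on every call, which adds up in per-frame or per-request code.

```python
import traceback  # hoisted: resolved once, when the module is first imported


def format_error_nested(exc: Exception) -> str:
    # Before: the import statement is re-executed on every call.
    # (hypothetical function, for illustration only)
    import traceback
    return "".join(traceback.format_exception_only(type(exc), exc))


def format_error_hoisted(exc: Exception) -> str:
    # After: relies on the module-level import above.
    # (hypothetical function, for illustration only)
    return "".join(traceback.format_exception_only(type(exc), exc))


if __name__ == "__main__":
    err = ValueError("boom")
    print(format_error_hoisted(err), end="")
```

Both versions return the same string; the only difference is where the import cost is paid.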
### Identified But Not Fixed (Design Issues):
1. **Parallel ticket state** - Dict-based `active_tickets` vs `Ticket` objects in DAG
2. **Duplicate blocking logic** - GUI has manual block/unblock, DAG has `cascade_blocks`
These are architectural trade-offs that would require significant refactoring.
---
## Files Modified This Session
| File | Size | Changes |
|------|------|---------|
| src/gui_2.py | 224KB | +traceback import, fixed indentation crash, removed nested traceback |
| src/app_controller.py | 133KB | +traceback, +inspect imports, removed 3 nested traceback imports |
| src/multi_agent_conductor.py | 23KB | Removed unused `import sys`, removed redundant nested imports |
| src/dag_engine.py | 7KB | No changes (reference for blocking logic) |
---
## Audit Scripts Created
### 1. `scripts/focused_entropy_audit.py` (RECOMMENDED)
Muratori-style audit - focuses on actual issues:
- Duplicate logic
- State inconsistencies
- Logic errors
- Performance concerns (nested imports)
- Ignores style preferences (long functions, magic numbers as tunables)
**Run:** `uv run python scripts/focused_entropy_audit.py`
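
The nested-import check can be sketched with the standard `ast` module. This is an assumed approach, not the contents of `focused_entropy_audit.py`; `find_nested_imports` and `SAMPLE` are hypothetical names used only for illustration.

```python
import ast
from typing import List, Tuple


class _NestedImportFinder(ast.NodeVisitor):
    """Collects imports that occur inside a function body."""

    def __init__(self) -> None:
        self.stack: List[str] = []             # enclosing function names
        self.hits: List[Tuple[str, int]] = []  # (function name, line number)

    def visit_FunctionDef(self, node) -> None:
        self.stack.append(node.name)
        self.generic_visit(node)
        self.stack.pop()

    visit_AsyncFunctionDef = visit_FunctionDef

    def visit_Import(self, node) -> None:
        # Only report imports that sit inside at least one function.
        if self.stack:
            self.hits.append((self.stack[-1], node.lineno))

    visit_ImportFrom = visit_Import


def find_nested_imports(source: str) -> List[Tuple[str, int]]:
    finder = _NestedImportFinder()
    finder.visit(ast.parse(source))
    return finder.hits


SAMPLE = '''import os

def handler():
    import traceback
    return traceback.format_exc()
'''

if __name__ == "__main__":
    for func, line in find_nested_imports(SAMPLE):
        print(f"nested import in {func}() at line {line}")
```

This reports every nested import, not only those in hot paths. Deciding which functions are hot still needs a profiler or manual review.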
### 2. `scripts/comprehensive_entropy_audit.py`
Full-spectrum analysis:
- Long functions (>200 lines)
- Magic numbers (3+ digits)
- TODO/FIXME comments
- Deep nesting (>20 spaces)
- Duplicate consecutive lines
- Nested imports
**Run:** `uv run python scripts/comprehensive_entropy_audit.py`
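
Two of the checks listed above (duplicate consecutive lines and long functions) can be sketched as follows. This is an illustration under assumptions, not the script's actual code; the function names and the `SAMPLE` input are hypothetical.

```python
import ast
from typing import List, Tuple


def find_duplicate_consecutive_lines(source: str) -> List[int]:
    """Return 1-based numbers of non-blank lines that repeat the previous line."""
    lines = source.splitlines()
    hits = []
    for i in range(1, len(lines)):
        current = lines[i].strip()
        if current and current == lines[i - 1].strip():
            hits.append(i + 1)
    return hits


def find_long_functions(source: str, max_lines: int = 200) -> List[Tuple[str, int]]:
    """Return (name, length) for functions spanning more than max_lines lines."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1  # end_lineno: Python 3.8+
            if length > max_lines:
                hits.append((node.name, length))
    return hits


SAMPLE = '''class Config:
    def set_provider(self, value):
        self._provider = value
        self._provider = value
'''

if __name__ == "__main__":
    print(find_duplicate_consecutive_lines(SAMPLE))
    print(find_long_functions(SAMPLE, max_lines=2))
```

The duplicate-line check is deliberately naive: repeated closing brackets or consecutive `pass` statements also match, so its hits need manual review.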
---
## Files NOT Yet Audited (Remaining Work)
### Large Files Requiring Deep Dive:
| File | Size | Lines | Notes |
|------|------|-------|-------|
| gui_2.py | 224KB | ~4800 | Main GUI, many UI panels |
| app_controller.py | 133KB | ~3200 | Headless controller |
| ai_client.py | 100KB | ~1900 | Multi-provider AI client |
| mcp_client.py | 69KB | ~2000 | MCP tools implementation |
| api_hooks.py | 31KB | ~650 | REST API hooks |
| models.py | 20KB | ~600 | Data classes |
| file_cache.py | 26KB | ~800 | AST parsing |
### Medium Files:
| File | Size | Lines |
|------|------|-------|
| theme_2.py | 18KB | ~550 |
| theme.py | 16KB | ~500 |
| project_manager.py | 18KB | ~550 |
| aggregate.py | 17KB | ~520 |
| log_registry.py | 11KB | ~350 |
### Small Files (<300 lines):
`beads_client.py`, `bg_shader.py`, `conductor_tech_lead.py`, `cost_tracker.py`, `diff_viewer.py`, `events.py`, `gemini_cli_adapter.py`, `history.py`, `log_pruner.py`, `markdown_helper.py`, `mma_prompts.py`, `native_orchestrator.py`, `orchestrator_pm.py`, `outline_tool.py`, `patch_modal.py`, `paths.py`, `performance_monitor.py`, `personas.py`, `presets.py`, `rag_engine.py`, `session_logger.py`, `shader_manager.py`, `shaders.py`, `shell_runner.py`, `summarize.py`, `summary_cache.py`, `synthesis_formatter.py`, `theme_nerv.py`, `theme_nerv_fx.py`, `thinking_parser.py`, `tool_bias.py`, `tool_presets.py`, `workspace_manager.py`
---
## Specific Areas Needing Attention
### 1. ai_client.py (100KB, ~1900 lines)
**Potential issues:**
- Long functions (`_send_gemini` 229 lines, `_send_deepseek` 251 lines, `_send_minimax` 216 lines)
- Possible nested imports (not yet verified)
- Duplicate provider handling patterns
**Key patterns to find:**
- `def _send_` - provider-specific methods
- `def send` - main entry point (has 12 parameters!)
- `from google.` / `from anthropic` / `from deepseek` - SDK imports
### 2. mcp_client.py (69KB, ~2000 lines)
**Potential issues:**
- 26 tool implementations that might have similar structure
- Nested imports for file_cache, paths, etc.
**Key patterns to find:**
- `def dispatch` - main tool dispatcher
- `def _get_symbol_node` - AST utilities
- `class StdioMCPServer` / `class ExternalMCPManager` - server management
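
One way to look for copy-paste among the tool implementations is to group functions whose bodies have an identical AST. The sketch below is an assumed approach, not an existing script; `group_identical_bodies` and the sample tool functions are hypothetical.

```python
import ast
from collections import defaultdict
from typing import Dict, List


def group_identical_bodies(source: str) -> List[List[str]]:
    """Group function names whose bodies are structurally identical."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # ast.dump omits line/column attributes by default, so two bodies
            # compare equal when their structure and identifiers match.
            key = "\n".join(ast.dump(stmt) for stmt in node.body)
            groups[key].append(node.name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


SAMPLE = '''
def tool_read(path):
    data = open(path).read()
    return data

def tool_load(path):
    data = open(path).read()
    return data

def tool_other(path):
    return path
'''

if __name__ == "__main__":
    print(group_identical_bodies(SAMPLE))
```

This only catches exact structural duplicates. Near-duplicates that differ in a variable name or a literal would need a normalization pass before comparison.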
### 3. api_hooks.py (31KB, ~650 lines)
**Potential issues:**
- `do_GET` (205 lines) and `do_POST` (350 lines) - long but likely linear
- State management via `app_state`
**Key patterns to find:**
- `def do_GET` / `def do_POST` - endpoint handlers
- `app_state` usage - global state access
### 4. file_cache.py (26KB, ~800 lines)
**Potential issues:**
- AST parsing for Python, C, C++
- Tree-sitter integration
**Key patterns to find:**
- `class ASTParser` - main parser class
- `def get_curated_view` / `def get_targeted_view` - skeleton generation
---
## Duplicate Patterns Identified (Not Bugs - By Design)
These names appear in more than one file. The audit treated each as intentional shared usage rather than a copy-paste bug:
| Pattern | Files | Purpose |
|---------|-------|---------|
| `calculate_track_progress` | gui_2.py, project_manager.py | Progress calculation |
| `topological_sort` | app_controller.py, conductor_tech_lead.py, dag_engine.py | Dependency ordering |
| `push_mma_state` | app_controller.py, gui_2.py | State updates |
| `active_tickets` | api_hooks.py, app_controller.py, gui_2.py | Ticket list access |
---
## Recommendations for Continuing
### High Priority:
1. **Deep audit ai_client.py** - verify no duplicate provider logic
2. **Check mcp_client.py tool implementations** - 26 tools might have copy-paste patterns
3. **Verify api_hooks state management** - `app_state` usage patterns
### Medium Priority:
4. **Review file_cache.py AST handling** - ensure tree-sitter usage is efficient
5. **Check models.py dataclasses** - verify no duplicate serialization logic
6. **Audit theme*.py files** - four theme files (theme.py, theme_2.py, theme_nerv.py, theme_nerv_fx.py) might have overlap
### Low Priority (Cosmetic Only):
- Mixed indentation in various files (4-space blocks in 1-space files)
- Import consolidation patterns
---
## Testing Commands
```powershell
# Run core tests
uv run pytest tests/test_dag_engine.py tests/test_execution_engine.py tests/test_performance_monitor.py tests/test_aggregate_flags.py tests/test_tiered_aggregation.py -v --timeout=60
# Run focused entropy audit
uv run python scripts/focused_entropy_audit.py
# Run comprehensive entropy audit
uv run python scripts/comprehensive_entropy_audit.py
# Verify syntax on modified files
python -c "import ast; ast.parse(open('src/gui_2.py', encoding='utf-8').read()); print('gui_2.py OK')"
python -c "import ast; ast.parse(open('src/app_controller.py', encoding='utf-8').read()); print('app_controller.py OK')"
python -c "import ast; ast.parse(open('src/multi_agent_conductor.py', encoding='utf-8').read()); print('multi_agent_conductor.py OK')"
```
---
## Key Commits in This Track
| Commit | Description |
|--------|-------------|
| 2b5185a | perf(entropy): Fix nested imports in hot paths |
| 54afbb9 | chore(entropy): Phase 5 start - fix duplicate line bug and document findings |
| f6feab9 | fix(gui): Correct indentation bug in _render_mma_dashboard that caused crash |
| 5c9948d | conductor(plan): Track complete |
Track history: `git log --oneline f6feab9..HEAD`
---
## Architecture Notes
### Ticket State Split:
```
gui_2.py (UI) -> active_tickets: List[Dict[str, Any]]
app_controller.py -> active_tickets: List[Dict[str, Any]]
dag_engine.py (Core) -> tickets: List[Ticket] (dataclass)
```
This is a design trade-off: the GUI and controller keep dicts for flexible table binding, while the DAG engine uses typed `Ticket` objects for dependency operations.
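
If the two representations stay separate, a single pair of conversion helpers at the boundary keeps the mapping in one place. The sketch below is illustrative only: the `Ticket` fields shown (`id`, `status`, `depends_on`) are assumptions, not the real dataclass in the codebase, and both helper names are hypothetical.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class Ticket:
    # Hypothetical fields; the real Ticket dataclass may differ.
    id: str
    status: str = "todo"
    depends_on: List[str] = field(default_factory=list)


def ticket_from_dict(row: Dict[str, Any]) -> Ticket:
    """Build a typed Ticket from a GUI-style dict row."""
    return Ticket(
        id=row["id"],
        status=row.get("status", "todo"),
        depends_on=list(row.get("depends_on", [])),
    )


def ticket_to_dict(ticket: Ticket) -> Dict[str, Any]:
    """Convert a Ticket back into a dict suitable for table binding."""
    return asdict(ticket)


if __name__ == "__main__":
    row = {"id": "T-1", "status": "blocked", "depends_on": ["T-0"]}
    print(ticket_to_dict(ticket_from_dict(row)) == row)
```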
### Blocking Logic Split:
```
gui_2.py: _cb_block_ticket(), _cb_unblock_ticket() - manual while loops
dag_engine.py: cascade_blocks() - transitive propagation
```
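Because blocking is implemented in two places, the GUI ticket dicts and the DAG `Ticket` objects can diverge unless every block or unblock is applied to both.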
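
For reference, transitive block propagation can be sketched as a breadth-first walk over reverse dependencies. This is not the `cascade_blocks()` implementation in `dag_engine.py`; `propagate_blocks` and its input shape (ticket id mapped to the ids it depends on) are assumptions made for illustration.

```python
from collections import deque
from typing import Dict, List, Set


def propagate_blocks(depends_on: Dict[str, List[str]], blocked: Set[str]) -> Set[str]:
    """Return the blocked set extended with every ticket that transitively
    depends on a blocked ticket (hypothetical helper)."""
    # Invert the graph: ticket id -> ids of the tickets that depend on it.
    dependents: Dict[str, List[str]] = {}
    for ticket_id, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(ticket_id)

    result = set(blocked)
    queue = deque(blocked)
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in result:  # visit each ticket once, even with cycles
                result.add(child)
                queue.append(child)
    return result


if __name__ == "__main__":
    graph = {"a": [], "b": ["a"], "c": ["b"], "d": []}
    print(sorted(propagate_blocks(graph, {"a"})))
```

Routing the GUI callbacks through one function of this kind, instead of separate manual loops, would remove the second source of truth for blocked state.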
---
## Next Session Checklist
- [ ] Deep audit ai_client.py for duplicate provider patterns
- [ ] Review mcp_client.py tool implementations (26 tools)
- [ ] Check api_hooks.py state management
- [ ] Verify file_cache.py AST handling efficiency
- [ ] Review models.py serialization consistency
- [ ] Audit theme files for overlap
- [ ] Run full test suite to verify no regressions
- [ ] Update plan.md with Phase 5 status