docs(track): Add continuation guide for entropy audit
# Entropy Audit Continuation Guide

**Session:** 2026-05-06
**Track:** data_oriented_optimization_20260312
**Context Used:** ~77%
**Commit:** 2b5185a

---

## Executive Summary

Phase 5 (Entropy Audit & Reduction) was partially completed. We focused on **actual bugs and performance issues** (Muratori-style) rather than style preferences. Long functions are OK if they're linear and single-purpose.

### Fixed This Session:

1. **GUI crash bug** - indentation error in `_render_mma_dashboard` (f6feab9)
2. **Duplicate line bug** - `rag_emb_provider.setter` had two identical lines (f6feab9)
3. **Nested imports in hot paths** - hoisted to module level for performance (2b5185a)
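
To illustrate the third fix, here is a minimal sketch of hoisting a nested import. The function names are hypothetical and are not taken from the codebase; only the use of `traceback` mirrors the changes listed in this guide.

```python
import traceback  # hoisted: bound once when the module is loaded


def format_error_nested(exc: Exception) -> str:
    # Before: the import statement executes on every call. After the first
    # call it is only a sys.modules lookup plus a local binding, but that
    # work is still repeated each time a hot path runs.
    import traceback as tb
    return "".join(tb.format_exception_only(type(exc), exc))


def format_error_hoisted(exc: Exception) -> str:
    # After: relies on the module-level import, so nothing import-related
    # happens per call.
    return "".join(traceback.format_exception_only(type(exc), exc))
```

Both functions return the same string; the difference is only where the import cost is paid.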

### Identified But Not Fixed (Design Issues):

1. **Parallel ticket state** - Dict-based `active_tickets` vs `Ticket` objects in DAG
2. **Duplicate blocking logic** - GUI has manual block/unblock, DAG has `cascade_blocks`

These are architectural trade-offs; resolving them would require significant refactoring.

---

## Files Modified This Session

| File | Size | Changes |
|------|------|---------|
| src/gui_2.py | 224KB | +traceback import, fixed indentation crash, removed nested traceback |
| src/app_controller.py | 133KB | +traceback, +inspect imports, removed 3 nested traceback imports |
| src/multi_agent_conductor.py | 23KB | Removed unused `import sys`, removed redundant nested imports |
| src/dag_engine.py | 7KB | No changes (reference for blocking logic) |

---

## Audit Scripts Created

### 1. `scripts/focused_entropy_audit.py` (RECOMMENDED)
Muratori-style audit - focuses on actual issues:
- Duplicate logic
- State inconsistencies
- Logic errors
- Performance concerns (nested imports)
- Ignores style preferences (long functions, magic numbers as tunables)

**Run:** `uv run python scripts/focused_entropy_audit.py`
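
The nested-import check could be implemented along these lines. This is a sketch using the standard `ast` module, not the contents of `focused_entropy_audit.py`; `NestedImportFinder`, `find_nested_imports`, and `SAMPLE` are illustrative names.

```python
import ast


class NestedImportFinder(ast.NodeVisitor):
    """Collect (enclosing_function, line_number) for imports inside functions."""

    def __init__(self) -> None:
        self.stack: list[str] = []
        self.hits: list[tuple[str, int]] = []

    def visit_FunctionDef(self, node):
        # Track the innermost enclosing function while visiting its body.
        self.stack.append(node.name)
        self.generic_visit(node)
        self.stack.pop()

    visit_AsyncFunctionDef = visit_FunctionDef

    def visit_Import(self, node):
        if self.stack:  # module-level imports have an empty stack
            self.hits.append((self.stack[-1], node.lineno))

    visit_ImportFrom = visit_Import


def find_nested_imports(source: str) -> list[tuple[str, int]]:
    finder = NestedImportFinder()
    finder.visit(ast.parse(source))
    return finder.hits


SAMPLE = '''\
import os

def hot_path():
    import traceback
    return traceback.format_exc()
'''

print(find_nested_imports(SAMPLE))  # [('hot_path', 4)]
```

A check like this cannot tell a hot path from a cold one; deciding which hits matter is still a manual step.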

### 2. `scripts/comprehensive_entropy_audit.py`
Full-spectrum analysis:
- Long functions (>200 lines)
- Magic numbers (3+ digits)
- TODO/FIXME comments
- Deep nesting (>20 spaces)
- Duplicate consecutive lines
- Nested imports

**Run:** `uv run python scripts/comprehensive_entropy_audit.py`
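
As an example of one of these checks, duplicate consecutive lines (the class of bug fixed in `rag_emb_provider.setter`) can be found with a few lines of plain text processing. This is a sketch under assumptions, not the script's actual code; the sample source and the attribute name `_provider` are hypothetical.

```python
def find_duplicate_consecutive_lines(source: str) -> list[int]:
    """Return 1-based numbers of non-blank lines identical to the line directly above."""
    hits = []
    previous = None
    for number, line in enumerate(source.splitlines(), start=1):
        if line.strip() and line == previous:
            hits.append(number)
        previous = line
    return hits


SAMPLE = (
    "def setter(self, value):\n"
    "    self._provider = value\n"
    "    self._provider = value\n"
)

print(find_duplicate_consecutive_lines(SAMPLE))  # [3]
```

Expect false positives from legitimately repeated lines such as closing brackets, so the output is a list of candidates to review rather than a list of bugs.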

---

## Files NOT Yet Audited (Remaining Work)

### Large Files Requiring Deep Dive:

| File | Size | Lines | Notes |
|------|------|-------|-------|
| gui_2.py | 224KB | ~4800 | Main GUI, many UI panels |
| app_controller.py | 133KB | ~3200 | Headless controller |
| ai_client.py | 100KB | ~1900 | Multi-provider AI client |
| mcp_client.py | 69KB | ~2000 | MCP tools implementation |
| api_hooks.py | 31KB | ~650 | REST API hooks |
| models.py | 20KB | ~600 | Data classes |
| file_cache.py | 26KB | ~800 | AST parsing |

### Medium Files:

| File | Size | Lines |
|------|------|-------|
| theme_2.py | 18KB | ~550 |
| theme.py | 16KB | ~500 |
| project_manager.py | 18KB | ~550 |
| aggregate.py | 17KB | ~520 |
| log_registry.py | 11KB | ~350 |

### Small Files (<300 lines):

`beads_client.py`, `bg_shader.py`, `conductor_tech_lead.py`, `cost_tracker.py`, `diff_viewer.py`, `events.py`, `gemini_cli_adapter.py`, `history.py`, `log_pruner.py`, `markdown_helper.py`, `mma_prompts.py`, `native_orchestrator.py`, `orchestrator_pm.py`, `outline_tool.py`, `patch_modal.py`, `paths.py`, `performance_monitor.py`, `personas.py`, `presets.py`, `rag_engine.py`, `session_logger.py`, `shader_manager.py`, `shaders.py`, `shell_runner.py`, `summarize.py`, `summary_cache.py`, `synthesis_formatter.py`, `theme_nerv.py`, `theme_nerv_fx.py`, `thinking_parser.py`, `tool_bias.py`, `tool_presets.py`, `workspace_manager.py`

---

## Specific Areas Needing Attention

### 1. ai_client.py (100KB, ~1900 lines)
**Potential issues:**
- Long functions (`_send_gemini` 229 lines, `_send_deepseek` 251 lines, `_send_minimax` 216 lines)
- Nested imports?
- Duplicate provider handling patterns

**Key patterns to find:**
- `def _send_` - provider-specific methods
- `def send` - main entry point (has 12 parameters!)
- `from google.` / `from anthropic` / `from deepseek` - SDK imports
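
Function lengths and parameter counts like the ones quoted above can be recomputed with `ast` rather than counted by hand. The sketch below assumes Python 3.9+; `function_metrics` and the sample functions are illustrative, not code from `ai_client.py`.

```python
import ast


def function_metrics(source: str) -> list[tuple[str, int, int]]:
    """Return (name, line_count, parameter_count) per function, longest first."""
    rows = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            a = node.args
            params = len(a.posonlyargs) + len(a.args) + len(a.kwonlyargs)
            # *args and **kwargs each count as one parameter.
            params += (a.vararg is not None) + (a.kwarg is not None)
            length = node.end_lineno - node.lineno + 1  # from the def line, decorators excluded
            rows.append((node.name, length, params))
    return sorted(rows, key=lambda row: row[1], reverse=True)


SAMPLE = '''\
def send(prompt, model, *, stream=False, **extra):
    return prompt

def _send_example(payload):
    body = dict(payload)
    body["ok"] = True
    return body
'''

for name, length, params in function_metrics(SAMPLE):
    print(f"{name}: {length} lines, {params} parameters")
```

Sorting by length puts the candidates for a closer read at the top; whether a long function is actually a problem still depends on it being linear and single-purpose.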

### 2. mcp_client.py (69KB, ~2000 lines)
**Potential issues:**
- 26 tool implementations that might have similar structure
- Nested imports for file_cache, paths, etc.

**Key patterns to find:**
- `def dispatch` - main tool dispatcher
- `def _get_symbol_node` - AST utilities
- `class StdioMCPServer` / `class ExternalMCPManager` - server management

### 3. api_hooks.py (31KB, ~650 lines)
**Potential issues:**
- `do_GET` (205 lines) and `do_POST` (350 lines) - long but likely linear
- State management via `app_state`

**Key patterns to find:**
- `def do_GET` / `def do_POST` - endpoint handlers
- `app_state` usage - global state access

### 4. file_cache.py (26KB, ~800 lines)
**Potential issues:**
- AST parsing for Python, C, C++
- Tree-sitter integration

**Key patterns to find:**
- `class ASTParser` - main parser class
- `def get_curated_view` / `def get_targeted_view` - skeleton generation

---

## Duplicate Patterns Identified (Not Bugs - By Design)

These names appear in multiple files because several modules share them; the repetition is intentional:

| Pattern | Files | Purpose |
|---------|-------|---------|
| `calculate_track_progress` | gui_2.py, project_manager.py | Progress calculation |
| `topological_sort` | app_controller.py, conductor_tech_lead.py, dag_engine.py | Dependency ordering |
| `push_mma_state` | app_controller.py, gui_2.py | State updates |
| `active_tickets` | api_hooks.py, app_controller.py, gui_2.py | Ticket list access |
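
A table like this can be regenerated by scanning for function names that are defined in more than one file. The sketch below is an assumption about how such a scan might work, not an existing script; it only finds definitions, so call sites and attributes such as `active_tickets` would need a separate text search. The file names in the example are placeholders.

```python
import ast
from collections import defaultdict


def shared_function_names(sources: dict[str, str]) -> dict[str, list[str]]:
    """Map each function name defined in two or more files to the files defining it."""
    seen: dict[str, set[str]] = defaultdict(set)
    for filename, source in sources.items():
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                seen[node.name].add(filename)
    return {name: sorted(files) for name, files in seen.items() if len(files) > 1}


EXAMPLE_SOURCES = {
    "a.py": "def topological_sort(x):\n    return x\n",
    "b.py": "def topological_sort(y):\n    return y\n\ndef only_here():\n    pass\n",
}

print(shared_function_names(EXAMPLE_SOURCES))  # {'topological_sort': ['a.py', 'b.py']}
```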

---

## Recommendations for Continuing

### High Priority:
1. **Deep audit ai_client.py** - verify no duplicate provider logic
2. **Check mcp_client.py tool implementations** - 26 tools might have copy-paste patterns
3. **Verify api_hooks state management** - `app_state` usage patterns

### Medium Priority:
4. **Review file_cache.py AST handling** - ensure tree-sitter usage is efficient
5. **Check models.py dataclasses** - verify no duplicate serialization logic
6. **Audit theme*.py files** - four theme files (theme.py, theme_2.py, theme_nerv.py, theme_nerv_fx.py) might have overlap

### Low Priority (Cosmetic Only):
- Mixed indentation in various files (4-space blocks in 1-space files)
- Import consolidation patterns

---

## Testing Commands

```powershell
# Run core tests
uv run pytest tests/test_dag_engine.py tests/test_execution_engine.py tests/test_performance_monitor.py tests/test_aggregate_flags.py tests/test_tiered_aggregation.py -v --timeout=60

# Run focused entropy audit
uv run python scripts/focused_entropy_audit.py

# Run comprehensive entropy audit
uv run python scripts/comprehensive_entropy_audit.py

# Verify syntax on modified files
python -c "import ast; ast.parse(open('src/gui_2.py', encoding='utf-8').read()); print('gui_2.py OK')"
python -c "import ast; ast.parse(open('src/app_controller.py', encoding='utf-8').read()); print('app_controller.py OK')"
python -c "import ast; ast.parse(open('src/multi_agent_conductor.py', encoding='utf-8').read()); print('multi_agent_conductor.py OK')"
```
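
The three syntax one-liners can also be folded into a single helper that reports every file instead of stopping at the first failure. This is an optional sketch; `check_syntax` is a hypothetical helper and does not exist in the repository.

```python
import ast
from pathlib import Path


def check_syntax(paths):
    """Return {path: None if the file parses, otherwise an error message}."""
    results = {}
    for path in paths:
        try:
            ast.parse(Path(path).read_text(encoding="utf-8"), filename=str(path))
            results[str(path)] = None
        except (SyntaxError, OSError) as exc:
            # Missing or unreadable files are reported rather than raised.
            results[str(path)] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    modified = [
        "src/gui_2.py",
        "src/app_controller.py",
        "src/multi_agent_conductor.py",
    ]
    for path, error in check_syntax(modified).items():
        print(f"{path}: {error or 'OK'}")
```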

---

## Key Commits in This Track

| Commit | Description |
|--------|-------------|
| 2b5185a | perf(entropy): Fix nested imports in hot paths |
| 54afbb9 | chore(entropy): Phase 5 start - fix duplicate line bug and document findings |
| f6feab9 | fix(gui): Correct indentation bug in _render_mma_dashboard that caused crash |
| 5c9948d | conductor(plan): Track complete |

Track history (including f6feab9 itself): `git log --oneline f6feab9~1..HEAD`

---

## Architecture Notes

### Ticket State Split:
```
gui_2.py (UI)        -> active_tickets: List[Dict[str, Any]]
app_controller.py    -> active_tickets: List[Dict[str, Any]]
dag_engine.py (Core) -> tickets: List[Ticket] (dataclass)
```

This is a design trade-off: dicts give the GUI flexible table binding, while typed objects serve DAG operations.
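
One way to keep the two representations aligned is a pair of explicit conversion helpers at the boundary. The sketch below is illustrative only: the `Ticket` fields (`id`, `status`, `depends_on`) and both helper names are assumptions, and the real dataclass in `dag_engine.py` may differ.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class Ticket:
    # Hypothetical fields; the real Ticket in dag_engine.py may differ.
    id: str
    status: str = "todo"
    depends_on: List[str] = field(default_factory=list)


def ticket_from_dict(row: Dict[str, Any]) -> Ticket:
    """Build a typed Ticket from a GUI-style dict, ignoring unknown keys."""
    known = {key: row[key] for key in ("id", "status", "depends_on") if key in row}
    return Ticket(**known)


def ticket_to_dict(ticket: Ticket) -> Dict[str, Any]:
    """Convert a Ticket back into a plain dict for table binding."""
    return asdict(ticket)


row = {"id": "T1", "status": "blocked", "depends_on": ["T0"], "ui_selected": True}
ticket = ticket_from_dict(row)
print(ticket_to_dict(ticket))
```

Routing every conversion through one pair of functions gives a single place to notice when the dict keys and the dataclass fields drift apart.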

### Blocking Logic Split:
```
gui_2.py:      _cb_block_ticket(), _cb_unblock_ticket() - manual while loops
dag_engine.py: cascade_blocks() - transitive propagation
```

The two implementations can diverge in state if they are not kept synchronized.
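
For reference, transitive block propagation can be expressed as a breadth-first walk over reversed dependency edges. This is a generic sketch of the technique, not the `cascade_blocks()` implementation in `dag_engine.py`; the function name and the `depends_on` mapping shape are assumptions.

```python
from collections import deque
from typing import Dict, List, Set


def cascade_blocks_sketch(blocked: Set[str], depends_on: Dict[str, List[str]]) -> Set[str]:
    """Return `blocked` plus every ticket that transitively depends on a blocked one."""
    # Invert the edges: dependency -> tickets that depend on it.
    dependents: Dict[str, List[str]] = {}
    for ticket, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(ticket)

    result = set(blocked)
    queue = deque(blocked)
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in result:  # visit each ticket once, so cycles terminate
                result.add(child)
                queue.append(child)
    return result


deps = {"A": [], "B": ["A"], "C": ["B"], "D": []}
print(sorted(cascade_blocks_sketch({"A"}, deps)))  # ['A', 'B', 'C']
```

If the GUI callbacks and the DAG both called one routine like this, there would be only one place where blocking rules could change.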

---

## Next Session Checklist

- [ ] Deep audit ai_client.py for duplicate provider patterns
- [ ] Review mcp_client.py tool implementations (26 tools)
- [ ] Check api_hooks.py state management
- [ ] Verify file_cache.py AST handling efficiency
- [ ] Review models.py serialization consistency
- [ ] Audit theme files for overlap
- [ ] Run full test suite to verify no regressions
- [ ] Update plan.md with Phase 5 status