Single-session planning digest that captures:
- The 5 tracks fully specced + planned (test_batching, qwen_llama_grok,
data_oriented_error_handling, data_structure_strengthening,
mcp_architecture_refactor)
- Cross-cutting design themes (data-oriented, audit-driven, per-track
commit + git note, out-of-scope-by-default)
- The audit + data foundation (scripts/audit_weak_types.py; 430 -> 60
finding; 0 strong patterns; 26 unique type strings; 86% concentrated
in 6 files)
- The dependency graph + recommended execution order
- Follow-up tracks already planned in spec §12.1 of each track
- Recommended future tracks (post-tracks documentation is the top pick)
- Risks, open questions, and a complete file index
This is the kind of reference document that:
- Future planners consult to understand the codebase's current state
- The implementing agent uses to coordinate across tracks
- The user reviews as a digest of the planning work
Written in the project's docs/reports/ directory alongside the existing
Phase 5 reports (PHASE5_STABILISATION_REPORT.md, MUTATION_MATRIX_PHASE5.md, etc.).
Phase 1, Tasks T1.2 + T1.4 of the startup_speedup_20260606 track.
NEW: scripts/audit_main_thread_imports.py
Static CI gate that AST-walks the import graph reachable from
sloppy.py and fails (exit 1) if any heavy module is imported at the
top of a main-thread-reachable file. Walks into if/elif/else and
try/except branches (which run at import time) but skips function
bodies (which only run when called). Allowlist: stdlib + the lean
gui_2 skeleton (imgui_bundle, defer, src.imgui_scopes, src.theme_2,
src.theme_models, src.paths, src.models, src.events).
NEW: scripts/audit_gui2_imports.py
Read-only analysis tool that lists every top-level and function-level
import in src/gui_2.py, classified by location. Used in Phase 5D to
identify which imports to remove.
NEW: tests/test_audit_main_thread_imports.py
9 tests covering: --help exits 0, clean stdlib-only passes, heavy
third-party fails, google.genai fails, transitive walks, function-
body imports ignored, if-branch imports flagged, try-block imports
flagged, file:line reported. All 9 pass.
NEW: docs/reports/startup_baseline_20260606.txt
3-run median cold-start benchmark. Worst offenders: src.gui_2
(1770ms), simulation.user_agent (1517ms), google.genai (1001ms),
openai (482ms), anthropic (441ms), imgui_bundle (255ms),
src.theme_nerv* (485ms combined), src.markdown_table (243ms),
src.command_palette (242ms).
NEW: docs/reports/startup_audit_20260606.txt
Audit output on the CURRENT codebase. Reports 67 violations across
the main-thread import graph (incl. numpy in src/gui_2.py:9,
tomli_w in src/gui_2.py:18, fastapi + requests in src/app_controller,
tree_sitter_* in src/file_cache, pydantic in src/models, plus all
the src.* subsystem imports that drag in heavy transitive deps).
Phase 3-5 of the track will resolve these one by one.
After Phase 3-5, this audit must exit 0 (no violations).
Co-located reports in docs/reports/ per project convention; the other
agent finished their work in docs/superpowers/ and is unrelated.
Comprehensive guide covering the 251-test-file suite:
- Test file layout and naming conventions
- 7 conftest.py fixtures (isolate_workspace, reset_paths, reset_ai_client, vlogger, kill_process_tree, mock_app, app_instance, live_gui) with their mechanisms
- 5 test categories (unit, integration, mock app, headless, opt-in)
- Markers (integration, clean_install, docker) and how to filter by them
- Hook API for integration tests (ApiHookClient methods, predefined_callbacks pattern)
- Common test patterns (pure function, mock, live_gui, exception, parametrized)
- Test configuration in pyproject.toml
- Running tests (all, by file, by marker, with timeout, etc.)
- Adding new tests (pure, integration, opt-in)
- Debugging failed tests (common failure modes and fixes)
- The check_test_toml_paths.py audit script
- Test data flow diagram
- Updated to reflect 13 tests (6 unit + 7 live_gui) instead of hypothetical async test
- Removed Everything mode and async context preview sections (not yet implemented; marked as future work)
- Updated Commands Registry section to reference actual src/commands.py file
- Added Implementation section with file layout and Command/CommandRegistry/CommandModal reference
- Added Built-in Commands table reflecting the actual 11 commands shipped
- Added Adding Custom Commands section with decorator and explicit-Command patterns
- Added Keyboard Reference table
- Updated Testing section with accurate coverage and test pattern
- Moved unimplemented features (Everything mode, user-defined commands, plugin system) to Future Work