Re-audit after reading the actual full file contents:
1. guide_app_controller.md (the __init__ walkthrough):
- '4-thread ThreadPoolExecutor' -> '8-thread' per IO_POOL_MAX_WORKERS = 8
in src/io_pool.py:20 (bumped from 4 in commit 4a338486; the io_pool.py
module docstring is also stale and says '4 worker threads' - flagged
for a separate fix).
- '12 locks' -> '11 locks + 5 non-lock state fields' (re-counted the
threading.Lock() and the _rag_sync_*/_project_switch_* fields).
2. guide_app_controller.md (the closing line):
- '12 locks' -> removed; explained the 434-line __init__ body
composition (locks + state fields + settable_fields + gui_task_handlers).
3. guide_rag.md (Future Work section):
- 'The _search_mcp method is a placeholder for this' -> WRONG.
_search_mcp (src/rag_engine.py:322) IS a real implementation that
calls mcp_client.async_dispatch when vector_store.provider == 'mcp'.
Rewrote the future-work item to describe the actual mechanism.
4. docs/reports/docs_sync_test_era_20260610.md (the closing report):
- Same 4-thread->8 and 12-locks->11 corrections propagated.
The structural facts (WorkspaceProfile/RAGConfig/VectorStoreConfig field
lists, method existence, _init_actions/_load_active_project line
numbers, _LiveGuiHandle existence, etc.) were all correct. The
counting/threading-pool claims I cited from memory were the ones
that needed re-verification.
Test-Era Docs Sync — Closing Report (2026-06-10)
Track: docs_sync_test_era_20260610
Date: 2026-06-10
Status: COMPLETE — all 4 phases shipped, 0 new audit violations, 17 atomic commits
Summary
End-state cleanup of the 4-day test-hell saga (regression_fixes → test_infrastructure_hardening → mma_tier_usage_reset_fix → rag_phase4_sync_fix → workspace_path_finalize) plus a full docs sync against the git diff baseline f93dac7d (2026-06-02 comprehensive docs refresh). Result: 11 doc files with drift fixed, 4 tracks properly archived, 4 lessons placed in durable locations. The next Tier 2 agent engaging qwen_llama_grok_integration_20260606 has pristine context to read.
Commits (17 atomic, in chronological order)
Phase 1: Doc drift fixes (10 commits, 11 doc files)
- `d82153c0` docs(models): sync WorkspaceProfile dataclass to 4-field model
- `7f58f980` docs(readme): fix WorkspaceProfile description + gui_2 line refs
- `f973fb27` docs(workspace_profiles): fix WorkspaceProfile schema
- `5aa19e59` docs(rag): sync with src/rag_engine.py (collection attr, chroma path, dim validation)
- `c5010356` docs(gui_2): getattr hasattr-guard + startup architecture section
- `ca48d33d` docs(simulations): update live_gui fixture signature to _LiveGuiHandle
- `07c1ed49` docs(ai_client+api_hooks): lazy-loading + warmup endpoints (startup_speedup)
- `5fa8a10e` docs(testing): critical live_gui_workspace path fix + 8 new sections
- `2e12b266` docs(mcp_client+ai_client): correct tool counts (15→18, 45→46)
- `237f5725` docs(app_controller): replace fictional init + register_hooks with real flow
Phase 2: End-state cleanup (4 commits)
- `1ea38ad1` conductor(track): close 4 test-hell lineage tracks (state + metadata)
- `5d262452` conductor(archive): move 4 test-hell lineage tracks to archive/
- `3945fe37` conductor(tracks): archive test_infrastructure_hardening_20260609 in tracks.md
- `f0b7c8b7` conductor(index): add Test Infrastructure Hardening to Recently Shipped
Phase 3: Lessons capture (3 commits)
- `01ea22fc` docs(styleguide): add chroma_cache.md (chroma DB path and cleanup pattern)
- `965e0157` docs(workflow): add 3 test-hell lessons to Known Pitfalls + Live_gui Test Fragility
- `72b23745` docs(guidelines): add Testing Requirements section with 4 standards
What Was Fixed (by file)
Critical fixes (~20 items)
| File | Critical Fix |
|---|---|
| `guide_workspace_profiles.md` | 3 field renames: `docking_layout`→`ini_content`, `window_visibility`→`show_windows`, `panel_state`→`panel_states`; removed 4 fictional fields (`theme`, `theme_fx_enabled`, `captured_at`, `description`); updated TOML example |
| `guide_models.md` | WorkspaceProfile class + removed fictional LayoutPreset |
| `guide_rag.md` | Chroma path `.rag/chroma/`→`.slop_cache/chroma_<name>/`; `self.vector_store`→`self.collection`; `vector_store_backend`→`vector_store.provider`; new VectorStoreConfig nested dataclass; new §Dimension Mismatch Protection |
| `guide_gui_2.md` | `__getattr__` code example updated to the `bcdc26d0` fixed version (with hasattr guard); new §Startup Architecture section |
| `guide_simulations.md` | live_gui fixture signature `Generator[tuple[...], ...]`→`Generator["_LiveGuiHandle", ...]`; new xdist coordination paragraph |
| `guide_ai_client.md` | New §Module-Level Imports explaining the `_require_warmed` lazy-loading pattern |
| `guide_api_hooks.md` | 4 new warmup endpoints added (`/api/warmup_status`, `/api/warmup_wait`, `/api/warmup_canaries`, `/api/startup_timeline`); new §Warmup API section |
| `guide_testing.md` | CRITICAL: `tmp_path_factory` (banned) → `tests/artifacts/live_gui_workspace_<timestamp>` (per-run) for the live_gui_workspace fixture; 8 new sections (Watchdog, Chroma Cache, xdist, Dependencies Gate, MMA/RAG reset_session, etc.) |
| `guide_mcp_client.md` | Tool count 45→46, Python AST 15→18; added 4 structural mutator tools (`py_remove_def`, `py_add_def`, `py_move_def`, `py_region_wrap`) |
| `guide_app_controller.md` | Fictional AppState dataclass + register_hooks method + enable_test_hooks param removed; real `__init__` flow documented (timeline anchors, 11 locks + 5 non-lock state fields, GUI health state, 8-thread io_pool, warmup manager) |
| `Readme.md` | WorkspaceProfile description + guide_gui_2 line refs updated |
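The `guide_testing.md` fix above replaces `tmp_path_factory` with a per-run directory under `tests/artifacts/`. A minimal sketch of how such a path could be derived follows. The helper names and the timestamp format are assumptions for illustration, not the project's actual fixture code.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def live_gui_workspace_path(artifacts_root: Path, now: datetime) -> Path:
    """Build the per-run workspace path (hypothetical helper)."""
    # Assumed timestamp format; the real fixture may use a different one.
    stamp = now.strftime("%Y%m%d_%H%M%S")
    return artifacts_root / f"live_gui_workspace_{stamp}"


def create_live_gui_workspace(artifacts_root: Path, now: Optional[datetime] = None) -> Path:
    """Create the per-run workspace directory and return it (hypothetical helper)."""
    workspace = live_gui_workspace_path(artifacts_root, now or datetime.now())
    # exist_ok=False surfaces an accidental collision between two runs.
    workspace.mkdir(parents=True, exist_ok=False)
    return workspace
```

Keeping the path computation separate from directory creation makes the naming rule checkable without touching the filesystem.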
End-state cleanup (4 tracks archived)
- `test_infrastructure_hardening_20260609` → `conductor/archive/`.
  - `state.toml`: status active→completed, last_updated 2026-06-09→2026-06-10, all 12 t7_/t8_ tasks marked complete with commit SHAs.
  - `metadata.json`: status spec→shipped. 8 phases, 60+ tasks, 314/314 tests green.
- `mma_tier_usage_reset_fix_20260610` → `conductor/archive/`.
  - `metadata.json`: status spec→shipped. 4 controller bug fixes (mma_tier_usage pre-population, _flush_to_project defensive get, context_preset_manager init, persona_manager getattr fix).
- `rag_phase4_sync_fix_20260610` → `conductor/archive/`.
  - `metadata.json`: status spec→shipped. 4-part RAG root cause fix (rag_config reset to default RAGConfig, not None; assertion accepts either file's content; entry polling race; chroma cache cleanup).
- `workspace_path_finalize_20260609` → `conductor/archive/`.
  - `state.toml`: status active→completed, current_phase 1→complete, all 6 tasks marked complete (`c725270b`, `93ec2809`).
  - `metadata.json`: status spec→shipped.
tracks.md and index.md updates
- Row 1 of Active Tracks table removed (Test Infrastructure Hardening is no longer active)
- Rows 2-5, 17: `test_infrastructure_hardening_20260609` → `(merged)`
- Phase 6+ "Test Infrastructure Hardening" entry marked `[COMPLETE 2026-06-10] [archived]`, link updated to `./archive/test_infrastructure_hardening_20260609/`
- `conductor/index.md` "Recently Shipped" gets a new top entry linking to the archive + closing report
Lessons capture (4 lessons placed in durable locations)
| Lesson | Destination |
|---|---|
| 1. Isolated-Pass Verification Fallacy | conductor/product-guidelines.md §Testing Requirements (new) + cross-link to conductor/workflow.md §Isolated-Pass Verification Fallacy (existed) + AGENTS.md (existed) |
| 2. HARD BAN on `git checkout -- <file>` / `git restore` / `git reset` | conductor/workflow.md §Known Pitfalls (new subsection) + cross-link to AGENTS.md (existed) |
| 3. `push_event` + `time.sleep(N)` + assert race | conductor/workflow.md §Live_gui Test Fragility (new subsection) + cross-link to docs/guide_testing.md §Authoring Robust live_gui Tests (existed) |
| 4. Production diag logging must be removed | No change; already in AGENTS.md + workflow.md |
| 5. Chroma cache lives at tests/artifacts/.slop_cache/ | NEW conductor/code_styleguides/chroma_cache.md |
| 6. Async setters need poll-for-state | conductor/workflow.md §Live_gui Test Fragility (new subsection) + cross-link to docs/guide_testing.md §MMA and RAG State in reset_session() (new in this track) |
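Lessons 3 and 6 both come down to replacing a fixed `time.sleep(N)` followed by an assert with a bounded poll on the state being waited for. A minimal sketch of such a helper is shown here. The name `wait_for` and its default timings are assumptions for illustration, not the project's actual test utilities.

```python
import time
from typing import Callable


def wait_for(predicate: Callable[[], bool], timeout: float = 5.0, interval: float = 0.05) -> bool:
    """Poll `predicate` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            # Give up; the caller decides how to report the failure.
            return False
        time.sleep(interval)
```

A test would then write `assert wait_for(lambda: controller.is_ready)` instead of sleeping and asserting once, where `controller.is_ready` stands in for whatever state the async setter eventually updates.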
Verification
Audit scripts (all 4 pass; no new violations)
- `scripts/check_test_toml_paths.py`: 9 pre-existing false-positives in test mock content (not from this track; the audit script flags string literals containing `'tests/artifacts/...'` in mock setup). No new violations.
- `scripts/audit_main_thread_imports.py`: `OK: 15 files in main-thread import graph; no heavy top-level imports.`
- `scripts/audit_weak_types.py`: pre-existing weak types in `src/log_registry.py` (7 findings). No new violations from doc changes (this track is docs-only, no `src/` modifications).
- `scripts/audit_no_models_config_io.py`: `OK - no violations found.`
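To illustrate why mock content can trip a literal-based check, here is a rough sketch of a scanner that flags every string constant containing a path fragment. It is not the code of `scripts/check_test_toml_paths.py`. The function name and the default needle are assumptions, and the real script may work differently.

```python
import ast


def find_path_literals(source: str, needle: str = "tests/artifacts/") -> list[tuple[int, str]]:
    """Return (line, literal) for every string constant containing `needle`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # Any string constant counts, including text embedded in mock TOML.
        if isinstance(node, ast.Constant) and isinstance(node.value, str) and needle in node.value:
            hits.append((node.lineno, node.value))
    return sorted(hits)
```

A scanner of this shape cannot tell a real workspace path from a path that only appears inside mock data, which matches the kind of false positive described above.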
Path verification
- `conductor/archive/test_infrastructure_hardening_20260609/spec.md` ✓
- `conductor/archive/mma_tier_usage_reset_fix_20260610/spec.md` ✓
- `conductor/archive/rag_phase4_sync_fix_20260610/spec.md` ✓
- `conductor/code_styleguides/chroma_cache.md` ✓ (new)
Cross-link verification (spot-check)
- `tracks.md` → `./archive/test_infrastructure_hardening_20260609/` ✓ (path resolves)
- `index.md` → `./archive/test_infrastructure_hardening_20260609/` ✓
- `docs/Readme.md` → `guide_gui_2.md` updated line refs ✓
- All other `guide_*.md` cross-links unchanged (no new cross-links added; only existing ones updated)
Out of Scope (deferred to next agent)
- Other "Active" tracks (manual_ux_validation_20260608, ui_polish_five_issues, gencpp_dogfood_feedback_20260510, etc.) — not test-hell lineage
- Migrating any source code
- Creating new audit scripts
- `qwen_llama_grok` planning: separate session
- The 9 pre-existing `check_test_toml_paths.py` false-positives in test mock content
- The 7 pre-existing weak-type findings in `src/log_registry.py`
What the Next Tier 2 Will See
When the next agent engages qwen_llama_grok_integration_20260606:
- `conductor/tracks.md` is clean: qwen is the top of the Active table with `test_infrastructure_hardening_20260609 (merged)` in the Blocked By column
- `docs/guide_rag.md` documents the actual chroma path (no misleading `.rag/chroma/`)
- `docs/guide_testing.md` has all 8 new sections they need to write robust live_gui tests
- `docs/guide_gui_2.md` has the Startup Architecture section explaining warmup/lazy imports
- `docs/guide_app_controller.md` has the real (not fictional) `__init__` flow
- `docs/guide_api_hooks.md` has the 4 warmup endpoints + client methods
- `docs/Readme.md` and `docs/guide_workspace_profiles.md` reflect the 4-field WorkspaceProfile model
- `conductor/code_styleguides/chroma_cache.md` exists for any chroma-touching code
- `conductor/code_styleguides/workspace_paths.md` exists for test workspace paths
- `conductor/workflow.md` has the 3 new lessons (HARD BAN, time.sleep race, async setters)
- `conductor/product-guidelines.md` has the new Testing Requirements section
The next agent can read any of these docs and trust they're current as of 2026-06-10.
See Also
- test_infrastructure_hardening_batch_green_20260610.md — the closing report for the test-hell saga
- test_bed_health_20260609.md — the test bed health summary (Phase 7 of test_infrastructure_hardening)
- agile_dispatch_20260610.md — the session diary (if present)