ed bb1aa3e03c docs: fix 3 more unverified claims (4-thread->8, 12 locks->11, _search_mcp real)

Re-audit after reading the actual full file contents:

1. guide_app_controller.md (the __init__ walkthrough):
   - '4-thread ThreadPoolExecutor' -> '8-thread' per IO_POOL_MAX_WORKERS = 8
     in src/io_pool.py:20 (bumped from 4 in commit 4a338486; the io_pool.py
     module docstring is also stale and says '4 worker threads' - flagged
     for a separate fix).
   - '12 locks' -> '11 locks + 5 non-lock state fields' (re-counted the
     threading.Lock() and the _rag_sync_*/_project_switch_* fields).

2. guide_app_controller.md (the closing line):
   - '12 locks' -> removed; explained the 434-line __init__ body
     composition (locks + state fields + settable_fields + gui_task_handlers).

3. guide_rag.md (Future Work section):
   - 'The _search_mcp method is a placeholder for this' -> WRONG.
     _search_mcp (src/rag_engine.py:322) IS a real implementation that
     calls mcp_client.async_dispatch when vector_store.provider == 'mcp'.
     Rewrote the future-work item to describe the actual mechanism.

4. docs/reports/docs_sync_test_era_20260610.md (the closing report):
   - Same 4-thread->8 and 12-locks->11 corrections propagated.

The structural facts (WorkspaceProfile/RAGConfig/VectorStoreConfig field
lists, method existence, _init_actions/_load_active_project line
numbers, _LiveGuiHandle existence, etc.) were all correct. The
counting/threading-pool claims I cited from memory were the ones
that needed re-verification.

2026-06-10 20:49:20 -04:00


Test-Era Docs Sync — Closing Report (2026-06-10)

Track: docs_sync_test_era_20260610
Date: 2026-06-10
Status: COMPLETE (all 4 phases shipped, 0 new audit violations, 17 atomic commits)

Summary

End-state cleanup of the 4-day test-hell saga (regression_fixes → test_infrastructure_hardening → mma_tier_usage_reset_fix → rag_phase4_sync_fix → workspace_path_finalize) plus a full docs sync against the git diff baseline f93dac7d (2026-06-02 comprehensive docs refresh). Result: 11 doc files with drift fixed, 4 tracks properly archived, 4 lessons placed in durable locations. The next Tier 2 agent engaging qwen_llama_grok_integration_20260606 has pristine context to read.

Commits (17 atomic, in chronological order)

Phase 1: Doc drift fixes (10 commits, 11 doc files)

d82153c0 docs(models): sync WorkspaceProfile dataclass to 4-field model
7f58f980 docs(readme): fix WorkspaceProfile description + gui_2 line refs
f973fb27 docs(workspace_profiles): fix WorkspaceProfile schema
5aa19e59 docs(rag): sync with src/rag_engine.py (collection attr, chroma path, dim validation)
c5010356 docs(gui_2): getattr hasattr-guard + startup architecture section
ca48d33d docs(simulations): update live_gui fixture signature to _LiveGuiHandle
07c1ed49 docs(ai_client+api_hooks): lazy-loading + warmup endpoints (startup_speedup)
5fa8a10e docs(testing): critical live_gui_workspace path fix + 8 new sections
2e12b266 docs(mcp_client+ai_client): correct tool counts (15→18, 45→46)
237f5725 docs(app_controller): replace fictional init + register_hooks with real flow

Phase 2: End-state cleanup (4 commits)

1ea38ad1 conductor(track): close 4 test-hell lineage tracks (state + metadata)
5d262452 conductor(archive): move 4 test-hell lineage tracks to archive/
3945fe37 conductor(tracks): archive test_infrastructure_hardening_20260609 in tracks.md
f0b7c8b7 conductor(index): add Test Infrastructure Hardening to Recently Shipped

Phase 3: Lessons capture (3 commits)

01ea22fc docs(styleguide): add chroma_cache.md — chroma DB path and cleanup pattern
965e0157 docs(workflow): add 3 test-hell lessons to Known Pitfalls + Live_gui Test Fragility
72b23745 docs(guidelines): add Testing Requirements section with 4 standards

What Was Fixed (by file)

Critical fixes (~20 items)

| File | Critical Fix |
| --- | --- |
| `guide_workspace_profiles.md` | 3 field renames: `docking_layout`→`ini_content`, `window_visibility`→`show_windows`, `panel_state`→`panel_states`; removed 4 fictional fields (`theme`, `theme_fx_enabled`, `captured_at`, `description`); updated TOML example |
| `guide_models.md` | WorkspaceProfile class + removed fictional `LayoutPreset` |
| `guide_rag.md` | Chroma path `.rag/chroma/`→`.slop_cache/chroma_<name>/`; `self.vector_store`→`self.collection`; `vector_store_backend`→`vector_store.provider`; new `VectorStoreConfig` nested dataclass; new §Dimension Mismatch Protection |
| `guide_gui_2.md` | `__getattr__` code example updated to `bcdc26d0` fixed version (with `hasattr` guard); new §Startup Architecture section |
| `guide_simulations.md` | `live_gui` fixture signature `Generator[tuple[...], ...]`→`Generator["_LiveGuiHandle", ...]`; new xdist coordination paragraph |
| `guide_ai_client.md` | New §Module-Level Imports explaining `_require_warmed` lazy-loading pattern |
| `guide_api_hooks.md` | 4 new warmup endpoints added (`/api/warmup_status`, `/api/warmup_wait`, `/api/warmup_canaries`, `/api/startup_timeline`); new §Warmup API section |
| `guide_testing.md` | CRITICAL: `tmp_path_factory` (banned) → `tests/artifacts/live_gui_workspace_<timestamp>` (per-run) for `live_gui_workspace` fixture; 8 new sections (Watchdog, Chroma Cache, xdist, Dependencies Gate, MMA/RAG reset_session, etc.) |
| `guide_mcp_client.md` | Tool count 45→46, Python AST 15→18; added 4 structural mutator tools (`py_remove_def`, `py_add_def`, `py_move_def`, `py_region_wrap`) |
| `guide_app_controller.md` | Fictional `AppState` dataclass + `register_hooks` method + `enable_test_hooks` param removed; real `__init__` flow documented (timeline anchors, 11 locks + 5 non-lock state fields, GUI health state, 8-thread io_pool, warmup manager) |
| `Readme.md` | WorkspaceProfile description + guide_gui_2 line refs updated |
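
The `guide_testing.md` row replaces `tmp_path_factory` with a per-run directory named `tests/artifacts/live_gui_workspace_<timestamp>`. A minimal sketch of building such a path follows. The helper name `make_live_gui_workspace` and the `%Y%m%d_%H%M%S` timestamp format are assumptions for illustration; the real fixture may format the suffix differently.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def make_live_gui_workspace(artifacts_root: Path, now: Optional[datetime] = None) -> Path:
    """Create and return <artifacts_root>/live_gui_workspace_<timestamp>.

    `now` is injectable so the resulting name is deterministic in tests.
    """
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")  # assumed format
    workspace = artifacts_root / f"live_gui_workspace_{stamp}"
    workspace.mkdir(parents=True, exist_ok=True)
    return workspace
```

In a fixture this would be called once per run with `Path("tests/artifacts")` as the root, so every run gets its own directory under the repository instead of a system temp location.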

End-state cleanup (4 tracks archived)

test_infrastructure_hardening_20260609 → conductor/archive/. state.toml: status active→completed, last_updated 2026-06-09→2026-06-10, all 12 t7_/t8_ tasks marked complete with commit SHAs. metadata.json: status spec→shipped. 8 phases, 60+ tasks, 314/314 tests green.
mma_tier_usage_reset_fix_20260610 → conductor/archive/. metadata.json: status spec→shipped. 4 controller bug fixes (mma_tier_usage pre-population, _flush_to_project defensive get, context_preset_manager init, persona_manager getattr fix).
rag_phase4_sync_fix_20260610 → conductor/archive/. metadata.json: status spec→shipped. 4-part RAG root cause fix (rag_config reset to default RAGConfig, not None; assertion accepts either file's content; entry polling race; chroma cache cleanup).
workspace_path_finalize_20260609 → conductor/archive/. state.toml: status active→completed, current_phase 1→complete, all 6 tasks marked complete (c725270b, 93ec2809). metadata.json: status spec→shipped.

`tracks.md` and `index.md` updates

Row 1 of Active Tracks table removed (Test Infrastructure Hardening is no longer active)
Rows 2-5 and 17: Blocked By entry test_infrastructure_hardening_20260609 → test_infrastructure_hardening_20260609 (merged)
Phase 6+ "Test Infrastructure Hardening" entry marked [COMPLETE 2026-06-10] [archived], link updated to ./archive/test_infrastructure_hardening_20260609/
conductor/index.md "Recently Shipped" gets a new top entry linking to the archive + closing report

Lessons capture (4 lessons placed in durable locations)

| Lesson | Destination |
| --- | --- |
| 1. Isolated-Pass Verification Fallacy | `conductor/product-guidelines.md` §Testing Requirements (new) + cross-link to `conductor/workflow.md §Isolated-Pass Verification Fallacy` (existed) + AGENTS.md (existed) |
| 2. HARD BAN on `git checkout -- <file>` / `git restore` / `git reset` | `conductor/workflow.md` §Known Pitfalls (new subsection) + cross-link to AGENTS.md (existed) |
| 3. `push_event` + `time.sleep(N)` + `assert` race | `conductor/workflow.md` §Live_gui Test Fragility (new subsection) + cross-link to `docs/guide_testing.md §Authoring Robust live_gui Tests` (existed) |
| 4. Production diag logging must be removed | No change: already in AGENTS.md + workflow.md |
| 5. Chroma cache lives at `tests/artifacts/.slop_cache/` | NEW `conductor/code_styleguides/chroma_cache.md` |
| 6. Async setters need poll-for-state | `conductor/workflow.md` §Live_gui Test Fragility (new subsection) + cross-link to `docs/guide_testing.md §MMA and RAG State in reset_session()` (new in this track) |
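
Lessons 3 and 6 describe the same fix: replace a fixed `time.sleep(N)` before an `assert` with polling until the state is actually reached. A minimal sketch of such a helper is below. The names `wait_until` and `demo` are hypothetical, and the background `threading.Timer` merely stands in for an asynchronous setter; this is not the project's own helper.

```python
import threading
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool], timeout: float = 5.0, interval: float = 0.05) -> bool:
    """Poll `predicate` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def demo() -> bool:
    """Simulate an async setter that lands about 0.1 s after the event is pushed."""
    state = {"ready": False}
    threading.Timer(0.1, lambda: state.update(ready=True)).start()
    # A bare `assert state["ready"]` here would race with the timer.
    return wait_until(lambda: state["ready"], timeout=2.0)
```

The test then asserts on the return value of `wait_until`, so it passes as soon as the state arrives and fails with a clear timeout when it never does.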

Verification

Audit scripts (all 4 pass; no new violations)

scripts/check_test_toml_paths.py — 9 pre-existing false-positives in test mock content (not from this track; the audit script flags string literals containing 'tests/artifacts/...' in mock setup). No new violations.
scripts/audit_main_thread_imports.py — OK: 15 files in main-thread import graph; no heavy top-level imports.
scripts/audit_weak_types.py — pre-existing weak types in src/log_registry.py (7 findings). No new violations from doc changes (this track is docs-only, no src/ modifications).
scripts/audit_no_models_config_io.py — OK - no violations found.
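
For readers unfamiliar with this style of audit, the sketch below shows the general idea behind a "no heavy top-level imports" check using Python's `ast` module. It is not the code of `scripts/audit_main_thread_imports.py`, which this report does not show; the function name and the set of "heavy" modules are assumptions chosen for illustration.

```python
import ast
from typing import List, Set


def heavy_top_level_imports(source: str, heavy: Set[str]) -> List[str]:
    """Return module names imported at module level whose root package is in `heavy`."""
    tree = ast.parse(source)
    found: List[str] = []
    for node in tree.body:  # module-level statements only; function bodies are skipped
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]  # relative `from . import x` has no module
        else:
            continue
        found.extend(name for name in names if name.split(".")[0] in heavy)
    return found


# Hypothetical module source: the import inside f() is lazy and is not flagged.
SOURCE = "import os\nimport chromadb\ndef f():\n    import chromadb\n"
print(heavy_top_level_imports(SOURCE, {"chromadb"}))  # prints ['chromadb']
```

Only imports in the module body are flagged, which matches the lazy-loading pattern where heavy modules are imported inside functions.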

Path verification

conductor/archive/test_infrastructure_hardening_20260609/spec.md ✓
conductor/archive/mma_tier_usage_reset_fix_20260610/spec.md ✓
conductor/archive/rag_phase4_sync_fix_20260610/spec.md ✓
conductor/code_styleguides/chroma_cache.md ✓ (new)

Cross-link verification (spot-check)

tracks.md → ./archive/test_infrastructure_hardening_20260609/ ✓ (path resolves)
index.md → ./archive/test_infrastructure_hardening_20260609/ ✓
docs/Readme.md → guide_gui_2.md updated line refs ✓
All other guide_*.md cross-links unchanged (no new cross-links added; only existing ones updated)
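
The spot-check above was done by hand. A small script can apply the same "path resolves" test to every relative link in a markdown file. The sketch below is a simplified illustration with a hypothetical function name; it ignores links that carry a title string and does not validate `#anchor` fragments.

```python
import re
from pathlib import Path
from typing import List

# Captures the target of [text](target) or [text](target#fragment).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s#]+)(?:#[^)]*)?\)")
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")  # http:, https:, mailto:, ...


def broken_relative_links(md_path: Path) -> List[str]:
    """Return relative link targets in `md_path` that do not exist on disk."""
    text = md_path.read_text(encoding="utf-8")
    broken: List[str] = []
    for target in LINK_RE.findall(text):
        if SCHEME_RE.match(target):
            continue  # external URL, out of scope for a path check
        if not (md_path.parent / target).exists():
            broken.append(target)
    return broken
```

Calling it on `conductor/tracks.md` and `conductor/index.md` would return an empty list when every archive link resolves.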

Out of Scope (deferred to next agent)

Other "Active" tracks (manual_ux_validation_20260608, ui_polish_five_issues, gencpp_dogfood_feedback_20260510, etc.) — not test-hell lineage
Migrating any source code
Creating new audit scripts
qwen_llama_grok planning — separate session
The 9 pre-existing check_test_toml_paths.py false-positives in test mock content
The 7 pre-existing weak-type findings in src/log_registry.py

What the Next Tier 2 Will See

When the next agent engages qwen_llama_grok_integration_20260606:

conductor/tracks.md is clean: qwen is the top of the Active table with test_infrastructure_hardening_20260609 (merged) in the Blocked By column
docs/guide_rag.md documents the actual chroma path (no misleading .rag/chroma/)
docs/guide_testing.md has all 8 new sections they need to write robust live_gui tests
docs/guide_gui_2.md has the Startup Architecture section explaining warmup/lazy imports
docs/guide_app_controller.md has the real (not fictional) __init__ flow
docs/guide_api_hooks.md has the 4 warmup endpoints + client methods
docs/Readme.md and docs/guide_workspace_profiles.md reflect the 4-field WorkspaceProfile model
conductor/code_styleguides/chroma_cache.md exists for any chroma-touching code
conductor/code_styleguides/workspace_paths.md exists for test workspace paths
conductor/workflow.md has the 3 new lessons (HARD BAN, time.sleep race, async setters)
conductor/product-guidelines.md has the new Testing Requirements section

The next agent can read any of these docs and trust they're current as of 2026-06-10.
