Gitea (and any case-sensitive filesystem) was rendering the [Top]
nav links in /docs as broken because of two bugs:
1. Case-sensitivity: 22 links used '../README.md' (all-uppercase)
but the actual file is 'docs/Readme.md' (capital R, lowercase
rest). 21 guide_*.md nav bars were affected, plus 1 internal
cross-link in Readme.md itself. Works on Windows (case-
insensitive) but broken on Linux/Gitea.
Fix: 22 occurrences across 22 files changed
'../README.md' -> '../Readme.md'
2. Wrong relative-path level: 16 links used '../../conductor/...'
from 'docs/guide_*.md' to reach 'conductor/'. This goes up 2
levels to 'projects/', which doesn't exist. The correct path
from 'docs/guide_*.md' to 'conductor/' is 1 level up
('../conductor/...'). 12 unique patterns across 10 files
affected.
Fix: 16 occurrences across 10 files changed
'../../conductor/' -> '../conductor/'
3. Bonus: 1 planned-guide link in guide_context_curation.md
referenced a never-written 'guide_context_presets.md'. The
ContextPreset schema is now fully covered in the new
'guide_context_aggregation.md' (per the 2026-06-08 docs
refresh). Fix: link target updated.
No content was changed, only link paths. 24 files, 37 link
replacements, 37 deletions.
Verification:
- All .md links in docs/ now resolve to existing files
(validated by path-resolution check from each file's directory)
- The 3 new guides from the previous docs refresh commit
(guide_discussions.md, guide_state_lifecycle.md,
guide_context_aggregation.md) had the case bug inherited from
guide_architecture.md's existing nav pattern; their top-of-file
nav bars are now correct
- The 21 pre-existing guide nav bars that had the same bug
(all 21 of them, except the 3 that used the correct case:
guide_mma.md, guide_simulations.md, guide_tools.md) are now
also fixed
- Inter-guide links (e.g. [Discussions](guide_discussions.md))
were not affected; they were always correct because both the
link text and the actual filename are lowercase
This is a docs-only fix. No code modified.
Per the docs Refresh Protocol (conductor/workflow.md), after a
reference/analysis track ships, the affected guides must be updated
to reflect new module structure or new conventions. The nagent_review
track (9cc51ca9) produced a deep-dive + 10 actionable takeaways that
named 3 documentation gaps in /docs. This commit fills them.
3 new guides (1,122 lines total):
1. guide_discussions.md (353 lines) — The Discussion system
- 23-operation matrix: A1-A7 per-entry + B1-B11 discussion-level
+ C1-C5 undo/redo
- Take naming convention (<base>_take_<n>), branching, promotion
- User-managed role list (app.disc_roles)
- Per-role filter linked to MMA persona focus
- _disc_entries_lock thread-safety contract
- Hook API session endpoints
- Persistence: _flush_to_project, _flush_disc_entries_to_project,
context_snapshot
- 9 file:line refs into gui_2.py:3770-4260 + history.py
2. guide_state_lifecycle.md (375 lines) — Undo/redo + reset + state
delegation
- HistoryManager + UISnapshot (13 captured fields, 100-snapshot
capacity, debounced change-detection at render frame)
- _handle_reset_session (clears 30+ fields, replaces project,
preserves active_project_path per the 2026-06-08 regression fix)
- App.__getattr__/__setattr__ state delegation to Controller
- 4-thread access pattern with 7 lock-protected regions
- State persistence: in-memory vs project TOML vs config TOML
- Hot-reload integration
- Hook API registries (_predefined_callbacks, _gettable_fields)
- 14 file:line refs into gui_2.py:1140-1170, history.py,
app_controller.py:3286-3356
3. guide_context_aggregation.md (394 lines) — The aggregate.py
pipeline
- 3 aggregation strategies (auto, summarize, full)
- 7 per-file view modes (full, summary, skeleton, outline,
masked, custom, none)
- Full FileItem schema (9 fields + __post_init__ normalizer)
at models.py:510-559
- ContextPreset schema and ContextPresetManager
- Tier 3 worker variant (build_tier3_context with FuzzyAnchor
re-resolution and focus-file handling)
- force_full / auto_aggregate short-circuits
- Cache strategy (static prefix + dynamic history)
- 23 file:line refs into aggregate.py:36-518 + models.py:909-937
8 existing guides cross-linked to the 3 new guides and to the
nagent_review track:
- guide_gui_2.md (+ See Also entries for discussions,
state lifecycle, context aggregation,
nagent_review report)
- guide_app_controller.md (+ See Also entries for discussions,
state lifecycle, context aggregation,
nagent_review report)
- guide_context_curation.md (+ new See Also section pointing to
context aggregation + nagent_review)
- guide_architecture.md (+ new See Also section listing all 10
guides + nagent_review report)
- guide_ai_client.md (+ See Also entries for state lifecycle,
context aggregation, nagent_review
pitfalls #2 and #4)
- guide_mma.md (+ new See Also section pointing to
context aggregation, discussions,
nagent_review report §9 + takeaways §3/§10
for SubConversationRunner priority)
- guide_models.md (+ See Also entries for context
aggregation, discussions, nagent_review
report §6 on FileItem as strongest
curation dimension)
- Readme.md (+ 3 new guide entries in the index
table, with one-line summaries)
No code modified. This is documentation only.
Why these 3 guides specifically:
- guide_discussions.md: The discussion system is the user's most
edited surface. nagent_review's report §3 enumerated 23 operations
(A1-C5) that previously existed only as scattered file:line refs
across gui_2.py. A dedicated guide makes the operation matrix
discoverable.
- guide_state_lifecycle.md: The undo/redo + reset + state delegation
machinery is architecturally load-bearing but scattered across 4
files. After nagent_review identified the provider-side history
divergence as Pitfall #4, the relationship between Manual Slop's
state and the provider's state needs explicit documentation.
- guide_context_aggregation.md: aggregate.py (518 lines) is the
most-touched module after ai_client.py but had no dedicated
guide. nagent_review confirmed it's Manual Slop's strongest
curation dimension. A dedicated guide makes the 7 view modes
and 3 strategies discoverable.
The 3 new guides total 1,122 lines and follow the existing
per-source-file deep-dive style (architectural, data-oriented,
state-management-focused).
Reference/analysis track. Produces 0 code changes.
Artifacts (conductor/tracks/nagent_review_20260608/):
- spec.md (240 lines) - track wrapper with Application/Meta-Tooling framing
- report.md (571 lines) - 14-section deep-dive; primary deliverable
- comparison_table.md (79 lines) - flat side-by-side reference
- decisions.md (286 lines) - 10 future-track candidates with priority matrix
- nagent_takeaways_20260608.md (363 lines) - 10 actionable patterns grounded
in code (file:line refs into nagent source and Manual Slop source)
- metadata.json (132 lines) - structured metadata + verification criteria
- state.toml (113 lines) - per-task tracking + user-corrections log (7 entries)
14 nagent principles covered in report.md (durable work, text-in/text-out,
editable state, visible protocol, the loop, per-file memory, repo history,
neighborhoods, sub-conversations, controlled writes, large files, tool
discovery, framework differences, build your own).
6 pitfalls (revised from 8 after user-corrections):
1. No structured output protocol in Application AI (opaque function calling)
2. Provider-specific history in process globals (ai_client._anthropic_history
+ _deepseek_history + _minimax_history)
3. RAG is not 'history as data' (fuzzy, not auditable)
4. AI client is a stateful singleton (2,685-line ai_client.py)
5. No non-MMA disposable sub-conversations (1:1 gap; user-flagged want)
6. Hard-coded tool discovery (45-tool if/elif in mcp_client.py)
User-corrections applied (3 rounds, 7 total corrections recorded):
- Editable discussions: PARTIAL -> PARITY (DIFFERENT FOCUS) with full A1-A7
per-entry + B1-B11 discussion-level + C1-C5 undo/redo operation matrix
- Per-file memory: DOMAIN MISMATCH -> MANUAL SLOP IS STRONGER IN
CURATION DIMENSION (FileItem + ContextPreset vs nagent's inode-keyed
conversation log; complementary, not equivalent)
- Sub-conversations: MMA has it; 1:1 does not -> 'PARITY for MMA; GAP for
1:1 discussions' (user wants this)
- RAG: opt-in, not gap; user wants pre-staging via sub-conversation
- Personas: config bundling (can opt out via AI settings)
- Tool discovery: deferred (user has 'intent based DSL' idea but 'no where
near that ideation yet')
10 actionable takeaways (separate from the 6 pitfalls - those are
diagnosis, these are prescription):
1. State visibility (UI inspector for in-process state)
2. Readable conversation log (text-greppable, not just JSON-L)
3. Sub-agents for 1:1 (HIGH priority - user-flagged)
4. File-identity over file-path (st_dev:st_ino rename-safe)
5. One loop shape visible in diagnostics
6. Visible retry on protocol failure
7. Meta-Tooling DSL (intent-based, deferred)
8. Self-describing tools (subsumed by mcp_architecture_refactor_20260606)
9. Single source of truth for disc_entries + provider history
10. Sub-agent return type constraint (bake into candidate #1 spec)
Domain classification: every recommendation tagged Application / Meta-Tooling
/ Both per docs/guide_meta_boundary.md. nagent lives in the Meta-Tooling
domain; Manual Slop's Application AI is a different kind of thing.
No code modified by this track (reference/analysis only). All 7 files
parse cleanly (JSON, TOML, Markdown). All internal cross-links resolve.
Track is 'active' awaiting human review; future-track candidates live in
decisions.md and nagent_takeaways_20260608.md.
The 30s wait_for_project_switch timeout was an excessive constraint.
In batch context, prior sims' AI discussion turn workers saturate the
8-worker io_pool, queueing this switch for tens of seconds. The other
defensive waits in the test (warmup 60s, prior switch 60s) already use
60s+, so 30s was the inconsistent outlier.
User confirmed: 'I think not completing in 30s is an excessive constraint
if thats whats going on.'
Verification:
- test_full_live_workflow isolation: 11.69s PASS
- 7-test batch (test_full_live_workflow + 4 extended sims + 2 markdown): 85.83s PASS
Root cause: test_full_live_workflow in batch context (with prior sims
running AI discussion turns) would queue its _do_project_switch behind
the auto-pruner's scan of tests/logs/ (154MB, 6519 files). The 4-worker
pool was saturated, so the switch would never run within 30s.
Fix: bump IO_POOL_MAX_WORKERS from 4 to 8. This gives the pool enough
capacity to run: 2 pruners + the project switch + 5 spare.
Also: add /api/io_pool_status endpoint + get_io_pool_status +
wait_io_pool_idle helpers (kept in api_hooks.py and api_hook_client.py
for the test_api_hook_client_io_pool.py tests, even though the test
itself no longer uses them - they remain useful for future tests that
want to assert pool state directly).
Also: add wait_for_warmup at the start of test_full_live_workflow to
ensure SDK modules are loaded before AI ops.
Test verification:
- test_full_live_workflow in isolation: 11.83s PASS
- test_full_live_workflow in batch (with 4 prior sims): 83.46s PASS
- 30/30 related unit tests PASS
When a prior test in the tier-3-live_gui batch leaves a _do_project_switch
background thread running, the next test's btn_project_new_automated click
sees _project_switch_in_progress=True (from the prior thread) and queues
the new path via _project_switch_pending_path. The queued switch is never
actually submitted to the io_pool, so is_project_stale() stays True and
AI ops (_handle_generate_send) bail with 'project switch in progress;
AI ops disabled'.
Fix: _handle_reset_session now also clears _project_switch_in_progress,
_project_switch_pending_path, and _project_switch_error (under the
existing _project_switch_lock). This way, even if the prior background
thread is still running, the controller reports an idle state and the
new switch can be submitted normally.
Also:
- src/api_hook_client.py: reverted wait_for_project_switch to require
in_progress=False (was relaxed to return on queued path, which misled
the caller into thinking the switch was done)
- tests/test_handle_reset_session_clears_project.py: new test
test_handle_reset_session_clears_project_switch_state asserts
is_project_stale() returns False after reset
- tests/test_api_hook_client_wait_for_project_switch.py: updated
test_wait_for_project_switch_does_not_return_on_queued (in_progress
+ matching path should keep waiting, not return early)
- tests/test_live_workflow.py: added pre-wait for any in-flight switch
before doing btn_reset (so the test waits up to 60s for the prior
switch to complete if needed)
- conductor/todos/TODO_test_full_live_workflow.md: updated Task 4 with
the deeper hang analysis and recommended fix
Known follow-up: test_full_live_workflow still hangs in tier-3 batch
even with this fix, because the new _do_project_switch itself is hung
in the io_pool (likely saturation from prior sims' AI discussion turn
workers). Deeper investigation required.
Following the conductor convention of organizing track-related
artifacts under conductor/. The TODO tracks the test_full_live_workflow
race condition fix and its follow-up items (Tasks 3, 7 still pending;
known batch hang documented).
Tasks 1, 2 (with regression fix), 4, 5, 6 are SHIPPED in prior commits.
Silences the PytestUnknownMarkWarning emitted by test_visual_mma.py and
test_visual_sim_gui_ux.py (3 instances). The @pytest.mark.live mark
already exists in the test files; pyproject.toml just didn't know
about it.
- pyproject.toml: added 'live: marks tests as live visualization tests
(not in CI by default)' to [tool.pytest.ini_options].markers
Replaces the 10x1s blind poll of derived state with a condition-based
wait on /api/project_switch_status. Also adds a defensive file existence
check that fails fast (within 5s) if the click was dropped or the
project creation handler crashed.
The new wait surfaces a clear error message ('Project switch did not
complete in 30s. Last status: ...') instead of the generic 'Project
failed to activate', and exposes _project_switch_error if the controller
reported one.
- tests/test_live_workflow.py: replaced poll loop (lines 57-65) with
wait_for_project_switch + os.path.exists defensive check
Adds a polling helper that blocks until the project switch completes,
errors out, or times out. Replaces the fragile 10x1s blind poll in
test_full_live_workflow with a condition-based wait on the
/api/project_switch_status endpoint.
Features:
- Polls /api/project_switch_status every 200ms (configurable)
- Returns immediately on error (with the error in the result)
- Path matching: exact match OR basename match (handles absolute vs relative)
- Times out with a clear 'timeout' flag instead of a generic assertion
- Optional expected_path: if None, returns on any in_progress=False
- src/api_hook_client.py: new wait_for_project_switch method (37 lines)
- tests/test_api_hook_client_wait_for_project_switch.py: 6 unit tests
with mocked _make_request covering all paths
Task 2 (_handle_reset_session reset) introduced a regression: setting self.active_project_path to empty caused an infinite re-switch loop in _do_project_switch because _flush_to_project writes to active_project_path (raises OSError on empty path), and the finally block re-submitted the failed switch on every iteration. Result: test_context_sim_live saw switching-to status for 5+ seconds and MD-only generation was blocked.
Fix: keep self.active_project_path as-is in _handle_reset_session. Only reset self.project (to a fresh default_project dict) and self.project_paths (to empty list). The stale project state issue is solved by replacing the project dict; the active_project_path stays valid for _flush_to_project.
- src/app_controller.py: refined _handle_reset_session project reset
- tests/test_handle_reset_session_clears_project.py: updated contract test to assert active_project_path is preserved
Stale project state from prior live_gui tests (shared session-scoped
subprocess) was leaking into subsequent tests, causing the
test_full_live_workflow race condition: 'Project not switched' errors
when self.project still claimed to be a different project.
The fix: _handle_reset_session now mirrors the default-project branch
of __init__ (lines 1743-1745), creating a fresh default project dict,
clearing active_project_path and project_paths, and reinitializing
the workspace manager.
- src/app_controller.py: 6 new lines in _handle_reset_session
- tests/test_handle_reset_session_clears_project.py: 3 tests
(active_project_path, project_paths, self.project)
Adds a new endpoint that exposes the project-switch state machine so tests
can poll for completion instead of guessing with timeouts.
- AppController: track _project_switch_error on failure paths
- src/api_hooks.py: GET /api/project_switch_status returns
{in_progress, pending_path, active_path, error}
- src/api_hook_client.py: get_project_switch_status() helper
- tests/test_api_hooks_project_switch.py: 3 unit tests for client + endpoint
shape, 1 live_gui test for the default-idle case