From ba05168493bb1064e2d91f5d6c0bd4c8cae387aa Mon Sep 17 00:00:00 2001 From: conductor-tier2 Date: Mon, 8 Jun 2026 19:26:08 -0400 Subject: [PATCH] docs(refresh): 3 new guides + cross-links from nagent_review MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Per the docs Refresh Protocol (conductor/workflow.md), after a reference/analysis track ships, the affected guides must be updated to reflect new module structure or new conventions. The nagent_review track (9cc51ca9) produced a deep-dive + 10 actionable takeaways that named 3 documentation gaps in /docs. This commit fills them. 3 new guides (1,122 lines total): 1. guide_discussions.md (353 lines) — The Discussion system - 23-operation matrix: A1-A7 per-entry + B1-B11 discussion-level + C1-C5 undo/redo - Take naming convention (_take_), branching, promotion - User-managed role list (app.disc_roles) - Per-role filter linked to MMA persona focus - _disc_entries_lock thread-safety contract - Hook API session endpoints - Persistence: _flush_to_project, _flush_disc_entries_to_project, context_snapshot - 9 file:line refs into gui_2.py:3770-4260 + history.py 2. guide_state_lifecycle.md (375 lines) — Undo/redo + reset + state delegation - HistoryManager + UISnapshot (13 captured fields, 100-snapshot capacity, debounced change-detection at render frame) - _handle_reset_session (clears 30+ fields, replaces project, preserves active_project_path per the 2026-06-08 regression fix) - App.__getattr__/__setattr__ state delegation to Controller - 4-thread access pattern with 7 lock-protected regions - State persistence: in-memory vs project TOML vs config TOML - Hot-reload integration - Hook API registries (_predefined_callbacks, _gettable_fields) - 14 file:line refs into gui_2.py:1140-1170, history.py, app_controller.py:3286-3356 3. guide_context_aggregation.md (394 lines) — The aggregate.py pipeline - 3 aggregation strategies (auto, summarize, full) - 7 per-file view modes (full, summary, skeleton, outline, masked, custom, none) - Full FileItem schema (9 fields + __post_init__ normalizer) at models.py:510-559 - ContextPreset schema and ContextPresetManager - Tier 3 worker variant (build_tier3_context with FuzzyAnchor re-resolution and focus-file handling) - force_full / auto_aggregate short-circuits - Cache strategy (static prefix + dynamic history) - 23 file:line refs into aggregate.py:36-518 + models.py:909-937 8 existing guides cross-linked to the 3 new guides and to the nagent_review track: - guide_gui_2.md (+ See Also entries for discussions, state lifecycle, context aggregation, nagent_review report) - guide_app_controller.md (+ See Also entries for discussions, state lifecycle, context aggregation, nagent_review report) - guide_context_curation.md (+ new See Also section pointing to context aggregation + nagent_review) - guide_architecture.md (+ new See Also section listing all 10 guides + nagent_review report) - guide_ai_client.md (+ See Also entries for state lifecycle, context aggregation, nagent_review pitfalls #2 and #4) - guide_mma.md (+ new See Also section pointing to context aggregation, discussions, nagent_review report §9 + takeaways §3/§10 for SubConversationRunner priority) - guide_models.md (+ See Also entries for context aggregation, discussions, nagent_review report §6 on FileItem as strongest curation dimension) - Readme.md (+ 3 new guide entries in the index table, with one-line summaries) No code modified. This is documentation only. Why these 3 guides specifically: - guide_discussions.md: The discussion system is the user's most edited surface. nagent_review's report §3 enumerated 23 operations (A1-C5) that previously existed only as scattered file:line refs across gui_2.py. A dedicated guide makes the operation matrix discoverable. - guide_state_lifecycle.md: The undo/redo + reset + state delegation machinery is architecturally load-bearing but scattered across 4 files. After nagent_review identified the provider-side history divergence as Pitfall #4, the relationship between Manual Slop's state and the provider's state needs explicit documentation. - guide_context_aggregation.md: aggregate.py (518 lines) is the most-touched module after ai_client.py but had no dedicated guide. nagent_review confirmed it's Manual Slop's strongest curation dimension. A dedicated guide makes the 7 view modes and 3 strategies discoverable. The 3 new guides total 1,122 lines and follow the existing per-source-file deep-dive style (architectural, data-oriented, state-management-focused). --- docs/Readme.md | 3 + docs/guide_ai_client.md | 3 + docs/guide_app_controller.md | 7 +- docs/guide_architecture.md | 15 ++ docs/guide_context_aggregation.md | 394 ++++++++++++++++++++++++++++++ docs/guide_context_curation.md | 13 +- docs/guide_discussions.md | 353 ++++++++++++++++++++++++++ docs/guide_gui_2.md | 6 +- docs/guide_mma.md | 12 + docs/guide_models.md | 3 + docs/guide_state_lifecycle.md | 375 ++++++++++++++++++++++++++++ 11 files changed, 1180 insertions(+), 4 deletions(-) create mode 100644 docs/guide_context_aggregation.md create mode 100644 docs/guide_discussions.md create mode 100644 docs/guide_state_lifecycle.md diff --git a/docs/Readme.md b/docs/Readme.md index fa115e83..fd93e787 100644 --- a/docs/Readme.md +++ b/docs/Readme.md @@ -37,6 +37,9 @@ This documentation suite provides comprehensive technical reference for the Manu | [App Controller](guide_app_controller.md) | `src/app_controller.py` reference: headless orchestrator owning AppState and all subsystem managers (PresetManager, PersonaManager, ContextPresetManager, ToolPresetManager, ToolBiasEngine, RAGEngine, HistoryManager, WorkspaceManager, HookServer, HotReloader, PathManager), `_predefined_callbacks` and `_gettable_fields` registries for Hook API, SyncEventQueue bridge, preset/persona/context coordination, headless mode | | [MMA Engine](guide_multi_agent_conductor.md) | `src/multi_agent_conductor.py` + `src/dag_engine.py` reference: TrackDAG with cycle detection (iterative DFS) and topological sort (Kahn's variant), ExecutionEngine with Auto-Queue / Step Mode state machine, MultiAgentConductor with WorkerPool (configurable concurrency, default 4), mma_exec.py sub-agent invocation for Token Firewall, parse_plan_md utility, Beads mode delegation | | [Data Models](guide_models.md) | `src/models.py` reference: centralized data model registry using pydantic + dataclasses, model categories (Core, AI, Preset, Persona, Context, MMA, UI State, Logging, Hook, Workspace, RAG), `AGENT_TOOL_NAMES` canonical 45-tool list, `PROVIDERS` constant, `parse_plan_md` utility, validation patterns, SDM tags, serialization strategies (TOML, JSON-L) | +| [Discussions](guide_discussions.md) | The Discussion system: 23-operation matrix A1-A7 (per-entry) + B1-B11 (discussion-level) + C1-C5 (undo/redo), Take naming convention (`_take_`), branching at any entry (`project_manager.branch_discussion`), promotion to top-level (`project_manager.promote_take`), user-managed role list (`app.disc_roles`), per-role filter linked to MMA persona focus, `_disc_entries_lock` thread-safety contract, Hook API session endpoints | +| [State Lifecycle](guide_state_lifecycle.md) | Undo/redo via `HistoryManager` + `UISnapshot` (13 captured fields, 100-snapshot capacity, debounced change detection at render frame), reset flow (`_handle_reset_session` — clears 30+ fields, replaces project, preserves `active_project_path` per the 2026-06-08 regression fix), `App.__getattr__`/`__setattr__` state delegation to Controller, 4-thread access pattern with 7 lock-protected regions, hot-reload integration | +| [Context Aggregation](guide_context_aggregation.md) | The `aggregate.py` (518-line) pipeline: 3 aggregation strategies (`auto`/`summarize`/`full`), 7 per-file view modes (`full`/`summary`/`skeleton`/`outline`/`masked`/`custom`/`none`), full `FileItem` schema (9 fields + `__post_init__` normalizer), `ContextPreset` schema and `ContextPresetManager`, Tier 3 worker variant (`build_tier3_context` with FuzzyAnchor re-resolution and focus-file handling), `force_full`/`auto_aggregate` short-circuits, output file numbering, cache strategy (static prefix + dynamic history) | --- diff --git a/docs/guide_ai_client.md b/docs/guide_ai_client.md index f317bd47..0e267451 100644 --- a/docs/guide_ai_client.md +++ b/docs/guide_ai_client.md @@ -421,4 +421,7 @@ Gated by env var (e.g., `RUN_REAL_AI_TESTS=1`). Hits the real API. Not in defaul - **[guide_mma.md](guide_mma.md#tier-3-worker-lifecycle-run_worker_lifecycle)** — How Tier 3 workers use ai_client - **[guide_mcp_client.md](guide_mcp_client.md)** — The 45 tools that ai_client can invoke - **[guide_rag.md](guide_rag.md)** — RAG engine integration via `rag_engine` parameter +- **[guide_state_lifecycle.md](guide_state_lifecycle.md)** — The per-provider history globals (`_anthropic_history`, etc.) are managed here; their locking and reset behavior is documented +- **[guide_context_aggregation.md](guide_context_aggregation.md)** — The `aggregate.py` pipeline that produces the markdown the AI client sends - **[conductor/product.md](../../conductor/product.md#multi-provider-integration)** — Product-level overview of providers +- **[conductor/tracks/nagent_review_20260608/report.md §15 Pitfalls #2 and #4](../../conductor/tracks/nagent_review_20260608/report.md)** — Deep-dive on the per-provider history globals and the stateful singleton pattern; future-track candidate for stateless LLMClient diff --git a/docs/guide_app_controller.md b/docs/guide_app_controller.md index 0295018f..1a7dc080 100644 --- a/docs/guide_app_controller.md +++ b/docs/guide_app_controller.md @@ -1,6 +1,6 @@ # `src/app_controller.py` — Headless Orchestrator & State Hub -[Top](../README.md) | [Architecture](guide_architecture.md) | [MMA](guide_mma.md) | [Testing](guide_testing.md) +[Top](../README.md) | [Architecture](guide_architecture.md) | [Discussions](guide_discussions.md) | [State Lifecycle](guide_state_lifecycle.md) | [Context Aggregation](guide_context_aggregation.md) | [MMA](guide_mma.md) | [Testing](guide_testing.md) --- @@ -437,7 +437,9 @@ def test_apply_persona(live_gui): - **[guide_ai_client.md](guide_ai_client.md)** — How `ai_client` integrates - **[guide_api_hooks.md](guide_api_hooks.md)** — The Hook API the controller exposes - **[guide_hot_reload.md](guide_hot_reload.md)** — How the controller supports state-preserving reloads -- **[guide_history.md](guide_history.md)** — Undo/redo (planned, not yet written) +- **[guide_discussions.md](guide_discussions.md)** — The Discussion system (Takes, branching, `_switch_discussion`, `_branch_discussion`, `_rename_discussion`, `_delete_discussion`, `_flush_disc_entries_to_project`) +- **[guide_state_lifecycle.md](guide_state_lifecycle.md)** — The `_handle_reset_session` and `_handle_compress_discussion` flows, the `App.__getattr__`/`__setattr__` state delegation pattern, and the `HistoryManager` integration +- **[guide_context_aggregation.md](guide_context_aggregation.md)** — The `aggregate.py` pipeline that the controller calls per send (per-provider + Tier 3 worker) - **`src/presets.py`, `src/personas.py`, `src/context_presets.py`, `src/tool_presets.py`, `src/tool_bias.py`** — Subsystem managers - **`src/history.py`** — `HistoryManager` - **`src/rag_engine.py`** — `RAGEngine` @@ -445,3 +447,4 @@ def test_apply_persona(live_gui): - **`src/hot_reload.py`** — `HotReloader` - **`src/api_hooks.py`** — `HookServer` (uses the controller's registries) - **`src/paths.py`** — `PathManager` +- **[conductor/tracks/nagent_review_20260608/report.md](../../conductor/tracks/nagent_review_20260608/report.md)** — Deep-dive analysis of the controller's per-provider history globals and other state patterns diff --git a/docs/guide_architecture.md b/docs/guide_architecture.md index 94659e67..e34b8f74 100644 --- a/docs/guide_architecture.md +++ b/docs/guide_architecture.md @@ -987,3 +987,18 @@ def get_cached_tree(self, path: Optional[str], code: str) -> tree_sitter.Tree: _ast_cache[path] = (mtime, tree) return tree ``` + +--- + +## See Also + +- **[guide_ai_client.md](guide_ai_client.md)** — The multi-provider LLM client whose dispatch the architecture supports +- **[guide_app_controller.md](guide_app_controller.md)** — The headless orchestrator that owns all the AppController-owned state +- **[guide_mma.md](guide_mma.md)** — The 4-tier Multi-Model Architecture +- **[guide_multi_agent_conductor.md](guide_multi_agent_conductor.md)** — The `multi_agent_conductor.py` + `dag_engine.py` runtime +- **[guide_context_aggregation.md](guide_context_aggregation.md)** — The `aggregate.py` pipeline; covers the `build_tier3_context` and `build_markdown_from_items` flows referenced in this guide's "Cache Hit Strategy" +- **[guide_discussions.md](guide_discussions.md)** — The Discussion system; covers the "Discussion Compression" flow documented in this guide +- **[guide_state_lifecycle.md](guide_state_lifecycle.md)** — Undo/redo and the `App.__getattr__`/`__setattr__` state delegation pattern +- **[guide_hot_reload.md](guide_hot_reload.md)** — Hot-reload architecture; the delegation pattern documented here is what makes hot-reload possible +- **[guide_meta_boundary.md](guide_meta_boundary.md)** — The Application vs Meta-Tooling distinction +- **[conductor/tracks/nagent_review_20260608/report.md](../../conductor/tracks/nagent_review_20260608/report.md)** — Deep-dive comparison of Manual Slop's threading model to nagent's single-process loop pattern; includes the data-oriented + thread-disciplined + GUI-decoupled philosophy in §1 and §5 diff --git a/docs/guide_context_aggregation.md b/docs/guide_context_aggregation.md new file mode 100644 index 00000000..afe4aab2 --- /dev/null +++ b/docs/guide_context_aggregation.md @@ -0,0 +1,394 @@ +# Context Aggregation: How Manual Slop Builds the AI's Context + +[Top](../README.md) | [Discussions](guide_discussions.md) | [Context Curation](guide_context_curation.md) | [Models](guide_models.md) | [Architecture](guide_architecture.md) + +--- + +## Overview + +`src/aggregate.py` (518 lines) is the **context composition pipeline** — the single function that turns a project's `files` + `screenshots` + `history` config into the final markdown string the AI sees. It is called by: + +- `src/ai_client.py:_send_anthropic`, `_send_deepseek`, `_send_gemini`, `_send_gemini_cli`, `_send_minimax` (every provider) +- `src/app_controller.py:AppController._do_generate` (the main send path) +- `src/app_controller.py:AppController._cb_start_track`, `AppController._process_event_queue`, `AppController._start_track_logic` (MMA paths) +- `src/gui_2.py:App.run`, `App.main`, `App._render_snapshot_tab` (the GUI and the prior-session replay) +- `simulation/sim_base.py:run_sim` and 6 other simulation entry points + +This is one of the most-touched modules in the project. After the nagent_review, this pipeline is recognized as **Manual Slop's strongest curation dimension** (vs nagent's conversation-log dimension). See `conductor/tracks/nagent_review_20260608/report.md §6` and `decisions.md` candidate #7 for the related future-track. + +> **Domain classification.** The pipeline is **Application**-domain. The MMA sub-agents consume it but the pipeline itself does not call into Meta-Tooling code. See `guide_meta_boundary.md`. + +--- + +## The Pipeline At A Glance + +``` +aggregate.run(config, aggregation_strategy) + ├─ find_next_increment(output_dir, namespace) # next file number for output + ├─ build_file_items(base_dir, files) # read + view-mode transform + ├─ build_markdown_from_items(file_items, ...) # compose sections + │ ├─ ## Files (or Files (Summary) or Files (Tier 3 - Focused)) + │ │ └─ _build_files_section_from_items OR summarize.build_summary_markdown + │ ├─ ## Screenshots (if any) + │ ├─ ## Beads Mode: Progress Track (if execution_mode == "beads") + │ └─ ## Discussion History (if any) + └─ output_file.write_text(markdown) +``` + +The **output** is a markdown file at `{output_dir}/{namespace}_{NNN}.md` where `NNN` is a zero-padded increment. The pipeline does not *send* the markdown — that's the AI client's job. The pipeline *produces* the markdown. + +The **return value** is `(markdown: str, output_file: Path, file_items: list[dict])`. The file_items list is reused by callers that want to inspect the read state without re-reading from disk. + +--- + +## The Three Aggregation Strategies + +`aggregation_strategy: str` selects how files are rendered. The values: + +| Strategy | File rendering | History rendering | Tier 3 handling | Use case | +|---|---|---|---|---| +| `auto` | If `summary_only` is True → summary; else → full | Standard | Standard | Default. Reads `config.project.summary_only`. | +| `summarize` | Always `summarize.build_summary_markdown(file_items)` (compact multi-file view) | Standard | Standard | Token-budget-constrained runs. | +| `full` | Always `_build_files_section_from_items(file_items)` (full content) | Standard | Standard | Debugging; when you want the AI to see everything. | + +**Implementation:** `aggregate.py:330-346 build_markdown_from_items`. The three-way dispatch is at lines 335-339: + +```python +if aggregation_strategy == "summarize": parts.append("## Files (Summary)\n\n" + summarize.build_summary_markdown(file_items)) +elif aggregation_strategy == "full": parts.append("## Files\n\n" + _build_files_section_from_items(file_items)) +else: # auto + if summary_only: parts.append("## Files (Summary)\n\n" + summarize.build_summary_markdown(file_items)) + else: parts.append("## Files\n\n" + _build_files_section_from_items(file_items)) +``` + +The `auto` strategy is the *only* one that respects `config.project.summary_only`; the other two are explicit overrides. Personas can also set `aggregation_strategy` (per `guide_personas.md`), and a persona-set strategy overrides the config-level setting. + +--- + +## View Modes — The Per-File Transform + +`view_mode: str` is the per-file content transform. The value is set on the `FileItem` (or the legacy dict-shaped config entry) and determines how the file's bytes are rendered into the markdown. + +| View mode | Behavior | Source | +|---|---|---| +| `full` | Raw `path.read_text(encoding="utf-8")` content. | `aggregate.py:205` | +| `summary` | `summarize.summarise_file(path, content)` — heuristic summary from `src/summarize.py`. | `aggregate.py:210` | +| `skeleton` | For `.py`: `ASTParser("python").get_skeleton(content)` (tree-sitter). For `.c`/`.h`: `mcp_client.ts_c_get_skeleton`. For `.cpp`/`.hpp`: `mcp_client.ts_cpp_get_skeleton`. Other → summary. | `aggregate.py:211-220` | +| `outline` | For `.py`: `ASTParser("python").get_code_outline(content)`. For C/C++: `mcp_client.ts_c*_get_code_outline`. Other → summary. | `aggregate.py:221-230` | +| `masked` | For each `{symbol: mode}` in `ast_mask`, fetch `def` or `sig` via `mcp_client.py/ts_*_get_definition/signature`. Concatenate. | `aggregate.py:231-249` | +| `none` | Literal string `"(context excluded)"` — the file is in the file_items list but contributes no content. | `aggregate.py:250` | +| `custom` | Render only the `custom_slices` from the FileItem. Each slice is a `{start_line, end_line, tag, comment}` dict. Lines outside the slices are excluded. | `aggregate.py:251-266` | + +**The default view mode** is `full`. The persona can override via `Persona.aggregation_strategy`; the FileItem can override via `FileItem.view_mode` or `FileItem.force_full` (which forces `full` regardless of the FileItem's own setting). + +**Errors are graceful.** A `FileNotFoundError` produces `f"ERROR: file not found: {path}"` content with `error: True` and `mtime: 0.0`. A `view_mode` that throws produces `f"ERROR in {view_mode} view mode for {path}:\n{traceback.format_exc()}"`. Errors do not halt the pipeline. + +--- + +## The FileItem Schema (Full) + +`src/models.py:510-559 FileItem` is the **per-file curation memory** that nagent_review identified as Manual Slop's strongest dimension. The dataclass has 9 mutable fields + a `__post_init__` normalizer: + +```python +@dataclass +class FileItem: + path: str # the artifact identity (path-keyed, no inode) + auto_aggregate: bool = True # include in auto-aggregation? (skip in build_*_from_items if False) + force_full: bool = False # bypass view_mode; force raw content + view_mode: str = 'full' # one of: full, summary, skeleton, outline, masked, custom, none + selected: bool = False # for batch operations (the Context Panel multi-select) + ast_signatures: bool = False # include only signatures (skeleton-equivalent shortcut) + ast_definitions: bool = False # include only definitions (skeleton-equivalent shortcut) + ast_mask: dict[str, str] # per-symbol mask: {symbol_path: 'def'|'sig'|'hide'} (from Structural File Editor) + custom_slices: list[dict] # Fuzzy Anchor slices: {start_line, end_line, tag, comment, ...} + injected_at: Optional[float] # timestamp of last injection +``` + +The 9 fields are *all* serialized by `to_dict()` and *all* deserialized by `from_dict()` (with `.get(..., default)` for forward compatibility). The dataclass is round-trip-safe through TOML. + +`__post_init__` normalizes `custom_slices`: each slice dict gets `tag=None` and `comment=None` defaults added so downstream code can `.get("tag")` safely. + +### The Custom Slice Schema + +A `custom_slices` entry is `{start_line, end_line, tag, comment, ...}` (plus Fuzzy Anchor metadata). The full schema is in `src/fuzzy_anchor.py:FuzzyAnchor.create_slice`: + +```python +{ + "start_line": int, # 1-based original line + "end_line": int, # 1-based original line (inclusive) + "tag": str|None, # human label, defaults to None + "comment": str|None, # human comment, defaults to None + "content_hash": str, # SHA-256 of the slice content (for Fuzzy Anchor stability) + "anchor_lines": [str, ...],# surrounding context for re-resolution + # plus the original positioning metadata +} +``` + +When `view_mode == 'custom'`, the `aggregate.py:251-264` block renders each slice as: + +```markdown +--- +[Slice: ] () +Lines -: + +``` + +Multiple slices in a file are joined with `\n\n`. + +--- + +## The ContextPreset Schema + +`src/models.py:909-937 ContextPreset` is a *named, persisted set* of `FileItem`s — a reusable "context composition": + +```python +@dataclass +class ContextPreset: + name: str # the preset name (used as TOML key) + files: list[ContextFileEntry] = field(default_factory=list) + screenshots: list[str] = field(default_factory=list) + description: str = "" +``` + +`ContextFileEntry` is a `FileItem` (or a string path that's promoted to a `FileItem` on load). The `description` is a human-readable label for the preset list. + +`ContextPresetManager` (in `src/context_presets.py`, 30 lines) handles CRUD: +- `save_preset(preset: ContextPreset)` writes to `manual_slop.toml` or a project TOML +- `load_all() -> dict[str, ContextPreset]` reads all presets +- `delete_preset(name: str)` removes a preset +- `apply_preset(name: str)` switches the active context composition to the named preset + +`reload_context_presets()` (in `app_controller.py`) is called when the project TOML changes; it validates that all files in the preset still exist and warns the user about any that don't. + +**Scope:** ContextPresets can be **Global** (in `/manual_slop.toml`) or **Project-specific** (in the project's `manual_slop.toml`). Project presets override global presets of the same name. This is the same scope-inheritance pattern as Personas, Presets, and Workspace Profiles. + +--- + +## The Discussion History Section + +`aggregate.py:109 build_discussion_section(history)` is the section that includes the prior conversation: + +```python +def build_discussion_section(history: list[Any]) -> str: + sections = [] + for i, entry in enumerate(history, start=1): + if isinstance(entry, dict): + role = entry.get("role", "Unknown") + content = entry.get("content", "").strip() + text = f"{role}: {content}" + else: + text = str(entry).strip() + sections.append(f"### Discussion Excerpt {i}\n\n{text}") + return "\n\n---\n\n".join(sections) +``` + +The section handles *both* legacy `list[str]` (e.g. `["User: ...", "AI: ..."]`) and the new `list[dict]` shape (`[{"role": ..., "content": ...}, ...]`). The dict shape is what's persisted by `_flush_disc_entries_to_project` (per `app_controller.py:3225-3240`) and what's stored in the new format. + +The section is named **`## Discussion History`** and is placed at the *end* of the markdown (after files, screenshots, beads). This is deliberate: the cache-hit-friendly static prefix is at the top, the dynamic history is at the bottom. See `guide_architecture.md §"Cache Strategy"`. + +--- + +## Cache Strategy + +The pipeline is structured to maximize provider cache hits. The static prefix (Files + Screenshots + Beads) is the same across all turns of a discussion; only the Discussion History changes. The provider's cache key is the prefix; the history is appended. + +`build_markdown_no_history` (`aggregate.py:348-353`) is the explicit "static-only" builder used by `_do_generate` *before* adding the history. The full builder is `build_markdown_from_items` which adds the history if non-empty. This split allows the AI client to: + +1. Send the static prefix once. +2. Append the history to the next send without re-sending the prefix. +3. Re-use the cached prefix on the third send (if the files haven't changed). + +The cache strategy is documented in detail in `guide_ai_client.md §"Caching Strategy"` and `guide_architecture.md §"Cache Hit Strategy"`. + +--- + +## The Tier-3 Variant + +`aggregate.py:364-454 build_tier3_context` is the **MMA worker context** — a different layout for sub-agent invocations. The differences from the standard pipeline: + +1. **Focus files** (passed as `focus_files: list[str]`) are rendered as **full content** regardless of their `view_mode`. A file is a focus file if its `entry`, name, or path matches one of the focus paths. +2. **Slices are resolved via FuzzyAnchor.** If a file has `custom_slices` and the file content has been modified since the slice was created, the FuzzyAnchor re-resolves the line ranges. This is critical for sub-agents receiving slices that may be stale. +3. **Section header is `## Files (Tier 3 - Focused)`.** Distinct from the standard `## Files` so the worker (and its tools) can recognize its own context. +4. **The `is_focus` check is multi-level.** Entry match, name match, path match, and substring match. Sub-agents with looser file-matching needs can pass a focus set that's just a list of basenames. + +The Tier 3 build skips the `summarize.build_summary_markdown` path entirely; every file is rendered with `_build_files_section_from_items`-style formatting (or the AST skeleton for non-focus Python files, or the AST signature/outline for C/C++). + +The Tier 3 build is called from `multi_agent_conductor.py:run_worker_lifecycle` via `aggregate.run(config, aggregation_strategy=tier_strategy)`. + +--- + +## The Bypass — `force_full` + +`FileItem.force_full = True` short-circuits the `view_mode` selection: + +```python +if force_full: view_mode = "full" +``` + +This is set at the `FileItem` level (not the strategy level). Use case: the user has set a global "skeleton" view mode for the project but wants one specific file to always be inlined in full. The force is per-file and overrides both the FileItem's own `view_mode` and any strategy-level override. + +For Tier 3, `force_full` is treated as a *focus flag*: + +```python +if is_focus or tier == 3 or force_full: + # full content, no skeleton +``` + +So a `force_full=True` file in a Tier 3 worker context is treated as a focus file and rendered in full. + +--- + +## Auto-Aggregate Skip + +`FileItem.auto_aggregate = False` causes the file to be *included in the file_items list* but *excluded from the rendered markdown*: + +```python +for item in file_items: + if not item.get("auto_aggregate", True): continue + # ... build section +``` + +Use case: the file is in the `files` list for the AI's *awareness* (e.g. "you can read it via `read_file`") but should not be inlined. The file's `mtime` and `view_mode` are still tracked; the file is *omitted* from the rendered markdown. + +This is distinct from `view_mode == "none"`: +- `auto_aggregate = False` → file is not in the rendered markdown at all (no `### File` header) +- `view_mode = "none"` → file is in the rendered markdown as `### File (excluded)` with a `"(context excluded)"` body + +The two are useful for different scenarios. `auto_aggregate = False` is for "the AI knows the file exists, can read it on demand." `view_mode = "none"` is for "the AI knows we deliberately excluded this content." + +--- + +## Screenshots + +`aggregate.py:126-140 build_screenshots_section` renders the screenshots list as a `## Screenshots` markdown section. Each screenshot is rendered as `![name](path)` (markdown image syntax). Path resolution uses `resolve_paths` (same as for files), so wildcards and absolute paths work. + +**Screenshots are placed *after* Files and *before* Beads and Discussion History.** This is a deliberate ordering: the AI sees the project's files first (the static content), then the screenshots (the visual context), then the beads status (if applicable), then the discussion history (the dynamic content). + +--- + +## Beads Mode + +When `execution_mode == "beads"` (set in `config.project.execution_mode`), the pipeline appends a `## Beads Mode: Progress Track` section between Screenshots and Discussion History. The section is built by `aggregate.py:309-328 build_beads_section`: + +- Lists all *completed* beads as a comma-separated list +- Lists all *active* beads as bullet points with title, id, and description + +`build_beads_section` returns an empty string if the project is not a Beads project (`client.is_initialized()` is False) or if there are no beads. The caller (`build_markdown_from_items`) checks the truthiness before appending. + +See `guide_beads.md` for the full Beads integration. + +--- + +## Output File Numbering + +`find_next_increment(output_dir, namespace)` (`aggregate.py:36-44`) scans `output_dir` for files matching `^{namespace}_(\d+)\.md$` and returns `max_num + 1`. The output filename is `{namespace}_{NNN:03d}.md` (zero-padded to 3 digits). The increment starts at 1 and grows monotonically. + +The increment is the *artifact identity* for the conversation. Each turn produces a new file. The current implementation does *not* delete old files; the `LogPruner` (per `guide_architecture.md`) handles cleanup separately. + +--- + +## Pipeline Callers + +`aggregate.run` is called from many places. The most important: + +| Caller | Purpose | +|---|---| +| `src/ai_client.py:_send_anthropic` | Build the markdown for an Anthropic send. | +| `src/ai_client.py:_send_gemini` | Build the markdown for a Gemini send. | +| `src/ai_client.py:_send_deepseek` | Build the markdown for a DeepSeek send. | +| `src/ai_client.py:_send_gemini_cli` | Build the markdown for a Gemini CLI send. | +| `src/ai_client.py:_send_minimax` | Build the markdown for a MiniMax send. | +| `src/app_controller.py:AppController._do_generate` | The main 1:1 send path. | +| `src/app_controller.py:AppController._cb_start_track` | Start a new MMA track. | +| `src/app_controller.py:AppController._process_event_queue` | Process a queued event (e.g. send, switch discussion). | +| `src/multi_agent_conductor.py:run_worker_lifecycle` | Spawn a Tier 3 worker (with Tier 3 context). | +| `src/gui_2.py:App.run` | The main GUI loop. | +| `src/gui_2.py:App._render_snapshot_tab` | Render a prior-session replay snapshot. | +| `simulation/sim_base.py:run_sim` | Run a simulation. | + +The aggregation strategy is set per-call: +- The main `_do_generate` uses `config.project.aggregation_strategy` (which is the persona-set strategy if a persona is active). +- MMA worker contexts use the worker's `aggregation_strategy` from the ticket config. +- The simulation uses a fixed `auto`. + +--- + +## Public API Surface + +The public API of `aggregate.py` is: + +| Function | Signature | Purpose | +|---|---|---| +| `find_next_increment` | `(output_dir: Path, namespace: str) -> int` | Next file number for output. | +| `resolve_paths` | `(base_dir: Path, entry: str) -> list[Path]` | Expand globs and absolute paths. Blacklist `history.toml` and `*_history.toml`. | +| `group_files_by_dir` | `(files: list[Any]) -> dict[str, list[Any]]` | Group FileItems by relative directory path (used by the Context Panel UI). | +| `compute_file_stats` | `(abs_path: str) -> dict[str, int]` | Line count + AST element count for Python files. | +| `build_file_items` | `(base_dir, files) -> list[dict]` | Read + view-mode transform per file. The most-called function. | +| `build_discussion_section` | `(history) -> str` | Render the `## Discussion History` markdown. | +| `build_screenshots_section` | `(base_dir, screenshots) -> str` | Render the `## Screenshots` markdown. | +| `build_beads_section` | `(base_dir) -> str` | Render the `## Beads Mode: Progress Track` markdown. | +| `build_markdown_from_items` | `(file_items, screenshot_base_dir, screenshots, history, summary_only, aggregation_strategy, execution_mode, base_dir) -> str` | Compose all sections. The "compose" function. | +| `build_markdown_no_history` | `(file_items, screenshot_base_dir, screenshots, summary_only, aggregation_strategy) -> str` | Compose without history (for stable caching). | +| `build_discussion_text` | `(history) -> str` | Just the history section, for callers that want to append to a pre-built static prefix. | +| `build_tier3_context` | `(file_items, screenshot_base_dir, screenshots, history, focus_files) -> str` | Tier 3 worker context. | +| `build_markdown` | `(base_dir, files, screenshot_base_dir, screenshots, history, summary_only, execution_mode) -> str` | Convenience: read files + compose. | +| `run` | `(config, aggregation_strategy) -> tuple[str, Path, list[dict]]` | The full pipeline. | +| `main` | `() -> None` | CLI entry point. Loads config, calls `run`, prints output path. | + +**Performance:** the entire pipeline is O(N) in the number of files, with the per-file AST work being the most expensive step. `build_tier3_context` includes `with get_monitor().scope("build_tier3_context")` (and similar for `build_file_items` and `build_markdown_no_history`) for performance monitoring. The monitor is documented in `guide_architecture.md §"Performance"`. + +--- + +## Performance Considerations + +The `view_mode` selection has a meaningful performance impact: + +| view_mode | Per-file cost | When to use | +|---|---|---| +| `full` | 1 file read + string concat | Small files, files the user is actively editing. | +| `summary` | 1 file read + 1 heuristic call to `summarize.summarise_file` | Large files where structural info is enough. | +| `skeleton` | 1 file read + 1 tree-sitter parse + skeleton build | Python/C/C++ files where the structure matters more than the content. | +| `outline` | 1 file read + 1 tree-sitter parse + outline build | When the AI only needs the public API surface. | +| `masked` | 1 file read + N `mcp_client.py/ts_*_get_*` calls (one per masked symbol) | When the user has explicitly marked symbols as "def" or "sig". | +| `none` | 1 file read (still reads the bytes, just discards) | When the user wants the file in the list but not in the rendered markdown. | +| `custom` | 1 file read + line slicing per slice | When the user has explicitly created Fuzzy Anchor slices. | + +The `force_full = True` and `auto_aggregate = False` flags skip *some* of the work: +- `force_full = True` skips the view-mode dispatch and goes straight to raw content. +- `auto_aggregate = False` skips the view-mode dispatch entirely and skips the markdown section build. + +For very large codebases (1000+ files), the bottleneck is the tree-sitter parsing for `skeleton` / `outline` / `masked` modes. The Tier 3 builder uses `ASTParser("python")` lazily (`if not parser: parser = ASTParser("python")`) so the tree-sitter grammar is loaded only once per pipeline call. + +--- + +## Tests + +- `tests/test_aggregate_flags.py` — `test_auto_aggregate_skip`, `test_force_full`, `test_view_mode_full`, `test_view_mode_summary`, `test_view_mode_skeleton`, `test_view_mode_outline`, `test_view_mode_none`, `test_view_mode_custom`, `test_view_mode_masked` +- `tests/test_aggregate_beads.py` — `test_build_beads_compaction` +- `tests/test_context_composition_phase3.py` — `test_group_files_by_dir`, `test_compute_file_stats` +- `tests/test_context_composition_phase6.py` — `test_view_mode_default_summary`, `test_view_mode_full`, `test_view_mode_none`, `test_view_mode_outline`, `test_view_mode_skeleton`, `test_view_mode_summary`, `test_view_mode_custom`, `test_view_mode_custom_empty_default_to_summary`, `test_files_section_rendering` +- `tests/test_tiered_context.py` — `test_build_tier3_context_exists`, `test_build_tier3_context_ast_skeleton`, `test_build_tier3_context_scaling`, `test_tiered_context_by_tier_field`, `test_build_file_items_with_tiers`, `test_build_files_section_with_dicts` +- `tests/test_ast_masking_core.py` — `test_ast_masking_gencpp_samples` +- `tests/test_gencpp_full_suite.py` — `test_gencpp_full_suite` +- `tests/test_perf_aggregate.py` — `test_build_tier3_context_scaling` +- `tests/test_history_management.py` — `test_aggregate_blacklist`, `test_aggregate_includes_segregated_history`, `test_aggregate_respects_*` +- `tests/test_ui_summary_only_removal.py` — `test_aggregate_from_items_respects_auto_aggregate` +- `tests/test_aggregate_helpers.py` — `test_resolve_paths_blacklist`, `test_resolve_paths_glob`, `test_resolve_paths_absolute` +- `tests/test_aggregate_perf.py` — `test_find_next_increment_*` + +--- + +## Cross-References + +- **The pipeline source:** `src/aggregate.py` (518 lines) +- **FileItem schema:** `src/models.py:510-559 FileItem` +- **ContextPreset schema:** `src/models.py:909-937 ContextPreset` +- **ContextPresetManager:** `src/context_presets.py` (30 lines) +- **AI client consumption:** `src/ai_client.py:_send_` × 5, see `guide_ai_client.md` +- **Tier 3 worker consumption:** `src/multi_agent_conductor.py:run_worker_lifecycle`, see `guide_multi_agent_conductor.md` +- **Per-file curation features:** `guide_context_curation.md` (Fuzzy Anchors, AST Inspector, Granular AST Control) +- **Cache strategy:** `guide_architecture.md §"Cache Hit Strategy"`, `guide_ai_client.md §"Caching"` +- **Discussion section builder:** `guide_discussions.md §"Persistence"`, `src/aggregate.py:109 build_discussion_section` +- **Deep-dive on the design philosophy:** `conductor/tracks/nagent_review_20260608/report.md §6` (per-file memory) +- **Actionable patterns for richer per-file memory:** `conductor/tracks/nagent_review_20260608/nagent_takeaways_20260608.md §4` (file_id), §6 (git history), §7 (Meta-Tooling DSL) +- **Future-track candidate for per-file conversation log:** `conductor/tracks/nagent_review_20260608/decisions.md` candidate #7 diff --git a/docs/guide_context_curation.md b/docs/guide_context_curation.md index b01dc16d..9baf597a 100644 --- a/docs/guide_context_curation.md +++ b/docs/guide_context_curation.md @@ -1,6 +1,6 @@ # Advanced Context Curation -[Top](../README.md) | [Architecture](guide_architecture.md) | [Tools & IPC](guide_tools.md) | [MMA](guide_mma.md) | [Simulations](guide_simulations.md) +[Top](../README.md) | [Context Aggregation](guide_context_aggregation.md) | [Architecture](guide_architecture.md) | [Tools & IPC](guide_tools.md) | [MMA](guide_mma.md) | [Simulations](guide_simulations.md) --- @@ -301,3 +301,14 @@ The unified editor preserves the behavior of both predecessors: - The "Apply" action writes the modified `ast_mask` and slice list to the file item in a single transaction. This is a UX consolidation, not a data model change. The underlying `ast_mask: dict[str, str]` and slice list structures are unchanged. + +--- + +## See Also + +- **[guide_context_aggregation.md](guide_context_aggregation.md)** — The full `aggregate.py` pipeline that consumes the FileItem schema documented here. Includes the 7 `view_mode` values (`full`, `summary`, `skeleton`, `outline`, `masked`, `none`, `custom`) and the 3 `aggregation_strategy` values (`auto`, `summarize`, `full`) +- **[guide_context_presets.md](guide_context_presets.md)** *(planned)* — The `ContextPreset` schema (named, persisted set of FileItems) +- **[guide_models.md](guide_models.md)** — Full FileItem and ContextPreset dataclass definitions at `src/models.py:510` and `src/models.py:909` +- **[guide_architecture.md](guide_architecture.md)** — How the FileItem list is built up in `App.init_state` and how the aggregation pipeline consumes it +- **[conductor/tracks/nagent_review_20260608/report.md §6](../../conductor/tracks/nagent_review_20260608/report.md)** — Deep-dive on per-file memory; compares Manual Slop's curation dimension (this guide) to nagent's conversation-log dimension +- **[conductor/tracks/nagent_review_20260608/nagent_takeaways_20260608.md §4](../../conductor/tracks/nagent_review_20260608/nagent_takeaways_20260608.md)** — Actionable: add a `file_id: st_dev:st_ino` field to FileItem for rename-safe identity diff --git a/docs/guide_discussions.md b/docs/guide_discussions.md new file mode 100644 index 00000000..52e6b53d --- /dev/null +++ b/docs/guide_discussions.md @@ -0,0 +1,353 @@ +# Discussions: Takes, Branching, and Per-Entry Editing + +[Top](../README.md) | [App Controller](guide_app_controller.md) | [GUI Main](guide_gui_2.md) | [Models](guide_models.md) + +--- + +## Overview + +A **Discussion** is Manual Slop's first-class unit of conversation. Every prompt the user types, every AI response, every tool result, every per-entry edit lives in a Discussion. Discussions are persisted to the project's TOML as a typed list of entries; they can be branched into multiple **Takes**, switched between, renamed, deleted, and (most importantly) **edited at the entry level** by the user in the GUI. + +The discussion system is one of the *most-edited* surfaces in Manual Slop. The user can: + +- **Edit any entry's text** in place (full multi-line edit, not just inline) +- **Insert new entries** at any position +- **Delete any entry** by position +- **Change the role** of any entry +- **Branch** at any entry to create a new Take +- **Undo/redo** every edit (Ctrl+Z / Ctrl+Y) +- **Promote** a Take to a top-level discussion + +This is a *deliberate* design choice. Manual Slop treats the discussion as user-editable working state, not as opaque chat history. The full operation matrix and the rationale are in `conductor/tracks/nagent_review_20260608/report.md §3`; this guide covers the *implementation*. + +> **Domain classification.** The discussion system is purely **Application**-domain. It owns no Meta-Tooling concerns; it does not call into `scripts/mma_exec.py`; it is consumed by the GUI and the headless controller, and projected to the AI client. See `guide_meta_boundary.md` for the Application vs Meta-Tooling split. + +--- + +## Data Model + +### The Entry Dict + +The smallest unit of a discussion is the **entry**, a `dict[str, Any]` with this shape (`src/models.py:parse_history_entries` builds it; `src/gui_2.py:render_discussion_entry` reads it): + +| Field | Type | Source | Purpose | +|---|---|---|---| +| `role` | `str` | `parse_history_entries` | The speaker. Defaults to one of `["User", "AI", "Vendor API", "System"]` (set in `models.py:208`), but `disc_roles` is user-editable so this can be any string. | +| `content` | `str` | user input / LLM response | The entry's text. Fully editable in the GUI. | +| `collapsed` | `bool` | GUI render state | Whether the entry is collapsed to a 60-char preview. Defaults `True`. | +| `ts` | `str` | `project_manager.now_ts()` | ISO timestamp, prefixed with `@` in the persisted form. | +| `thinking_segments` | `list[dict]` | `src/thinking_parser.py` | AI entries with `` blocks have the blocks parsed out into collapsible segments. | +| `usage` | `dict` | `ai_client.send()` | Token accounting: `{"input_tokens": N, "output_tokens": N, "cache_read_input_tokens": N}`. | +| `read_mode` | `bool` | GUI render state | If `True`, render as Markdown; if `False` (default), render as editable text input. | + +An entry dict is *open*: extra keys are allowed and ignored by the renderer. This is intentional — the user can add custom metadata via the Hook API or by editing the project TOML directly. + +### The Discussion Dict + +A **Discussion** is a `dict[str, Any]` under `project.discussion.discussions[]`: + +```python +{ + "history": [str, ...] # legacy: list of "Role: content" strings + # OR + # list of entry dicts (new format) + "git_commit": str, # git SHA at the time the discussion was last updated + "last_updated": str, # ISO timestamp + "context_snapshot": [dict, ...], # list of FileItem.to_dict() at send time + "sent_markdown": str, # the actual markdown sent to the AI on the last send + "sent_system_prompt": str, # the system prompt that was active at send time +} +``` + +The `project_manager.default_discussion()` factory returns a fresh dict with empty `history` and the standard keys. `app_controller._switch_discussion` reads the dict, parses `history` via `models.parse_history_entries(history_strings, self.disc_roles)`, and writes the live `disc_entries` list. + +### The Take Naming Convention + +Takes are encoded in the discussion name. A Take's name has the shape `_take_`. Example: a discussion named `refactor_auth` can have takes `refactor_auth_take_1`, `refactor_auth_take_2`, etc. The `_get_discussion_names` accessor groups by base name (`name.split("_take_")[0]`) so the GUI can render them as nested tabs. + +The `_branch_discussion(index)` method (in `app_controller.py:3503`) generates a unique Take name by incrementing `_take_` until it finds an unused name, then calls `project_manager.branch_discussion(self.project, self.active_discussion, new_name, index)`. + +--- + +## Per-Entry Operations (the A1-A7 matrix) + +This is the operation set the user has *per individual entry*. Renderer: `src/gui_2.py:3770 render_discussion_entry(app, entry, index)`. + +| # | Operation | GUI control | Source code | What it does | +|---|---|---|---|---| +| A1 | **Edit content in place** | `imgui.input_text_multiline` on the entry body | `gui_2.py:3841` | `entry["content"]` is a fully editable multi-line text input. The user can rewrite an AI's response, fix a typo in their own prompt, paste in code from another source, etc. | +| A2 | **Toggle read/edit mode** | `[Edit]` / `[Read]` button | `gui_2.py:3799` | When in `[Read]` mode, the content is rendered as Markdown with syntax highlighting (`render_discussion_entry_read_mode` at `gui_2.py:3855`). When in `[Edit]` mode, the multi-line text input is shown. | +| A3 | **Toggle collapsed/expanded** | `+/-` button per entry | `gui_2.py:3789` | Collapsed entries show a 60-char preview (line 3822-3824). Expanded entries show full content. | +| A4 | **Change role** | Combo box from `app.disc_roles` | `gui_2.py:3793-3796` | The entry's `role` field is editable. The list `app.disc_roles` is itself user-managed (see §"Role Management" below). | +| A5 | **Insert entry before this one** | `Ins` button | `gui_2.py:3813` | `app.disc_entries.insert(index, {"role": "User", "content": "", "collapsed": True, "ts": project_manager.now_ts()})` | +| A6 | **Delete this entry** | `Del` button | `gui_2.py:3815-3816` | `if entry in app.disc_entries: app.disc_entries.remove(entry)`. The membership check matters — ImGui can re-render stale state, so the check guards against double-delete. | +| A7 | **Branch at this entry** | `Branch` button | `gui_2.py:3821` → `app._branch_discussion(index)` → `app_controller._branch_discussion:3503` → `project_manager.branch_discussion:429` | Creates a new Take named `_take_` and copies the history up to and including `index` into the new Take. The user is then switched to the new Take. | + +**Why this matrix is load-bearing.** Every entry is independently editable. There is no "edit the whole discussion as one operation." This is the design difference vs. most chat UIs: when an AI's response is wrong, the user can *fix the response text* without losing the entry's role, timestamp, usage accounting, or thinking segments. The AI on the *next* turn sees the corrected response (because the entry's `content` is the source for `build_discussion_section` in `aggregate.py:109`). + +--- + +## Discussion-Level Operations (the B1-B11 matrix) + +These are the second-tier controls, rendered at `src/gui_2.py:4239 render_discussion_entry_controls(...)` and the discussion selector at `gui_2.py:4330 render_discussion_selector(...)`. + +| # | Operation | GUI control | Source code | What it does | +|---|---|---|---|---| +| B1 | **Append new entry** | `+ Entry` button | `gui_2.py:4240` | `app.disc_entries.append({...})` with the default role from `app.disc_roles[0]`. | +| B2 | **Collapse all / Expand all** | `-All` / `+All` buttons | `gui_2.py:4242-4246` | Bulk-set `collapsed` flag on every entry. | +| B3 | **Clear all** | `Clear All` button | `gui_2.py:4248` | `app.disc_entries.clear()`. Note: this clears the *current* take, not all takes. | +| B4 | **Save (flush to project TOML)** | `Save` button | `gui_2.py:4250` | `app._flush_to_project(); app._flush_to_config(); app.save_config()`. | +| B5 | **Add/remove roles** | `Add` / `X` buttons under "Roles" | `gui_2.py:4317-4328` | `app.disc_roles.append(r)` / `app.disc_roles.pop(i)`. | +| B6 | **Switch active discussion** | Discussion combo + Take tabs | `gui_2.py:4197, 4344, 4354` | `app._switch_discussion(name)`. Takes group by base name and render as nested tabs. | +| B7 | **Rename / Delete discussion** | `Rename` / `Delete` buttons | `gui_2.py:4291, 4293` | `app._rename_discussion(...)` / `app._delete_discussion(...)`. Cannot delete the last discussion (guarded at `app_controller.py:3543`). | +| B8 | **Promote Take to top-level** | `Promote` button in takes panel | `gui_2.py:4364` | `project_manager.promote_take(app.project, app.active_discussion, new_name)` — renames a Take (e.g. `T0_take_2`) to a fresh top-level discussion name. | +| B9 | **Per-role filter** | `ui_focus_agent` selector (system-wide) | `gui_2.py:4230-4234` | `display_entries = [e for e in app.disc_entries if e.get("role") == persona_name or e.get("role") == "User"]`. The filter follows the MMA persona focus. | +| B10 | **Truncate to N pairs** | `Truncate` button + `drag_int` | `gui_2.py:4254-4260` | `truncate_entries(app.disc_entries, app.ui_disc_truncate_pairs)` keeps the last `N` User/AI pairs (per `gui_2.py:175 truncate_entries(...)`). | +| B11 | **Compress (AI summarization)** | `Compress` button | `gui_2.py:4252` → `app_controller._handle_compress_discussion:3357` | Calls `ai_client.run_discussion_compression(disc_text)` and replaces the discussion with the LLM's compressed version. | + +--- + +## Role Management + +`app.disc_roles: list[str]` is the master list of valid role strings. It's: + +- **Populated from** `models.parse_history_entries`'s default `["User", "AI", "Vendor API", "System"]` (`models.py:208`) +- **Persisted as** `manual_slop.toml [discussion].disc_roles` (or a project TOML equivalent) +- **Loaded by** `app_controller.init_state` from the project dict + +The user can add or remove roles at runtime via `gui_2.py:4317-4328 render_discussion_roles`. The `Add` button takes `app.ui_disc_new_role_input`, strips it, and appends if not already present. The `X` button pops by index. + +A role can be any string — Manual Slop doesn't enforce a vocabulary. Typical custom roles include `Context`, `Tool`, `CodeBlock`, `Error`, `Warning`, or per-project names like `Architect` vs `Implementer`. + +The **default role** for new entries is `app.disc_roles[0] if app.disc_roles else "User"`. If the role list is empty, the system falls back to `"User"`. This is intentionally permissive — empty role list is never an error. + +--- + +## Take Lifecycle + +### Branch + +`app_controller._branch_discussion(index)` (`app_controller.py:3503-3519`): + +1. Flush current `disc_entries` to project TOML via `_flush_disc_entries_to_project` (so we don't lose unsaved edits). +2. Compute the base name: `self.active_discussion.split("_take_")[0]`. +3. Generate a unique take name: `_take_` incremented until unused. +4. Call `project_manager.branch_discussion(self.project, self.active_discussion, new_name, index)`. +5. Switch the active discussion to the new take via `_switch_discussion(new_name)`. + +`project_manager.branch_discussion` (`project_manager.py:429`) does the actual copy: +- Reads the source discussion +- Creates a fresh discussion dict with `default_discussion()` +- Copies the source's `git_commit` (so the new take is anchored to the same code state) +- Copies `source_disc["history"][:message_index + 1]` — i.e. **all entries up to and including `index`** +- Sets the new take as active + +**Why "up to and including"?** Branching at entry N means "the future starts from entry N's state." The user is saying "from here, what if I had asked a different follow-up?" The AI sees entries 0..N as the prior conversation; entries N+1..end are discarded (in this take — they're still in the parent take, accessible via the Take tabs). + +### Promote + +`project_manager.promote_take` (`project_manager.py:447`): + +- Renames a take to a fresh top-level name +- Updates the `active` pointer if the renamed take was active +- Use case: a Take that turned out to be the "real" conversation gets renamed away from the `_take_` suffix to become a first-class discussion + +### Switch + +`app_controller._switch_discussion(name)` (`app_controller.py:3199`): + +1. Flush the current `disc_entries` to the project TOML. +2. Look up the new discussion in `self.project["discussion"]["discussions"]`. +3. Set `self.active_discussion = name` and `self._track_discussion_active = False`. +4. **Atomically** (under `_disc_entries_lock`) replace `self.disc_entries[:] = models.parse_history_entries(disc_data.get("history", []), self.disc_roles)`. +5. Restore the context snapshot from `disc_data["context_snapshot"]` if present. +6. Update `ai_status = f"discussion: {name}"`. + +The atomic slice-replacement is critical: a renderer that reads `self.disc_entries` mid-update would see a half-empty list. The lock ensures the renderer only sees the old list (before) or the new list (after), never an in-between state. + +### Rename / Delete + +`_rename_discussion(old, new)` (`app_controller.py:3521`): + +- `discussions[new_name] = discussions.pop(old_name)` — atomically swaps the key +- Updates `active_discussion` and the `active` pointer if the renamed discussion was active +- Rejects the rename if `new_name` is already in use (`ai_status = f"discussion '{new_name}' already exists"`) + +`_delete_discussion(name)` (`app_controller.py:3537`): + +- Refuses to delete the last remaining discussion (guarded at line 3543) +- Removes the discussion from the dict +- If the deleted discussion was active, switches to the first remaining sorted-by-name discussion + +--- + +## Per-Role Filter (the MMA Link) + +`gui_2.py:4227-4237 render_discussion_entries` filters the entry list when `app.ui_focus_agent` is set: + +```python +if app.ui_focus_agent: + tier_usage = app.mma_tier_usage.get(app.ui_focus_agent) + if tier_usage: + persona_name = tier_usage.get("persona") + if persona_name: + display_entries = [e for e in app.disc_entries + if e.get("role") == persona_name or e.get("role") == "User"] +``` + +When the user clicks "Focus on Tier 3 Worker A" in the MMA dashboard, the Discussion Hub filters to only show entries whose `role` matches the focused worker's persona name plus User entries. This is a *read-only* filter — the underlying `disc_entries` is unchanged. The `app._render_message_panel` (or whoever sent the entries) is unaffected. + +--- + +## Persistence + +### `app._flush_to_project` (called from B4 Save, and from `_switch_discussion`) + +`gui_2.py:1046-1047` and `app_controller.py:2558`: + +```python +app._flush_to_project() # serializes self.project to /.toml +app._flush_to_config() # serializes self.config to /config.toml +app.save_config() # write config.toml to disk +``` + +`_flush_to_project` calls `project_manager.save_project(self.project, self.active_project_path)`, which serializes the full project dict (including all discussions) to the project TOML. + +### `_flush_disc_entries_to_project` (called from `_switch_discussion` and `_branch_discussion`) + +`app_controller.py:3225-3240`: + +```python +def _flush_disc_entries_to_project(self) -> None: + history_strings = [project_manager.entry_to_str(e) for e in self.disc_entries] + if self.active_track and self._track_discussion_active: + project_manager.save_track_history(self.active_track.id, history_strings, self.active_project_root) + return + disc_sec = self.project.setdefault("discussion", {}) + discussions = disc_sec.setdefault("discussions", {}) + disc_data = discussions.setdefault(self.active_discussion, project_manager.default_discussion()) + disc_data["history"] = history_strings + disc_data["last_updated"] = project_manager.now_ts() + disc_data["context_snapshot"] = [f.to_dict() if hasattr(f, "to_dict") else {"path": str(f)} for f in self.context_files] + disc_data["sent_markdown"] = getattr(self, "discussion_sent_markdown", "") + disc_data["sent_system_prompt"] = getattr(self, "discussion_sent_system_prompt", "") +``` + +**Two paths:** +- If a track discussion is active (`self.active_track and self._track_discussion_active`): persist to `conductor/tracks//track_history` via `save_track_history`. +- Otherwise: persist to the project's `discussion.discussions[]` dict. + +`entry_to_str(e)` converts an entry dict to a `Role: content` string for the legacy `history` field. `parse_history_entries` (in `models.py:196`) reverses the conversion when loading. + +**The `context_snapshot`** is the FileItem list at send time. Restoring a discussion restores the file list (per `_switch_discussion:3218-3222`). This is *the* mechanism for "I sent this discussion with these files in context; if I switch away and back, the files come back." + +### When is the save triggered? + +- **Explicit:** B4 `Save` button. +- **Implicit (and risky):** `_switch_discussion` and `_branch_discussion` both flush *before* switching. **But** the per-entry edit operations (A1-A7) do *not* flush on their own. The user is expected to either Save explicitly or rely on the next `_switch_discussion` / `_branch_discussion` to flush. + +This is a known design tension. See the "Known Limitations" section below. + +--- + +## Threading & Locking + +`self._disc_entries_lock: threading.Lock` is a `threading.Lock` owned by `app_controller`. It is acquired in: + +- `_switch_discussion` (`app_controller.py:3214-3215`) — to atomically replace `disc_entries[:]` +- `app._process_pending_gui_tasks` (called from render loop) — to read entries safely while a background thread appends an AI response +- `truncate_entries` (via the panel-level `Truncate` button) — to atomically replace `disc_entries` with the truncated list +- `gui_2.py:4060, 4223-4224` — the AI response callback appends a new entry under the lock +- `gui_2.py:4359` (in `render_discussion_selector` when track-discussion is toggled) — flushes under the lock + +**Invariant:** the lock is *never* held across a render call. The lock is acquired, `disc_entries[:] = ...` is done, the lock is released. The ImGui renderer reads `disc_entries` lock-free; it sees either the old list or the new list but never a half-updated one. + +**Cross-thread append pattern** (the AI response callback at `gui_2.py:4060`): + +```python +with app._disc_entries_lock: + app.disc_entries.append({"role": "user", "content": prompt, "collapsed": False, "ts": project_manager.now_ts()}) +``` + +The background thread (e.g. `_bg_task`) appends; the render thread reads. The lock is the *only* synchronization primitive — there is no event loop, no message queue, no signal. The render thread polls at frame rate (60 FPS nominal); if the background thread appends between frames, the next frame sees the new entry. + +--- + +## Undo/Redo Integration + +The discussion system is integrated with `HistoryManager` + `UISnapshot` for full undo/redo. See `guide_state_lifecycle.md` for the full architecture. The relevant details for discussions: + +- `UISnapshot.disc_entries: list[dict]` (`src/history.py:19`) captures the full entry list via `copy.deepcopy(self.disc_entries)` (`gui_2.py:748`). +- The change-detection logic at `gui_2.py:1160, 1166-1167` checks if `disc_entries` length or last-entry content changed; if so, a new snapshot is pushed to the undo stack. +- `Ctrl+Z` restores the previous `disc_entries` via `gui_2.py:754 _apply_snapshot`. + +**Per-edit granularity.** A snapshot is pushed *per render frame* that detects a change. The 100-snapshot cap means you can rewind up to ~100 edits. For a 5-second window of rapid typing, that's a lot. For long sessions with infrequent edits, the history can span hours. + +--- + +## Reset (Destroying the Discussion) + +`app_controller._handle_reset_session` (`app_controller.py:3286-3356`) is the **nuclear** reset: + +- `self.disc_entries.clear()` — empties the current take +- `for d_name in discussions: discussions[d_name]["history"] = []` — empties ALL takes and ALL discussions +- Resets `discussion_sent_markdown` and `discussion_sent_system_prompt` to `""` +- Resets the entire project dict to `default_project(...)` — this is a *new* empty project, not the user's saved one + +**The reset is intentionally aggressive.** The 2026-06-08 `_handle_reset_session` regression (documented in the comments at `app_controller.py:3307-3312`) was caused by an early version that *also* cleared `self.active_project_path`, leading to an infinite re-switch loop. The fix is to leave `active_project_path` alone. + +**What reset does NOT touch:** +- `self.project` is replaced, but the user's *saved* project TOML on disk is untouched. Switching projects after reset reloads from disk. +- `app.history` (the `HistoryManager`) is not cleared. The undo stack survives a reset — Ctrl+Z after a reset can restore the pre-reset discussion state. This may be a bug or a feature depending on user expectation. +- `self.active_project_path` is preserved. + +--- + +## Hook API Surface + +The discussion system is exposed to the Hook API via two endpoints (per `guide_tools.md`): + +| Method | Endpoint | Behavior | +|---|---|---| +| `GET /api/session` | Direct read | `{"session": {"entries": [...]}}` from `app.disc_entries` | +| `POST /api/session` | `{"session": {"entries": [...]}}` | `{"status": "updated"}` — sets `app.disc_entries` | + +The POST endpoint allows external automation to *replace* the entire discussion. Per-entry inserts/deletes are not currently exposed via the Hook API (only full-replacement). This is a known gap. + +`api_hook_client.py` exposes `get_session()` and `set_session(entries)` as the Python-side wrappers. + +--- + +## Tests + +- `tests/test_discussion_takes.py` — `TestDiscussionTakes` covers `branch_discussion` (creates a new Take) and `promote_take` (renames a Take to top-level). +- `tests/test_gui_discussion_tabs.py` — `test_discussion_tabs_rendered` covers the discussion selector and Take tabs. +- `tests/test_discussion_takes_gui.py` — `test_render_discussion_tabs` and `test_switching_discussion_via_tabs` cover the GUI flow. +- `tests/test_history.py` — `test_undo_redo`, `test_jump_to_undo`, `test_max_capacity`, `test_redo_cleared_on_push`, `test_push_state` cover the undo/redo integration. +- `tests/test_history_manager.py` — `TestHistoryManager` covers `snapshot_roundtrip`, `push_and_undo`, `push_clears_redo_stack`, `undo_and_redo`, `undo_no_history_returns_none`, `redo_no_history_returns_none`, `get_history_returns_descriptions`, `jump_to_undo`. +- `tests/test_session_logger_reset.py` — `test_reset_session` covers the reset path. +- `tests/test_gui_fast_render.py` — `test_render_discussion_panel_fast` covers the render path. +- `tests/test_gui_phase4.py` — `test_track_discussion_toggle` covers the track-discussion toggle. +- `tests/test_gui_symbol_navigation.py` — `test_render_discussion_panel_symbol_lookup` covers the `@Symbol` lookup integration. + +--- + +## Known Limitations + +1. **Per-edit save is implicit.** The per-entry edit operations (A1-A7) do not flush to TOML on every edit. The save happens on the next `_switch_discussion`, `_branch_discussion`, or explicit B4 Save. A crash between edit and save loses the edit. Fix: hook the change-detection logic in `gui_2.py:1160, 1166-1167` to also call `_flush_disc_entries_to_project` after a debounce. +2. **Provider-side history diverges from `disc_entries`.** When the user edits an entry's `content` via A1, the *displayed* text is corrected but the *provider-side* `ai_client._anthropic_history` (and siblings) still contains the original. The next LLM call may replay the original tool results. This is Pitfall #4 in `conductor/tracks/nagent_review_20260608/report.md` and the corresponding Decision candidate #3 (Stateless LLMClient). +3. **Hook API is full-replacement only.** No per-entry insert/delete via the API. The user could `POST /api/session` with a new list, but partial edits require the full list. +4. **Truncate is destructive.** The `Truncate` button (B10) is not undoable as a single operation — it's a list replacement, so the undo stack pushes the new (truncated) list, not the pre-truncate list. Actually, it *is* pushed (per the change-detection logic), so Ctrl+Z restores the pre-truncate list. Confirmed working in `tests/test_history.py`. + +--- + +## Cross-References + +- **Discussion data model:** `src/models.py:196 parse_history_entries`, `src/models.py:909 ContextPreset`, `src/models.py:510 FileItem` +- **Discussion persistence:** `src/project_manager.py:429 branch_discussion`, `src/project_manager.py:447 promote_take`, `src/project_manager.py:396 calculate_track_progress` +- **Discussion switching/management:** `src/app_controller.py:3199 _switch_discussion`, `src/app_controller.py:3225 _flush_disc_entries_to_project`, `src/app_controller.py:3286 _handle_reset_session`, `src/app_controller.py:3357 _handle_compress_discussion`, `src/app_controller.py:3503 _branch_discussion`, `src/app_controller.py:3521 _rename_discussion`, `src/app_controller.py:3537 _delete_discussion` +- **GUI render functions:** `src/gui_2.py:175 truncate_entries`, `src/gui_2.py:735 _take_snapshot`, `src/gui_2.py:754 _apply_snapshot`, `src/gui_2.py:3770 render_discussion_entry`, `src/gui_2.py:4227 render_discussion_entries`, `src/gui_2.py:4239 render_discussion_entry_controls`, `src/gui_2.py:4317 render_discussion_roles`, `src/gui_2.py:4330 render_discussion_selector` +- **Undo/redo integration:** `src/history.py:8 UISnapshot`, `src/history.py:71 HistoryManager` +- **Deep-dive on the design philosophy:** `conductor/tracks/nagent_review_20260608/report.md §3` (the 23-operation matrix A1-C5) +- **Actionable patterns for sub-agents in 1:1 discussions:** `conductor/tracks/nagent_review_20260608/nagent_takeaways_20260608.md` §3 and §10 +- **Future-track candidate for raw-transcript persistence:** `conductor/tracks/nagent_review_20260608/decisions.md` candidate #10 diff --git a/docs/guide_gui_2.md b/docs/guide_gui_2.md index 35dab250..cfaeb6bf 100644 --- a/docs/guide_gui_2.md +++ b/docs/guide_gui_2.md @@ -1,6 +1,6 @@ # `src/gui_2.py` — Main ImGui Application -[Top](../README.md) | [Architecture](guide_architecture.md) | [Testing](guide_testing.md) +[Top](../README.md) | [Architecture](guide_architecture.md) | [Discussions](guide_discussions.md) | [State Lifecycle](guide_state_lifecycle.md) | [Context Aggregation](guide_context_aggregation.md) | [Testing](guide_testing.md) --- @@ -474,4 +474,8 @@ uv run python -c "import ast; tree = ast.parse(open('src/gui_2.py').read()); [pr - **[guide_testing.md](guide_testing.md)** — Test infrastructure for GUI tests - **[guide_hot_reload.md](guide_hot_reload.md)** — How Ctrl+Alt+R reloads this file - **[guide_themes.md](guide_themes.md)** — TOML theme system; defines the `C_*` callable color helpers used throughout `gui_2.py` +- **[guide_discussions.md](guide_discussions.md)** — The Discussion system that the GUI's `render_discussion_entry`/`render_discussion_selector`/etc. render +- **[guide_state_lifecycle.md](guide_state_lifecycle.md)** — Undo/redo (`HistoryManager` + `UISnapshot`) and `App.__getattr__`/`__setattr__` state delegation +- **[guide_context_aggregation.md](guide_context_aggregation.md)** — The `aggregate.py` pipeline that consumes the GUI's `files` + `context_files` + `history` config - **[conductor/product-guidelines.md](../../conductor/product-guidelines.md)** — The UI delegation pattern rules +- **[conductor/tracks/nagent_review_20260608/report.md](../../conductor/tracks/nagent_review_20260608/report.md)** — Deep-dive comparison of Manual Slop's discussion system to nagent's pattern; includes the per-entry (A1-A7) + discussion-level (B1-B11) + undo/redo (C1-C5) operation matrix diff --git a/docs/guide_mma.md b/docs/guide_mma.md index 8c2ab92b..f3a99929 100644 --- a/docs/guide_mma.md +++ b/docs/guide_mma.md @@ -562,3 +562,15 @@ The `WorkspaceManager` (`src/workspace_manager.py`) supports binding workspace p - This is opt-in via `[conductor].auto_switch_profiles = true` in `config.toml`. See [guide_workspace_profiles.md](guide_workspace_profiles.md) (placeholder; written in Task 10) for the full profile schema. + +--- + +## See Also + +- **[guide_architecture.md](guide_architecture.md)** — Threading model that MMA's worker pool respects +- **[guide_multi_agent_conductor.md](guide_multi_agent_conductor.md)** — The `multi_agent_conductor.py` + `dag_engine.py` runtime +- **[guide_app_controller.md](guide_app_controller.md)** — How the AppController drives MMA via `_cb_start_track`, `_do_generate`, `_process_event_queue` +- **[guide_context_aggregation.md](guide_context_aggregation.md)** — The `aggregate.py:build_tier3_context` variant used by MMA workers +- **[guide_discussions.md](guide_discussions.md)** — The Discussion system; MMA worker prompts are built from the active discussion +- **[conductor/tracks/nagent_review_20260608/report.md §9](../../conductor/tracks/nagent_review_20260608/report.md)** — Deep-dive on the MMA sub-conversation pattern vs nagent's `` tag; **the highest-priority future-track is to extract MMA's `run_worker_lifecycle` into a reusable `SubConversationRunner` for 1:1 discussions** (per user-flagged want) +- **[conductor/tracks/nagent_review_20260608/nagent_takeaways_20260608.md §3 and §10](../../conductor/tracks/nagent_review_20260608/nagent_takeaways_20260608.md)** — Actionable patterns for the SubConversationRunner; the design constraint that sub-agents return a *concise artifact* (not a full transcript) is baked into the recommendation diff --git a/docs/guide_models.md b/docs/guide_models.md index d150d9e6..46185e9d 100644 --- a/docs/guide_models.md +++ b/docs/guide_models.md @@ -551,6 +551,9 @@ Tests live in `tests/test_models.py` and module-specific test files (e.g., `test - **[guide_personas.md](guide_personas.md)** — `Persona` model in detail - **[guide_workspace_profiles.md](guide_workspace_profiles.md)** — `WorkspaceProfile` model in detail - **[guide_rag.md](guide_rag.md)** — `RAGConfig`, `RAGChunk`, `RAGResult` models +- **[guide_context_aggregation.md](guide_context_aggregation.md)** — How the `FileItem` and `ContextPreset` schemas flow through the `aggregate.py` pipeline +- **[guide_discussions.md](guide_discussions.md)** — The entry dict shape (`{role, content, collapsed, ts, ...}`) consumed by `parse_history_entries` - **`src/presets.py`, `src/personas.py`, `src/context_presets.py`, `src/tool_presets.py`** — Managers that use these models - **`src/multi_agent_conductor.py`** — Uses `Ticket`, `Track`, `WorkerContext` +- **[conductor/tracks/nagent_review_20260608/report.md §6](../../conductor/tracks/nagent_review_20260608/report.md)** — Deep-dive on the `FileItem` schema as Manual Slop's strongest curation dimension - **`src/ai_client.py`** — Uses `Provider`, `ModelInfo`, `AIRequest`, `AIResponse` diff --git a/docs/guide_state_lifecycle.md b/docs/guide_state_lifecycle.md new file mode 100644 index 00000000..62bfce25 --- /dev/null +++ b/docs/guide_state_lifecycle.md @@ -0,0 +1,375 @@ +# State Lifecycle: Undo/Redo, Reset, and State Delegation + +[Top](../README.md) | [App Controller](guide_app_controller.md) | [Discussions](guide_discussions.md) | [GUI Main](guide_gui_2.md) + +--- + +## Overview + +Manual Slop's state lifecycle has three load-bearing concerns: + +1. **Undo/Redo** via `HistoryManager` + `UISnapshot` — the non-provider history system +2. **Reset** via `_handle_reset_session` — the "throw away everything" flow +3. **State delegation** via `App.__getattr__`/`__setattr__` — the App is a thin proxy over the Controller + +This guide covers the *implementation* of all three. The design philosophy is documented in `conductor/tracks/nagent_review_20260608/nagent_takeaways_20260608.md §1` (state visibility) and `§9` (edit-the-input, not the output). + +> **Domain classification.** All three concerns are **Application**-domain. The Hook API (which exposes state) is also Application, but crosses into Meta-Tooling via the bridge scripts. See `guide_meta_boundary.md`. + +--- + +## 1. Undo/Redo: `HistoryManager` + `UISnapshot` + +### Why a Non-Provider History? + +Manual Slop's history is **not** the same as the provider-side conversation history. The provider (`ai_client._anthropic_history`, `_deepseek_history`, etc.) tracks the *exact bytes* sent to and received from the LLM. The Manual Slop history (`app.history` + `UISnapshot`) tracks the *user's UI state* — text inputs, sliders, file lists, discussions. + +This separation is intentional. It means: +- The user can `Ctrl+Z` to undo a typo in the AI input box, even if the previous LLM call's history is preserved on the provider side. +- The provider's history is the *authoritative* transcript for the LLM; the Manual Slop history is the *user's working state*. +- A reset of the Manual Slop state does not clear the provider's history (and vice versa) — see §"Reset" below. + +This is exactly the kind of "edit the input, not the output" pattern nagent uses; see nagent takeaways §9. + +### `UISnapshot` — The Serializable State + +`src/history.py:8 UISnapshot` is a frozen-shaped dataclass capturing the 13 user-mutable fields: + +| Field | Type | Source | +|---|---|---| +| `ai_input` | `str` | `self.ui_ai_input` | +| `project_system_prompt` | `str` | `self.ui_project_system_prompt` | +| `global_system_prompt` | `str` | `self.ui_global_system_prompt` | +| `base_system_prompt` | `str` | `self.ui_base_system_prompt` | +| `use_default_base_prompt` | `bool` | `self.ui_use_default_base_prompt` | +| `temperature` | `float` | `self.temperature` | +| `top_p` | `float` | `self.top_p` | +| `max_tokens` | `int` | `self.max_tokens` | +| `auto_add_history` | `bool` | `self.ui_auto_add_history` | +| `disc_entries` | `list[dict]` | `copy.deepcopy(self.disc_entries)` | +| `files` | `list[dict]` | `[f.to_dict() if hasattr(f, 'to_dict') else f for f in self.files]` | +| `context_files` | `list[dict]` | `[f.to_dict() if hasattr(f, 'to_dict') else f for f in self.context_files]` | +| `screenshots` | `list[str]` | `list(self.screenshots)` | + +`to_dict()` / `from_dict()` are explicit serializers, used by the Hook API for the `/api/session` endpoint. The dataclass is **not** auto-serialized; explicit `to_dict` is required so the schema is documented. + +### `HistoryManager` — The Undo/Redo Stack + +`src/history.py:71 HistoryManager` is a 100-snapshot capacity stack: + +```python +class HistoryManager: + def __init__(self, max_capacity: int = 100): ... + def push(self, state: typing.Any, description: str) -> None: ... + def undo(self, current_state: typing.Any, current_description: str = "Current State") -> typing.Optional[HistoryEntry]: ... + def redo(self, current_state: typing.Any, current_description: str = "Current State") -> typing.Optional[HistoryEntry]: ... + def jump_to_undo(self, index: int, current_state: typing.Any, current_description: str = "Before Jump") -> typing.Optional[HistoryEntry]: ... + @property + def can_undo(self) -> bool: ... + @property + def can_redo(self) -> bool: ... + def get_history(self) -> typing.List[typing.Dict[str, typing.Any]]: ... +``` + +- `push` appends a new entry; clears the redo stack; pops the oldest if capacity exceeded. +- `undo` moves the current state to the redo stack and returns the top of the undo stack. +- `redo` is the inverse. +- `jump_to_undo` allows time-traveling to any past snapshot, moving subsequent states to the redo stack. +- `get_history` returns `[{description, timestamp}, ...]` for the History List view in the GUI. + +The `max_capacity=100` is the default and is sufficient for a 5-second window of rapid typing or a longer session of infrequent edits. + +### The Push Trigger — `gui_2.py:1140-1170` + +The undo stack is *not* pushed on every keystroke. It's pushed via debounced change-detection at the start of every render frame: + +```python +current = self._take_snapshot() +if self._last_ui_snapshot is None: + self._last_ui_snapshot = current + return + +changed = ( + current.ai_input != self._last_ui_snapshot.ai_input or + current.project_system_prompt != self._last_ui_snapshot.project_system_prompt or + # ... 10 more field comparisons ... + len(current.disc_entries) != len(self._last_ui_snapshot.disc_entries) or + len(current.files) != len(self._last_ui_snapshot.files) or + len(current.context_files) != len(self._last_ui_snapshot.context_files) or + len(current.screenshots) != len(self._last_ui_snapshot.screenshots) +) + +if not changed and len(current.disc_entries) > 0: + if current.disc_entries[-1].get('content') != self._last_ui_snapshot.disc_entries[-1].get('content'): + changed = True + +if changed: + self.history.push(current, description="") + self._last_ui_snapshot = current +``` + +The change detector compares: +- 7 scalar fields directly (`!=`) +- 2 float fields with epsilon (`abs(...) > 1e-5`) +- 4 list fields by length +- The `disc_entries[-1]["content"]` separately (because streaming AI responses can change the last entry's content without changing the length) + +**Performance:** The check is at the start of every render frame. `copy.deepcopy(self.disc_entries)` (line 748) is the most expensive part — O(N) where N is the entry count. For a 100-entry discussion, this is microseconds. The full snapshot push only happens when a change is detected. + +### The Apply Trigger — `gui_2.py:819 _apply_undo` / `_apply_redo` / `_apply_jump` + +Three wrapper methods invoke `_apply_snapshot(entry.state)` with the right description: + +- `_apply_undo(...)` — pops from undo, pushes current to redo, applies popped +- `_apply_redo(...)` — pops from redo, pushes current to undo, applies popped +- `_apply_jump(index, ...)` — invokes `jump_to_undo(index, current_state)`, applies the result + +`_apply_snapshot` is the *single* restore function (`gui_2.py:754-789`): + +```python +def _apply_snapshot(self, snapshot: history.UISnapshot) -> None: + self._is_applying_snapshot = True + try: + self.ui_ai_input = snapshot.ai_input + self.ui_project_system_prompt = snapshot.project_system_prompt + # ... 10 more assignments ... + self.disc_entries = snapshot.disc_entries + # Restore files as FileItem objects + from src import models + self.files = [] + for f in snapshot.files: + if isinstance(f, dict): + self.files.append(models.FileItem.from_dict(f)) + else: + self.files.append(models.FileItem(path=str(f))) + # ... similar for context_files, screenshots ... + finally: + self._is_applying_snapshot = False +``` + +The `_is_applying_snapshot` flag is set during the restore to prevent re-pushing a new snapshot from the changes made *by* the restore itself. (Without this, pressing Ctrl+Z would push a new snapshot identical to the restored one, which would *clear the redo stack* — making Ctrl+Y a no-op.) + +--- + +## 2. State Delegation: `App.__getattr__` / `__setattr__` + +### The Pattern + +`App` (`src/gui_2.py:264+`) is a thin wrapper around `AppController` (`src/app_controller.py:772+`). The wrapper exists because: +- The Controller is the *headless* orchestrator (no ImGui dependencies). +- The App is the *render* side (ImGui calls). +- Render functions take `app: App` and read state via `app.`. + +The simplest design would be: copy every Controller field to the App. But that's brittle (every new Controller field requires an App copy). The actual pattern uses Python's `__getattr__`/`__setattr__` for transparent delegation. + +### `App.__getattr__` — Read Fallthrough + +`gui_2.py:666-669`: + +```python +def __getattr__(self, name: str) -> Any: + if name == 'controller': + raise AttributeError(name) + return getattr(self.controller, name) +``` + +When `app.foo` is accessed and `foo` is not an instance attribute of `App`, Python falls through to `__getattr__`, which reads from `self.controller`. This means `app.disc_entries`, `app.history`, `app.temperature` all read from the Controller transparently. + +The `'controller'` exception prevents infinite recursion if the Controller is not yet set (during early `__init__`). + +### `App.__setattr__` — Write Fallthrough + +`gui_2.py:671-675`: + +```python +def __setattr__(self, name: str, value: Any) -> None: + if name != 'controller' and hasattr(self, 'controller') and hasattr(self.controller, name): + setattr(self.controller, name, value) + else: + object.__setattr__(self, name, value) +``` + +When `app.foo = bar` is assigned and the Controller has a `foo` attribute, the write goes *through* to the Controller. Otherwise it stores on the App instance. + +### Why This Matters + +- **No data duplication.** There is one source of truth (the Controller). The App never has its own copy of `disc_entries`, `temperature`, etc. +- **No boilerplate.** New Controller fields are automatically available via the App without code changes. +- **Backward-compatible.** Existing App fields (e.g. `app._is_applying_snapshot`) work because `object.__setattr__` is the fallback when the Controller doesn't have the field. + +### The 4 Edge Cases (per the `guide_gui_2.md` Known Issues) + +1. **`app.controller` is itself an App attribute** — it must not be delegated, hence the explicit `if name == 'controller'` guard. +2. **App instance attributes shadow Controller attributes** — the `object.__setattr__` fallback stores on the App; the next `__getattr__` call returns the App's value (because Python's normal attribute lookup finds the instance attribute first). This is sometimes intentional (e.g. `app._is_applying_snapshot` is App-local). +3. **Fields in `app._app` (not on the Controller)** are stored on the App. The `_app` field is the Controller's back-reference to the App (set at `gui_2.py:264`). +4. **Underscore-prefixed App-specific fields** like `app._mma_approval_open`, `app._pending_ask_dialog` are stored on the App because the Controller doesn't have them. They appear in the Controller's `__getattr__` mirror via `hasattr` check. + +### Known Fragility + +`guide_gui_2.md` notes: `ui_separate_context_preview`, `ui_separate_message_panel`, `ui_separate_response_panel`, `ui_separate_tool_calls_panel`, `ui_separate_external_tools`, `ui_discussion_split_h` are NOT in the Controller's `_settable_fields`, so `__setattr__` falls through to `object.__setattr__` and stores them on the App. This is intentional for window-separator flags (they're render-only and shouldn't pollute the Controller), but it does mean they don't survive a hot-reload of `App`. + +--- + +## 3. Reset: `_handle_reset_session` + +`src/app_controller.py:3286-3356 _handle_reset_session` is the **nuclear** reset, called from the "Reset Session" button in the message panel. + +### What It Clears + +| Group | What | Why | +|---|---|---| +| AI client | `ai_client.reset_session()` + `clear_comms_log()` | Clears provider-side history. | +| Tool stats | `_tool_log.clear()`, `_tool_stats.clear()`, `_comms_log.clear()` | Clears the in-memory activity logs. | +| Discussion | `self.disc_entries.clear()`; for each discussion in project: `discussions[d_name]["history"] = []` | Empties the *current* take and all takes across all discussions. | +| Files | `self.files.clear()`, `self.context_files.clear()` | Drops the FileItem lists. | +| Tracks | `self.tracks.clear()` | Drops the loaded tracks. | +| **Project dict (full replacement)** | `self.project = project_manager.default_project(...)` | The project is *replaced* with a fresh default, not mutated. | +| Project paths | `self.project_paths = []` | Clears the recent-projects list. | +| Project switch state | `_project_switch_in_progress = False` etc. | Resets the in-flight switch state machine. | +| AI status | `ai_status = "session reset"`, `ai_response = ""` | Status bar update. | +| UI inputs | `ui_ai_input = ""`, `ui_manual_approve = False`, `ui_auto_add_history = False` | Empties the message box. | +| MMA | `active_track = None`, `active_tier = None`, `mma_status = "idle"`, `proposed_tracks = []`, `active_tickets = []`, `engines.clear()`, `mma_streams.clear()`, `_worker_status.clear()` | Drops all MMA state. | +| Provider/model | `_current_provider = "gemini"`, `_current_model = "gemini-2.5-flash-lite"`, `ai_client.set_provider(...)` | Resets to defaults. | +| Locks + queues | `_pending_history_adds.clear()`, `_api_event_queue.clear()`, `_pending_gui_tasks.clear()` | Drains all queues under their locks. | +| Prompts | `ui_use_default_base_prompt = True`, all 3 system prompts = `''` | Resets to default base prompt. | +| Persona/tool settings | `ui_active_persona = ''`, `ui_active_tool_preset = None`, `ui_active_bias_profile = None` | Drops active persona. | +| Generation params | `temperature = 0.0`, `top_p = 1.0`, `max_tokens = 8192` | Defaults. | + +### What It Does NOT Touch + +| Field | Why preserved | +|---|---| +| `self.active_project_path` | `_do_project_switch` writes to this path; clearing it would cause OSError on next switch and an infinite re-switch loop. The 2026-06-08 regression test `test_context_sim_live` documents this. | +| `self.history` (the `HistoryManager`) | The undo stack survives a reset. Ctrl+Z after a reset can restore the pre-reset state. This may be a bug or a feature. | +| The on-disk `manual_slop.toml` | The saved project TOML is not deleted or rewritten. Switching projects after reset reloads from disk. | +| `self.discussion_sent_markdown` / `discussion_sent_system_prompt` | These *are* cleared (set to `""`). They're not in the preserve list. | + +### The `_is_applying_snapshot` Guard During Reset + +`_handle_reset_session` does *not* set `self._is_applying_snapshot`. This means the change-detection logic *will* push a new snapshot to the undo stack after the reset. The snapshot will contain the *post-reset* state. To get the pre-reset state, the user must Ctrl+Z *twice* (once to push the post-reset snapshot, once to restore the pre-reset snapshot). + +This is a known papercut. The fix is to set `self._is_applying_snapshot = True` during the reset and clear it at the end. See `tests/test_session_logger_reset.py:test_reset_session` for the current behavior. + +### The 2026-06-08 Regression + +`app_controller.py:3307-3312` documents a regression that was caught by `test_context_sim_live`: + +> The test's `client.click("btn_reset")` resets the AI session but does not reset the project (see `_handle_reset_session` at line 3244 — it clears files, context_files, disc_entries, etc. but not self.project or self.active_project_path). + +The fix was to *not* clear `self.active_project_path`. The project dict *is* replaced (`self.project = project_manager.default_project(...)`), but the path is preserved. This is the right behavior: the user is *resetting the session*, not *abandoning the project*. + +--- + +## 4. State Synchronization Across Threads + +The state lifecycle has 4 distinct threads of access: + +1. **Render thread** (60 FPS) — reads `app.` to render ImGui widgets +2. **AI response callback thread** (background) — appends to `app.disc_entries` under `_disc_entries_lock` +3. **Hook API thread** (HTTP server on `127.0.0.1:8999`) — reads via `_gettable_fields` and writes via `_predefined_callbacks` +4. **MMA worker thread(s)** — write to `_api_event_queue`, `_pending_gui_tasks`, `_pending_history_adds` + +The synchronization primitives: + +- `_disc_entries_lock: threading.Lock` — protects `disc_entries` (read by render, written by AI callback and Truncate) +- `_pending_history_adds_lock` — protects the queue of pending discussion entries +- `_api_event_queue_lock` — protects the Hook API event stream +- `_pending_gui_tasks_lock` — protects the queue of GUI tasks scheduled from background threads +- `_project_switch_lock` — protects the project-switch state machine +- `ai_client._send_lock: threading.Lock` — serializes all `ai_client.send()` calls (the global lock per `guide_ai_client.md`) +- `ai_client.__history_lock` — per-provider history lock (one per provider) + +**Invariant:** the render thread *never* blocks. It reads lock-free. All locks are acquired by the writer (AI callback, Truncate button, Hook API), held for the minimum critical section, and released before the writer returns. + +See `docs/reports/MUTATION_MATRIX_PHASE5.md` for the full matrix of state mutations × lock × thread. + +--- + +## 5. State Persistence + +Three layers of state persistence: + +1. **In-memory only** (lost on crash): `ai_client.__history`, `app.disc_entries`, `app.history` (undo stack) +2. **Project TOML** (persisted on Save / switch discussion): `project.discussion.discussions[*].history`, `project.discussion.discussions[*].context_snapshot`, `project.discussion.discussions[*].sent_markdown`, `project.discussion.discussions[*].sent_system_prompt` +3. **Config TOML** (persisted on `save_config()`): `config.disc_roles`, `config.use_default_base_system_prompt`, `config.ui_global_system_prompt`, etc. + +The auto-save flow is: +- `_flush_to_project()` → writes project TOML +- `_flush_to_config()` → writes in-memory config to `AppController.config` +- `save_config()` → calls `models.save_config(config)` to write to disk + +These three calls are *manual* (not automatic). The user must click Save, or trigger a state change that flushes (switch discussion, branch, etc.). The reset path explicitly does *not* save (per the "What reset does NOT touch" section above). + +**The comms log** is its own persistence layer. `ai_client._comms_log` and `app._comms_log` are in-memory, but every entry is also written to `logs/sessions//comms.log` (JSON-L) via the `_on_comms_entry` callback. The reset clears the in-memory log; the on-disk log survives. This is intentional — the comms log is an *audit trail*, not a working state. + +--- + +## 6. Hot Reload Integration + +`src/gui_2.py` is hot-reloadable. The `HotReloader` (covered in `guide_hot_reload.md`) swaps module references at runtime. The state lifecycle interacts with hot-reload in 3 ways: + +1. **`HistoryManager` survives hot-reload** because it lives on the Controller, not the App. The `gui_2.py` module's functions can be reloaded without losing the undo stack. +2. **`UISnapshot` schema is the contract** — if a hot-reload changes the fields captured by `UISnapshot`, the old snapshots in the undo stack may have different shapes. The `from_dict` method handles missing fields via `.get(..., default)`, so old snapshots degrade gracefully. +3. **`_app` back-reference** is set in `App.__init__` (line 264). After a hot-reload of `gui_2.py`, the Controller's `self._app` still points to the (now reloaded) App instance. Render functions that captured `app` in closures may still hold the old App — but the App is a thin wrapper, so the new App is functionally equivalent. + +See `guide_hot_reload.md §"What can/cannot be safely reloaded"` for the full list. + +--- + +## 7. Hook API Surface + +The state lifecycle is exposed to the Hook API via two registries on the Controller (`src/app_controller.py:296-326` in `App.__init__`): + +- **`_predefined_callbacks: dict[str, Callable]`** — name → function. The Hook API exposes each as a `custom_callback` action. Includes: + - `save_context_preset`, `load_context_preset`, `delete_context_preset` + - `set_ui_file_paths`, `set_ui_screenshot_paths` + - `set_context_files_for_test`, `set_screenshots_for_test` + - `_toggle_command_palette` + - `get_app_debug_info`, `save_context_preset_force` + - `set_ui_attr(k, v)`, `set_context_files` + - `simulate_save_preset` +- **`_gettable_fields: dict[str, str]`** — public name → internal field name. The Hook API exposes each as a readable state field. Includes `show_command_palette`, `app_debug_info`, and 50+ other UI/state fields. + +The Hook API does *not* directly expose `disc_entries` mutation; it goes through `/api/session` POST which replaces the full list. + +`/api/session` is the most state-relevant endpoint: +- `GET /api/session` → `{"session": {"entries": [...]}}` +- `POST /api/session` with `{"session": {"entries": [...]}}` → `{"status": "updated"}` + +See `guide_tools.md §"Hook API"` and `guide_api_hooks.md` for the full surface. + +--- + +## 8. Tests + +- `tests/test_history.py` — `test_undo_redo`, `test_jump_to_undo`, `test_max_capacity`, `test_redo_cleared_on_push`, `test_push_state`, `test_initial_state` +- `tests/test_history_manager.py` — `TestHistoryManager` class with: `test_snapshot_roundtrip`, `test_push_and_undo`, `test_push_clears_redo_stack`, `test_undo_and_redo`, `test_undo_no_history_returns_none`, `test_redo_no_history_returns_none`, `test_get_history_returns_descriptions`, `test_jump_to_undo` +- `tests/test_session_logger_reset.py` — `test_reset_session` +- `tests/test_state_inventory.py` — validates the state inventory is up-to-date +- `tests/test_state_delegation.py` — validates `App.__getattr__`/`__setattr__` behavior +- `tests/test_live_gui_state_sync.py` — validates that Hook API state reads are consistent with the live GUI state +- `tests/test_gui_fast_render.py` — `test_render_discussion_panel_fast` (the change-detection path) +- `tests/conftest.py:app_instance`, `tests/conftest.py:mock_app` — the App fixtures used by these tests + +--- + +## 9. Known Limitations + +1. **Per-edit save is not debounced to disk.** See `guide_discussions.md §"Known Limitations"` for the related issue. The fix is to hook the change-detection in `gui_2.py:1140-1170` to also call `_flush_disc_entries_to_project` after a debounce. +2. **`_handle_reset_session` pushes a new snapshot to the undo stack.** The pre-reset state is two Ctrl+Z presses away, not one. Fix: set `self._is_applying_snapshot = True` during the reset. +3. **Provider-side history and Manual Slop state can diverge.** When the user edits an entry's `content` via the discussion UI, the provider's `ai_client.__history` still has the original. This is Pitfall #4 in `conductor/tracks/nagent_review_20260608/report.md` and Decision candidate #3. +4. **Undo stack capacity is 100.** For long sessions with infrequent edits, this is plenty. For a 5-second window of rapid typing, you can fill it. Capacity is set in `app.history = HistoryManager(max_capacity=100)` in `App.__init__` and is not configurable. +5. **Hook API state writes are not undoable.** A `POST /api/session` that replaces the discussion list is a *single change*; if the change-detection logic pushes a snapshot, it's pushed as one entry. The user can Ctrl+Z to revert it, but the API caller has no way to know the change is undoable. + +--- + +## Cross-References + +- **Undo/redo core:** `src/history.py:8 UISnapshot`, `src/history.py:71 HistoryManager` +- **App-side wiring:** `src/gui_2.py:735 _take_snapshot`, `src/gui_2.py:754 _apply_snapshot`, `src/gui_2.py:819 _apply_undo`, `src/gui_2.py:825 _apply_redo`, `src/gui_2.py:832 _apply_jump`, `src/gui_2.py:1140-1170 change_detection`, `src/gui_2.py:666 __getattr__`, `src/gui_2.py:671 __setattr__` +- **Controller-side reset:** `src/app_controller.py:3286 _handle_reset_session`, `src/app_controller.py:3357 _handle_compress_discussion` +- **Hook API registries:** `src/gui_2.py:296-326` (the `_predefined_callbacks` / `_gettable_fields` assignments in `App.__init__`) +- **State inventory:** `docs/reports/STATE_INVENTORY_PHASE5.md`, `docs/reports/MUTATION_MATRIX_PHASE5.md` +- **Discussion integration:** `guide_discussions.md` +- **Actionable patterns:** `conductor/tracks/nagent_review_20260608/nagent_takeaways_20260608.md §1` (state visibility), §9 (edit-the-input) +- **Future-track candidate for stateless LLMClient:** `conductor/tracks/nagent_review_20260608/decisions.md` candidate #3