From 05a11978efa576e8428a0e19cae0ebb4dc8a481b Mon Sep 17 00:00:00 2001
From: Ed_
Date: Sun, 10 May 2026 15:48:21 -0400
Subject: [PATCH] docs(phase6): add Advanced Context Curation guide and C/C++ AST tools

---
 docs/guide_context_curation.md | 273 +++++++++++++++++++++++++++++++++
 docs/guide_tools.md            |  32 ++++
 2 files changed, 305 insertions(+)
 create mode 100644 docs/guide_context_curation.md

diff --git a/docs/guide_context_curation.md b/docs/guide_context_curation.md
new file mode 100644
index 0000000..21c3caf
--- /dev/null
+++ b/docs/guide_context_curation.md
@@ -0,0 +1,273 @@
+# Advanced Context Curation
+
+[Top](../README.md) | [Architecture](guide_architecture.md) | [Tools & IPC](guide_tools.md) | [MMA](guide_mma.md) | [Simulations](guide_simulations.md)
+
+---
+
+## Overview
+
+Phase 6 introduced three advanced context curation features that make file-based context management more granular and more resilient to file edits:
+
+1. **Granular AST Control**: Per-symbol toggling between Definition, Signature, and Hidden states for C/C++ files
+2. **Fuzzy Anchor Slices**: Text slice definitions that survive file modifications via anchor-based resolution
+3. **Interactive AST Tree Masking**: GUI modal for inspecting and masking AST nodes
+
+---
+
+## Granular AST Control
+
+### Purpose
+
+For C/C++ files, instead of a binary include/exclude choice, each symbol (class, function, struct) can be set to one of three states (`def`, `sig`, or `hide`). The table below also lists the file-level modes `full` and `agg`:
+
+| State | Description | Use Case |
+|-------|-------------|-----------|
+| `full` | Include entire file content | Unknown structure, complex macros |
+| `def` | Include function/class definitions only | Header inspection |
+| `sig` | Include function signatures only | API surface review |
+| `agg` | Auto-aggregate via summarization | Token budget management |
+| `hide` | Exclude from context entirely | Irrelevant symbols |
+
+### Implementation
+
+The `ast_mask` dictionary on file items tracks per-symbol state:
+
+```python
+# src/gui_2.py:_render_context_composition_panel
+if f_path.lower().endswith(('.c', '.cpp', '.h', '.hpp', '.cxx', '.cc')):
+    if hasattr(f_item, 'ast_mask'):
+        # Show AST state indicators
+        pass
+```
+
+File items expose these properties:
+- `force_full`: Override aggregation with full content
+- `auto_aggregate`: Use summarization pipeline
+- `ast_signatures`: Include signatures only
+- `ast_definitions`: Include definitions only
+
+### Data Structure
+
+```python
+@dataclass
+class FileItem:
+    path: str
+    force_full: bool = False
+    auto_aggregate: bool = False
+    ast_signatures: bool = False
+    ast_definitions: bool = False
+    ast_mask: dict[str, str] = field(default_factory=dict)  # symbol_path -> state
+```
+
+---
+
+## Fuzzy Anchor Slices
+
+### Purpose
+
+Text slices defined by line numbers become invalid when lines are inserted into or deleted from a file. Fuzzy Anchor slices use content hashing and anchor line matching to resolve the correct position after file changes.
+
+### Algorithm
+
+1. **Create Slice**: When the user defines a slice from `start_line` to `end_line`:
+   - Capture a content hash of the region
+   - Store the surrounding context lines (before/after) as anchors
+
+2. **Resolve Slice**: When the file is re-read after modification:
+   - Search for the anchor content in the modified file
+   - Calculate the offset from the anchor displacement
+   - Return the new `start_line`, `end_line`
+
+### Implementation
+
+```python
+# src/fuzzy_anchor.py
+class FuzzyAnchor:
+    @classmethod
+    def create_slice(cls, text: str, start_line: int, end_line: int) -> dict:
+        """Returns slice_data with content_hash, anchor_lines, and positions."""
+
+    @classmethod
+    def resolve_slice(cls, text: str, slice_data: dict) -> Optional[Tuple[int, int]]:
+        """Resolves slice position in modified text, returns (start, end) or None."""
+```
+
+### Slice Data Structure
+
+```python
+{
+    "start_line": 10,  # 1-based original line
+    "end_line": 25,  # 1-based original line
+    "content_hash": "abc123...",  # SHA256 of region content
+    "start_context": [...],  # Lines before start for anchor matching
+    "end_context": [...]  # Lines after end for anchor matching
+}
+```
+
+### Anchor Matching Strategy
+
+- **Exact match**: If the anchors are found at their original positions, return the original line range
+- **Shift detection**: If the anchors have shifted, calculate the delta and apply it to the slice bounds
+- **Mismatch**: If the anchors are not found, return `None` (the slice definition is invalid)
+
+---
+
+## Interactive AST Tree Masking
+
+### Purpose
+
+The AST Inspector modal allows visual inspection of a file's parsed structure and per-symbol state control.
+
+### Modal Flow
+
+1. User right-clicks a C/C++ file in the Context Panel
+2. User selects "Inspect AST" from the context menu
+3. Modal opens showing a hierarchical tree of all symbols
+4. Per-symbol radio buttons (Def/Sig/Hide) control state
+5. Changes persist to the `ast_mask` dictionary
+
+### Implementation
+
+```python
+# src/gui_2.py:_render_ast_inspector_modal
+def _render_ast_inspector_modal(self) -> None:
+    expanded, opened = imgui.begin_popup_modal('AST Inspector', True, ...)
+    if expanded:
+        # Fetch outline via tree-sitter MCP tools
+        outline = mcp_client.ts_cpp_get_code_outline(f_path)
+
+        # Parse into hierarchical node list
+        for node in parsed_nodes:
+            # Render [Kind] Name with radio buttons
+            if imgui.radio_button("Def", current_mode == 'def'):
+                f_item.ast_mask[full_path] = 'def'
+```
+
+### Node Display Format
+
+```
+[Struct] MyClass (Lines 10-50)
+  [Field] member1 (Lines 12-14)
+  [Method] init (Lines 20-30)
+```
+
+Radio buttons per node:
+- **Def**: Include this symbol's definition
+- **Sig**: Include this symbol's signature only
+- **Hide**: Exclude this symbol entirely
+
+---
+
+## Batch Operations
+
+### Shift-Click Range Selection
+
+The Context Panel supports Shift-Click for range selection:
+
+```python
+# src/gui_2.py:_render_context_composition_panel
+if changed_sel:
+    if imgui.get_io().key_shift and self._last_selected_context_index != -1:
+        start = min(self._last_selected_context_index, i)
+        end = max(self._last_selected_context_index, i)
+        for idx in range(start, end + 1):
+            # Toggle selection state for range
+            pass
+```
+
+### Batch Action Bar
+
+Batch operations apply to all selected files:
+
+| Button | Action |
+|--------|--------|
+| Full | Set `force_full=True` for all selected |
+| Agg | Set `auto_aggregate=True` for all selected |
+| Sig | Set `ast_signatures=True` for all selected |
+| Def | Set `ast_definitions=True` for all selected |
+| Remove | Remove selected files from context |
+
+---
+
+## Context Snapshotting (Per-Take)
+
+### Purpose
+
+When switching between discussion "takes", the context panel state is snapshotted and restored.
+
+### UISnapshot Structure
+
+```python
+@dataclass
+class UISnapshot:
+    ai_input: str
+    project_system_prompt: str
+    global_system_prompt: str
+    base_system_prompt: str
+    use_default_base_prompt: bool
+    temperature: float
+    top_p: float
+    max_tokens: int
+    auto_add_history: bool
+    disc_entries: list[dict]
+    files: list[dict]
+    screenshots: list[str]
+```
+
+### HistoryManager Integration
+
+```python
+class HistoryManager:
+    def push(self, state: Any, description: str) -> None: ...
+    def undo(self, current_state: Any, ...) -> Optional[HistoryEntry]: ...
+    def redo(self, current_state: Any, ...) -> Optional[HistoryEntry]: ...
+    def jump_to_undo(self, index: int, current_state: Any, ...) -> Optional[HistoryEntry]: ...
+```
+
+---
+
+## Aggregation Pipeline Integration
+
+The context curation features integrate with the aggregation pipeline:
+
+```python
+# src/aggregate.py
+def _build_file_item_context(self, f_item: FileItem, ...) -> str:
+    if f_item.ast_mask:
+        # Apply AST masking before aggregation
+        masked_content = self._apply_ast_mask(content, f_item.ast_mask)
+```
+
+### Mask Application Order
+
+1. Fetch file content
+2. Parse the AST if the file is C/C++
+3. Apply `ast_mask` per symbol
+4. Run through the aggregation strategy (full/agg/sig/def/hide)
+5. Return masked, aggregated content
+
+---
+
+## Testing
+
+### Unit Tests
+
+- `tests/test_fuzzy_anchor.py`: FuzzyAnchor.create_slice/resolve_slice
+- `tests/test_history_manager.py`: HistoryManager undo/redo/snapshot
+- `tests/test_ts_cpp_tools.py`: C++ skeleton/outline/definition tools
+- `tests/test_ast_parser.py`: ASTParser for Python/C/C++
+
+### Simulation Tests
+
+- `tests/test_phase6_simulation.py`: GUI integration tests
+  - Batch operations shift-click
+  - AST Inspector modal
+  - Slice editor
+
+### Full Suite
+
+```bash
+uv run pytest tests/test_fuzzy_anchor.py tests/test_history_manager.py \
+    tests/test_ts_cpp_tools.py tests/test_ast_parser.py \
+    tests/test_phase6_simulation.py -v
+```
diff --git a/docs/guide_tools.md b/docs/guide_tools.md
index 2246fa6..fbe5e59 100644
--- a/docs/guide_tools.md
+++ b/docs/guide_tools.md
@@ -88,6 +88,38 @@ These use `file_cache.ASTParser` (tree-sitter) or stdlib `ast` for structural co
 | `py_get_hierarchy` | `path`, `class_name` | Scans directory for subclasses of a given class. |
 | `py_get_docstring` | `path`, `name` | Extracts docstring for module, class, or function. |
 
+### C/C++ AST Tools
+
+These use `tree_sitter` via `src/mcp_client.py` for structural analysis of C and C++ codebases. Phase 6 added these tools to support the Granular AST Control feature.
+
+| Tool | Parameters | Description |
+|---|---|---|
+| `ts_c_get_skeleton` | `path` | C/C++ function signatures and struct definitions, bodies replaced with `...`. |
+| `ts_cpp_get_skeleton` | `path` | C++ class/struct signatures, method signatures, and inheritance info. |
+| `ts_c_get_code_outline` | `path` | Hierarchical C outline: `[Struct] Name (Lines X-Y)` with nested members. |
+| `ts_cpp_get_code_outline` | `path` | Hierarchical C++ outline with classes, methods, inheritance hierarchy. |
+| `ts_c_get_definition` | `path`, `name` | Full source of a specific C struct or function. |
+| `ts_cpp_get_definition` | `path`, `name` | Full source of a specific C++ class, struct, or method. Supports `ClassName::method` notation. |
+| `ts_c_update_definition` | `path`, `name`, `new_content` | Surgical replacement for C definitions. |
+| `ts_cpp_update_definition` | `path`, `name`, `new_content` | Surgical replacement for C++ definitions. |
+| `ts_c_get_signature` | `path`, `name` | Only the function/struct declaration line. |
+| `ts_cpp_get_signature` | `path`, `name` | Only the method/function declaration line. |
+
+**Usage for Context Curation:**
+
+```python
+# Fetch outline for AST inspection modal
+outline = mcp_client.ts_cpp_get_code_outline("path/to/file.hpp")
+
+# Fetch specific definition for masked inclusion
+defn = mcp_client.ts_cpp_get_definition("path/to/file.hpp", "MyClass::init")
+
+# Track a text slice across file edits via FuzzyAnchor
+from src.fuzzy_anchor import FuzzyAnchor
+slice_data = FuzzyAnchor.create_slice(content, start_line, end_line)
+resolved = FuzzyAnchor.resolve_slice(modified_content, slice_data)
+```
+
 ### Analysis Tools
 
 | Tool | Parameters | Description |