docs(phase6): add Advanced Context Curation guide and C/C++ AST tools

2026-05-10 15:48:21 -04:00
parent 760054bb4d
commit 05a11978ef
2 changed files with 305 additions and 0 deletions
@@ -0,0 +1,273 @@
# Advanced Context Curation
[Top](../README.md) | [Architecture](guide_architecture.md) | [Tools & IPC](guide_tools.md) | [MMA](guide_mma.md) | [Simulations](guide_simulations.md)
---
## Overview
Phase 6 introduced three advanced context curation features that enhance the granularity and resilience of file-based context management:
1. **Granular AST Control** — Per-symbol toggling between Definition, Signature, and Hidden states for C/C++ files
2. **Fuzzy Anchor Slices** — Text slice definitions that survive file modifications via anchor-based resolution
3. **Interactive AST Tree Masking** — GUI modal for inspecting and masking AST nodes
---
## Granular AST Control
### Purpose
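For C/C++ files, inclusion is not a binary include/exclude choice. Each symbol (class, function, struct) can be set to one of three states: `def`, `sig`, or `hide`. The table below also lists the two file-level modes, `full` and `agg`, for comparison: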
| State | Description | Use Case |
|-------|-------------|-----------|
| `full` | Include entire file content | Unknown structure, complex macros |
| `def` | Include function/class definitions only | Header inspection |
| `sig` | Include function signatures only | API surface review |
| `agg` | Auto-aggregate via summarization | Token budget management |
| `hide` | Exclude from context entirely | Irrelevant symbols |
### Implementation
The `ast_mask` dictionary on file items tracks per-symbol state:
```python
# src/gui_2.py:_render_context_composition_panel
if f_path.lower().endswith(('.c', '.cpp', '.h', '.hpp', '.cxx', '.cc')):
    if hasattr(f_item, 'ast_mask'):
        # Show AST state indicators
        pass
```
File items expose these properties:
- `force_full`: Override aggregation with full content
- `auto_aggregate`: Use summarization pipeline
- `ast_signatures`: Include signatures only
- `ast_definitions`: Include definitions only
### Data Structure
```python
from dataclasses import dataclass, field

@dataclass
class FileItem:
    path: str
    force_full: bool = False
    auto_aggregate: bool = False
    ast_signatures: bool = False
    ast_definitions: bool = False
    ast_mask: dict[str, str] = field(default_factory=dict)  # symbol_path -> state
```
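
A minimal sketch of how the per-symbol mask might be queried. The helper `symbol_state`, the `VALID_STATES` set, and the fallback to `def` for unlisted symbols are assumptions made for illustration; this guide does not state what the default is.

```python
from dataclasses import dataclass, field

VALID_STATES = {"def", "sig", "hide"}  # per-symbol states described in this guide


@dataclass
class FileItem:
    path: str
    force_full: bool = False
    auto_aggregate: bool = False
    ast_signatures: bool = False
    ast_definitions: bool = False
    ast_mask: dict[str, str] = field(default_factory=dict)  # symbol_path -> state


def symbol_state(item: FileItem, symbol_path: str, default: str = "def") -> str:
    """Hypothetical helper: look up a symbol's state, falling back to a default."""
    state = item.ast_mask.get(symbol_path, default)
    if state not in VALID_STATES:
        raise ValueError(f"unknown AST state {state!r} for {symbol_path}")
    return state


item = FileItem(path="src/widget.hpp")
item.ast_mask["Widget::init"] = "sig"
item.ast_mask["Widget::debug_dump"] = "hide"

print(symbol_state(item, "Widget::init"))  # explicitly masked symbol
print(symbol_state(item, "Widget::draw"))  # unlisted symbol uses the assumed default
```

Keeping the mask as a sparse dictionary means only symbols the user has touched need to be stored.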
---
## Fuzzy Anchor Slices
### Purpose
Text slices defined by line numbers become invalid when files are modified (lines inserted/deleted). Fuzzy Anchor slices use content hashing and anchor line matching to resolve the correct position after file changes.
### Algorithm
1. **Create Slice**: When the user defines a slice from `start_line` to `end_line`:
   - Capture a content hash of the region
   - Store the surrounding context lines (before/after) as anchors
2. **Resolve Slice**: When the file is re-read after modification:
   - Search for the anchor content in the modified file
   - Calculate the offset from the anchor displacement
   - Return the new `start_line`, `end_line`
### Implementation
```python
# src/fuzzy_anchor.py
from typing import Optional, Tuple

class FuzzyAnchor:
    @classmethod
    def create_slice(cls, text: str, start_line: int, end_line: int) -> dict:
        """Returns slice_data with content_hash, anchor_lines, and positions."""
    @classmethod
    def resolve_slice(cls, text: str, slice_data: dict) -> Optional[Tuple[int, int]]:
        """Resolves slice position in modified text, returns (start, end) or None."""
```
### Slice Data Structure
```python
{
"start_line": 10, # 1-based original line
"end_line": 25, # 1-based original line
"content_hash": "abc123...", # SHA256 of region content
"start_context": [...], # Lines before start for anchor matching
"end_context": [...] # Lines after end for anchor matching
}
```
### Anchor Matching Strategy
- **Exact match**: If the anchors are found at their original positions, return the original line range
- **Shift detection**: If the anchors have moved, calculate the delta and apply it to the slice bounds
- **Mismatch**: If the anchors cannot be found, return `None` (the slice definition is no longer valid)
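
The three cases above can be sketched as follows. This is not the contents of `src/fuzzy_anchor.py`: it uses free functions instead of classmethods, and `CONTEXT_LINES`, `_hash_lines`, and `_find_block` are hypothetical names. It also assumes that a candidate position is accepted only when the region's SHA-256 still matches, handles only uniform shifts (edits inside the slice resolve to `None`), and stores `end_context` without using it during resolution.

```python
import hashlib
from typing import Optional, Tuple

CONTEXT_LINES = 2  # assumed anchor size; the real value is not given in this guide


def _hash_lines(lines: list[str]) -> str:
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


def create_slice(text: str, start_line: int, end_line: int) -> dict:
    """Capture a 1-based, inclusive line range plus its surrounding anchors."""
    lines = text.splitlines()
    first = start_line - 1  # 0-based index of the first slice line
    return {
        "start_line": start_line,
        "end_line": end_line,
        "content_hash": _hash_lines(lines[first:end_line]),
        "start_context": lines[max(0, first - CONTEXT_LINES):first],
        "end_context": lines[end_line:end_line + CONTEXT_LINES],
    }


def _find_block(lines: list[str], block: list[str]) -> list[int]:
    """Return every 0-based index at which `block` appears as consecutive lines."""
    size = len(block)
    return [i for i in range(len(lines) - size + 1) if lines[i:i + size] == block]


def resolve_slice(text: str, slice_data: dict) -> Optional[Tuple[int, int]]:
    """Re-locate the slice by anchor displacement; None if no candidate verifies."""
    lines = text.splitlines()
    start, end = slice_data["start_line"], slice_data["end_line"]
    before = slice_data["start_context"]
    original_anchor = start - 1 - len(before)  # where the anchor used to begin
    candidates = _find_block(lines, before)
    # Prefer the candidate closest to the original position (delta 0 = exact match).
    for found in sorted(candidates, key=lambda i: abs(i - original_anchor)):
        delta = found - original_anchor
        new_start, new_end = start + delta, end + delta
        if new_end > len(lines):
            continue
        if _hash_lines(lines[new_start - 1:new_end]) == slice_data["content_hash"]:
            return new_start, new_end
    return None


original = "a\nb\nc\nd\ne\nf\ng"
data = create_slice(original, 4, 5)               # slice covers lines "d" and "e"
shifted = resolve_slice("x\ny\n" + original, data)  # two lines inserted at the top
print(shifted)
```

Sorting candidates by distance from the original position keeps the exact-match case cheap and gives a deterministic choice when the anchor text occurs more than once.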
---
## Interactive AST Tree Masking
### Purpose
The AST Inspector modal allows visual inspection of a file's parsed structure and per-symbol state control.
### Modal Flow
1. User right-clicks a C/C++ file in Context Panel
2. Selects "Inspect AST" from context menu
3. Modal opens showing hierarchical tree of all symbols
4. Per-symbol radio buttons (Def/Sig/Hide) control state
5. Changes persist to `ast_mask` dictionary
### Implementation
```python
# src/gui_2.py:_render_ast_inspector_modal
def _render_ast_inspector_modal(self) -> None:
    expanded, opened = imgui.begin_popup_modal('AST Inspector', True, ...)
    if expanded:
        # Fetch outline via tree-sitter MCP tools
        outline = mcp_client.ts_cpp_get_code_outline(f_path)
        # Parse into hierarchical node list
        for node in parsed_nodes:
            # Render [Kind] Name with radio buttons
            if imgui.radio_button("Def", current_mode == 'def'):
                f_item.ast_mask[full_path] = 'def'
```
### Node Display Format
```
[Struct] MyClass (Lines 10-50)
  [Field] member1 (Lines 12-14)
  [Method] init (Lines 20-30)
```
Radio buttons per node:
- **Def**: Include this symbol's definition
- **Sig**: Include this symbol's signature only
- **Hide**: Exclude this symbol entirely
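
As an illustration of the "parse into hierarchical node list" step, the sketch below turns outline text in the display format above into flat nodes carrying a `::`-joined symbol path. The `OutlineNode` type, the `parse_outline` function, and the assumption that nesting is expressed as two spaces of indentation per level are all hypothetical; the actual output format of `ts_cpp_get_code_outline` may differ.

```python
import re
from dataclasses import dataclass

# Matches lines such as "  [Method] init (Lines 20-30)".
LINE_RE = re.compile(
    r"^(?P<indent>\s*)\[(?P<kind>\w+)\]\s+(?P<name>\S+)\s+"
    r"\(Lines (?P<start>\d+)-(?P<end>\d+)\)\s*$"
)


@dataclass
class OutlineNode:
    kind: str
    name: str
    full_path: str  # e.g. "MyClass::init", usable as an ast_mask key
    start_line: int
    end_line: int
    depth: int


def parse_outline(outline: str, indent_width: int = 2) -> list[OutlineNode]:
    nodes: list[OutlineNode] = []
    stack: list[str] = []  # names of the current line's ancestors
    for raw in outline.splitlines():
        m = LINE_RE.match(raw)
        if m is None:
            continue  # skip lines that are not outline entries
        depth = len(m["indent"]) // indent_width
        del stack[depth:]  # drop siblings and deeper levels
        stack.append(m["name"])
        nodes.append(OutlineNode(m["kind"], m["name"], "::".join(stack),
                                 int(m["start"]), int(m["end"]), depth))
    return nodes


outline = (
    "[Struct] MyClass (Lines 10-50)\n"
    "  [Field] member1 (Lines 12-14)\n"
    "  [Method] init (Lines 20-30)\n"
    "[Function] main (Lines 60-70)"
)
for node in parse_outline(outline):
    print(f"{node.full_path}: {node.kind} {node.start_line}-{node.end_line}")
```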
---
## Batch Operations
### Shift-Click Range Selection
The Context Panel supports Shift-Click for range selection:
```python
# src/gui_2.py:_render_context_composition_panel
if changed_sel:
    if imgui.get_io().key_shift and self._last_selected_context_index != -1:
        start = min(self._last_selected_context_index, i)
        end = max(self._last_selected_context_index, i)
        for idx in range(start, end + 1):
            # Toggle selection state for range
            pass
```
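
The range logic can be isolated from the GUI as a pure function, which makes it easy to test. The sketch below is an assumption about the intended behaviour, not the code in `src/gui_2.py`: a Shift-Click selects every row between the last clicked row and the current one, while a plain click toggles a single row. The name `apply_click` is hypothetical.

```python
def apply_click(selected: set[int], last_index: int, clicked: int,
                shift: bool) -> tuple[set[int], int]:
    """Return the new selection and the new 'last selected' index."""
    if shift and last_index != -1:
        start, end = min(last_index, clicked), max(last_index, clicked)
        # Range selection: add every row between the two clicks, inclusive.
        return selected | set(range(start, end + 1)), clicked
    # Plain click: toggle only the clicked row.
    return selected ^ {clicked}, clicked


selection, last = apply_click(set(), -1, 2, shift=False)      # click row 2
selection, last = apply_click(selection, last, 5, shift=True)  # Shift-Click row 5
print(sorted(selection))
```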
### Batch Action Bar
Batch operations apply to all selected files:
| Button | Action |
|--------|--------|
| Full | Set `force_full=True` for all selected |
| Agg | Set `auto_aggregate=True` for all selected |
| Sig | Set `ast_signatures=True` for all selected |
| Def | Set `ast_definitions=True` for all selected |
| Remove | Remove selected files from context |
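
A sketch of how the action bar might map buttons to file-item flags. The `BATCH_FLAGS` table and `apply_batch` function are hypothetical, and the sketch leaves the other flags untouched when one is set, since this guide does not say whether the modes are mutually exclusive.

```python
from dataclasses import dataclass


@dataclass
class FileItem:
    path: str
    force_full: bool = False
    auto_aggregate: bool = False
    ast_signatures: bool = False
    ast_definitions: bool = False


# Button label -> FileItem attribute, following the table above.
BATCH_FLAGS = {
    "Full": "force_full",
    "Agg": "auto_aggregate",
    "Sig": "ast_signatures",
    "Def": "ast_definitions",
}


def apply_batch(files: list[FileItem], selected: set[int], button: str) -> list[FileItem]:
    """Apply one batch action to the selected indices and return the file list."""
    if button == "Remove":
        return [f for i, f in enumerate(files) if i not in selected]
    flag = BATCH_FLAGS[button]  # raises KeyError for an unknown button
    for i in selected:
        setattr(files[i], flag, True)
    return files


files = [FileItem("a.c"), FileItem("b.c"), FileItem("c.c")]
files = apply_batch(files, {0, 2}, "Sig")
files = apply_batch(files, {1}, "Remove")
print([(f.path, f.ast_signatures) for f in files])
```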
---
## Context Snapshotting (Per-Take)
### Purpose
When switching between discussion "takes", the context panel state is snapshotted and restored.
### UISnapshot Structure
```python
from dataclasses import dataclass

@dataclass
class UISnapshot:
    ai_input: str
    project_system_prompt: str
    global_system_prompt: str
    base_system_prompt: str
    use_default_base_prompt: bool
    temperature: float
    top_p: float
    max_tokens: int
    auto_add_history: bool
    disc_entries: list[dict]
    files: list[dict]
    screenshots: list[str]
```
### HistoryManager Integration
```python
class HistoryManager:
    def push(self, state: Any, description: str) -> None: ...
    def undo(self, current_state: Any, ...) -> Optional[HistoryEntry]: ...
    def redo(self, current_state: Any, ...) -> Optional[HistoryEntry]: ...
    def jump_to_undo(self, index: int, current_state: Any, ...) -> Optional[HistoryEntry]: ...
```
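
The undo/redo contract can be illustrated with a conventional two-stack design. This is a simplified stand-in, not the project's `HistoryManager`: the class name `SimpleHistory`, the fields of `HistoryEntry`, and the reduced signatures (the elided parameters are omitted, as is `jump_to_undo`) are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class HistoryEntry:
    state: Any
    description: str


class SimpleHistory:
    """Two-stack undo/redo: each entry holds the state to restore."""

    def __init__(self) -> None:
        self._undo: list[HistoryEntry] = []
        self._redo: list[HistoryEntry] = []

    def push(self, state: Any, description: str) -> None:
        # Record the state as it was *before* a change; a new change
        # invalidates anything that could previously have been redone.
        self._undo.append(HistoryEntry(state, description))
        self._redo.clear()

    def undo(self, current_state: Any) -> Optional[HistoryEntry]:
        if not self._undo:
            return None
        entry = self._undo.pop()
        self._redo.append(HistoryEntry(current_state, entry.description))
        return entry

    def redo(self, current_state: Any) -> Optional[HistoryEntry]:
        if not self._redo:
            return None
        entry = self._redo.pop()
        self._undo.append(HistoryEntry(current_state, entry.description))
        return entry


history = SimpleHistory()
history.push({"temperature": 0.7}, "change temperature")  # state before the edit
restored = history.undo({"temperature": 0.2})             # pass the current state
print(restored.state)
```

Passing `current_state` into `undo` and `redo` lets the manager store the state being left behind, so the caller does not need to push it separately.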
---
## Aggregation Pipeline Integration
The context curation features integrate with the aggregation pipeline:
```python
# src/aggregate.py
def _build_file_item_context(self, f_item: FileItem, ...) -> str:
    if f_item.ast_mask:
        # Apply AST masking before aggregation
        masked_content = self._apply_ast_mask(content, f_item.ast_mask)
```
### Mask Application Order
1. Fetch file content
2. Parse AST if C/C++ file
3. Apply `ast_mask` per symbol
4. Run through aggregation strategy (full/agg/sig/def/hide)
5. Return masked, aggregated content
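
Step 3 can be sketched as a line filter, assuming the AST parse has already produced `(symbol_path, start_line, end_line)` tuples for non-overlapping symbols. The function below is not the project's `_apply_ast_mask`; in particular, treating the first line of a symbol as its signature is a crude stand-in for real signature extraction, and defaulting unlisted symbols to `def` is an assumption.

```python
def apply_ast_mask(content: str,
                   symbols: list[tuple[str, int, int]],
                   ast_mask: dict[str, str]) -> str:
    """Drop lines according to per-symbol state; line numbers are 1-based, inclusive."""
    dropped: set[int] = set()
    for path, start, end in symbols:
        state = ast_mask.get(path, "def")  # assumed default: keep the definition
        if state == "hide":
            dropped.update(range(start, end + 1))      # remove the whole symbol
        elif state == "sig":
            dropped.update(range(start + 1, end + 1))  # keep only the first line
    kept = [line for number, line in enumerate(content.splitlines(), start=1)
            if number not in dropped]
    return "\n".join(kept)


source = "\n".join([
    "int add(int a, int b) {",   # line 1
    "  return a + b;",           # line 2
    "}",                         # line 3
    "void log_debug(void) {",    # line 4
    "  trace();",                # line 5
    "}",                         # line 6
    "int main(void) {",          # line 7
    "  return add(1, 2);",       # line 8
    "}",                         # line 9
])
symbols = [("add", 1, 3), ("log_debug", 4, 6), ("main", 7, 9)]
masked = apply_ast_mask(source, symbols, {"add": "sig", "log_debug": "hide"})
print(masked)
```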
---
## Testing
### Unit Tests
- `tests/test_fuzzy_anchor.py` — FuzzyAnchor.create_slice/resolve_slice
- `tests/test_history_manager.py` — HistoryManager undo/redo/snapshot
- `tests/test_ts_cpp_tools.py` — C++ skeleton/outline/definition tools
- `tests/test_ast_parser.py` — ASTParser for Python/C/C++
### Simulation Tests
- `tests/test_phase6_simulation.py` — GUI integration tests
  - Batch operations shift-click
  - AST Inspector modal
  - Slice editor
### Full Suite
```bash
uv run pytest tests/test_fuzzy_anchor.py tests/test_history_manager.py \
  tests/test_ts_cpp_tools.py tests/test_ast_parser.py \
  tests/test_phase6_simulation.py -v
```
@@ -88,6 +88,38 @@ These use `file_cache.ASTParser` (tree-sitter) or stdlib `ast` for structural co
| `py_get_hierarchy` | `path`, `class_name` | Scans directory for subclasses of a given class. |
| `py_get_docstring` | `path`, `name` | Extracts docstring for module, class, or function. |
### C/C++ AST Tools
These use `tree_sitter` via `src/mcp_client.py` for structural analysis of C and C++ codebases. Phase 6 added these tools to support the Granular AST Control feature.
| Tool | Parameters | Description |
|---|---|---|
| `ts_c_get_skeleton` | `path` | C/C++ function signatures and struct definitions, bodies replaced with `...`. |
| `ts_cpp_get_skeleton` | `path` | C++ class/struct signatures, method signatures, and inheritance info. |
| `ts_c_get_code_outline` | `path` | Hierarchical C outline: `[Struct] Name (Lines X-Y)` with nested members. |
| `ts_cpp_get_code_outline` | `path` | Hierarchical C++ outline with classes, methods, inheritance hierarchy. |
| `ts_c_get_definition` | `path`, `name` | Full source of a specific C struct or function. |
| `ts_cpp_get_definition` | `path`, `name` | Full source of a specific C++ class, struct, or method. Supports `ClassName::method` notation. |
| `ts_c_update_definition` | `path`, `name`, `new_content` | Surgical replacement for C definitions. |
| `ts_cpp_update_definition` | `path`, `name`, `new_content` | Surgical replacement for C++ definitions. |
| `ts_c_get_signature` | `path`, `name` | Only the function/struct declaration line. |
| `ts_cpp_get_signature` | `path`, `name` | Only the method/function declaration line. |
**Usage for Context Curation:**
```python
# Fetch outline for AST inspection modal
outline = mcp_client.ts_cpp_get_code_outline("path/to/file.hpp")
# Fetch specific definition for masked inclusion
defn = mcp_client.ts_cpp_get_definition("path/to/file.hpp", "MyClass::init")
# Track a line-range slice across file edits via FuzzyAnchor (separate from per-symbol masking)
from src.fuzzy_anchor import FuzzyAnchor
slice_data = FuzzyAnchor.create_slice(content, start_line, end_line)
resolved = FuzzyAnchor.resolve_slice(modified_content, slice_data)
```
### Analysis Tools
| Tool | Parameters | Description |