# Advanced Context Curation

[Top](../README.md) | [Architecture](guide_architecture.md) | [Tools & IPC](guide_tools.md) | [MMA](guide_mma.md) | [Simulations](guide_simulations.md)

---

## Overview

Phase 6 introduced three advanced context curation features that improve the granularity and resilience of file-based context management:

1. **Granular AST Control**: Per-symbol toggling between Definition, Signature, and Hidden states for C/C++ files
2. **Fuzzy Anchor Slices**: Text slice definitions that survive file modifications via anchor-based resolution
3. **Interactive AST Tree Masking**: GUI modal for inspecting and masking AST nodes

---

## Granular AST Control

### Purpose

For C/C++ files, inclusion is not a binary include/exclude choice. Each symbol (class, function, struct) can be set to one of three states: `def`, `sig`, or `hide`. The table lists these alongside the file-level modes `full` and `agg`, which correspond to the `force_full` and `auto_aggregate` flags on the file item:

| State | Description | Use Case |
|-------|-------------|----------|
| `full` | Include entire file content | Unknown structure, complex macros |
| `def` | Include function/class definitions only | Header inspection |
| `sig` | Include function signatures only | API surface review |
| `agg` | Auto-aggregate via summarization | Token budget management |
| `hide` | Exclude from context entirely | Irrelevant symbols |

### Implementation

The `ast_mask` dictionary on file items tracks per-symbol state:

```python
# src/gui_2.py:_render_context_composition_panel
if f_path.lower().endswith(('.c', '.cpp', '.h', '.hpp', '.cxx', '.cc')):
    if hasattr(f_item, 'ast_mask'):
        # Show AST state indicators
        pass
```

File items expose these properties:

- `force_full`: Override aggregation with full content
- `auto_aggregate`: Use summarization pipeline
- `ast_signatures`: Include signatures only
- `ast_definitions`: Include definitions only

### Data Structure

```python
from dataclasses import dataclass, field

@dataclass
class FileItem:
    path: str
    force_full: bool = False
    auto_aggregate: bool = False
    ast_signatures: bool = False
    ast_definitions: bool = False
    ast_mask: dict[str, str] = field(default_factory=dict)  # symbol_path -> state
```
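
As a usage illustration, the sketch below repeats the `FileItem` dataclass and sets per-symbol states through `ast_mask`. The `symbol_state` helper, the `'def'` fallback for symbols without an entry, and the `Class::method` key format are assumptions made for this example; the guide does not specify them.

```python
from dataclasses import dataclass, field


@dataclass
class FileItem:
    path: str
    force_full: bool = False
    auto_aggregate: bool = False
    ast_signatures: bool = False
    ast_definitions: bool = False
    ast_mask: dict[str, str] = field(default_factory=dict)  # symbol_path -> state


def symbol_state(item: FileItem, symbol_path: str, default: str = "def") -> str:
    """Hypothetical helper: look up a symbol's state, falling back to `default`."""
    return item.ast_mask.get(symbol_path, default)


item = FileItem(path="src/renderer.cpp")
item.ast_mask["Renderer"] = "sig"                # keep only the signature
item.ast_mask["Renderer::debug_dump"] = "hide"   # drop this symbol entirely

print(symbol_state(item, "Renderer"))        # sig
print(symbol_state(item, "Renderer::draw"))  # def (assumed fallback)
```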

---

## Fuzzy Anchor Slices

### Purpose

Text slices defined by line numbers become invalid when files are modified (lines inserted or deleted). Fuzzy Anchor slices use content hashing and anchor line matching to resolve the correct position after file changes.

### Algorithm

1. **Create Slice**: When the user defines a slice from `start_line` to `end_line`:
   - Capture a content hash of the region
   - Store surrounding context lines (before/after) as anchors

2. **Resolve Slice**: On file re-read after modification:
   - Search for the anchor content in the modified file
   - Calculate the offset from the anchor displacement
   - Return the new `start_line`, `end_line`
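
The creation step can be sketched as a standalone function. This is a minimal illustration, not the actual `FuzzyAnchor.create_slice`: the `context` parameter (number of anchor lines kept on each side) and the choice to hash the newline-joined region as UTF-8 are assumptions.

```python
import hashlib


def create_slice(text: str, start_line: int, end_line: int, context: int = 2) -> dict:
    """Capture a 1-based, inclusive line range plus its hash and anchor lines."""
    lines = text.splitlines()
    if not (1 <= start_line <= end_line <= len(lines)):
        raise ValueError("slice bounds fall outside the file")
    first = start_line - 1  # 0-based index of the first slice line
    region = lines[first:end_line]
    return {
        "start_line": start_line,
        "end_line": end_line,
        "content_hash": hashlib.sha256("\n".join(region).encode("utf-8")).hexdigest(),
        # Up to `context` lines immediately before and after the region.
        "start_context": lines[max(0, first - context):first],
        "end_context": lines[end_line:end_line + context],
    }


source = "\n".join(f"line {n}" for n in range(1, 11))
slice_data = create_slice(source, 4, 6)
print(slice_data["start_context"])  # ['line 2', 'line 3']
print(slice_data["end_context"])    # ['line 7', 'line 8']
```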

### Implementation

```python
# src/fuzzy_anchor.py
from typing import Optional, Tuple

class FuzzyAnchor:
    @classmethod
    def create_slice(cls, text: str, start_line: int, end_line: int) -> dict:
        """Returns slice_data with content_hash, anchor_lines, and positions."""

    @classmethod
    def resolve_slice(cls, text: str, slice_data: dict) -> Optional[Tuple[int, int]]:
        """Resolves slice position in modified text, returns (start, end) or None."""
```

### Slice Data Structure

```python
{
    "start_line": 10,             # 1-based original line
    "end_line": 25,               # 1-based original line
    "content_hash": "abc123...",  # SHA256 of region content
    "start_context": [...],       # Lines before start for anchor matching
    "end_context": [...]          # Lines after end for anchor matching
}
```

### Anchor Matching Strategy

- **Exact match**: If the anchors are found at their original positions, return the original line range
- **Shift detection**: If the anchors have shifted, calculate the delta and apply it to the slice bounds
- **Mismatch**: If the anchors are not found, return `None` (the slice definition is invalid)
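
The three outcomes can be illustrated with a simplified resolver. This sketch is an assumption-laden reduction, not the real `FuzzyAnchor.resolve_slice`: it matches only the leading anchor block (`start_context`), takes its first occurrence, and ignores `content_hash` and `end_context`. The `find_block` helper is hypothetical.

```python
from typing import Optional, Tuple


def find_block(lines: list[str], block: list[str]) -> Optional[int]:
    """Return the 0-based index where `block` first occurs in `lines`, else None."""
    if not block:
        return None
    for i in range(len(lines) - len(block) + 1):
        if lines[i:i + len(block)] == block:
            return i
    return None


def resolve_slice(text: str, slice_data: dict) -> Optional[Tuple[int, int]]:
    """Map a stored slice onto `text`; returns 1-based (start, end) or None."""
    lines = text.splitlines()
    before = slice_data["start_context"]
    pos = find_block(lines, before)
    if pos is None:
        return None  # mismatch: the anchor lines are no longer present
    # The slice originally began right after the leading anchor block.
    new_start = pos + len(before) + 1
    delta = new_start - slice_data["start_line"]  # 0 means exact match
    new_end = slice_data["end_line"] + delta
    if new_end > len(lines):
        return None
    return new_start, new_end


stored = {"start_line": 3, "end_line": 4, "start_context": ["b"], "end_context": ["e"]}
print(resolve_slice("a\nb\nc\nd\ne", stored))       # (3, 4): anchors unmoved
print(resolve_slice("new\na\nb\nc\nd\ne", stored))  # (4, 5): shifted down by one
print(resolve_slice("x\ny\nz", stored))             # None: anchor missing
```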

---

## Interactive AST Tree Masking

### Purpose

The AST Inspector modal allows visual inspection of a file's parsed structure and per-symbol state control.

### Modal Flow

1. The user right-clicks a C/C++ file in the Context Panel
2. The user selects "Inspect AST" from the context menu
3. A modal opens showing a hierarchical tree of all symbols
4. Per-symbol radio buttons (Def/Sig/Hide) control state
5. Changes persist to the `ast_mask` dictionary

### Implementation

```python
# src/gui_2.py:_render_ast_inspector_modal
def _render_ast_inspector_modal(self) -> None:
    expanded, opened = imgui.begin_popup_modal('AST Inspector', True, ...)
    if expanded:
        # Fetch outline via tree-sitter MCP tools
        outline = mcp_client.ts_cpp_get_code_outline(f_path)

        # Parse into hierarchical node list
        for node in parsed_nodes:
            # Render [Kind] Name with radio buttons
            if imgui.radio_button("Def", current_mode == 'def'):
                f_item.ast_mask[full_path] = 'def'
```

### Node Display Format

```
[Struct] MyClass (Lines 10-50)
  [Field] member1 (Lines 12-14)
  [Method] init (Lines 20-30)
```

Radio buttons per node:

- **Def**: Include this symbol's definition
- **Sig**: Include this symbol's signature only
- **Hide**: Exclude this symbol entirely
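
To show how the display rows and the mask keys could relate, the sketch below flattens a small symbol tree into `[Kind] Name (Lines a-b)` rows. The `OutlineNode` type, its field names, the `::`-joined path used as the `ast_mask` key, and the `'def'` default are all assumptions for illustration; the actual outline format returned by the tree-sitter tools is not shown in this guide.

```python
from dataclasses import dataclass, field


@dataclass
class OutlineNode:
    """Hypothetical outline node; field names are assumptions for this sketch."""
    kind: str
    name: str
    start_line: int
    end_line: int
    children: list["OutlineNode"] = field(default_factory=list)


def render_rows(node: OutlineNode, mask: dict[str, str],
                parent: str = "", depth: int = 0) -> list[str]:
    """Flatten the tree into display rows, tagging each with its mask state."""
    full_path = f"{parent}::{node.name}" if parent else node.name
    state = mask.get(full_path, "def")  # assumed default for unmasked symbols
    label = f"[{node.kind}] {node.name} (Lines {node.start_line}-{node.end_line})"
    rows = [f"{'  ' * depth}{label} <{state}>"]
    for child in node.children:
        rows.extend(render_rows(child, mask, full_path, depth + 1))
    return rows


tree = OutlineNode("Struct", "MyClass", 10, 50, [
    OutlineNode("Field", "member1", 12, 14),
    OutlineNode("Method", "init", 20, 30),
])
rows = render_rows(tree, {"MyClass::init": "sig", "MyClass::member1": "hide"})
for row in rows:
    print(row)
```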

---

## Batch Operations

### Shift-Click Range Selection

The Context Panel supports Shift-Click for range selection:

```python
# src/gui_2.py:_render_context_composition_panel
if changed_sel:
    if imgui.get_io().key_shift and self._last_selected_context_index != -1:
        start = min(self._last_selected_context_index, i)
        end = max(self._last_selected_context_index, i)
        for idx in range(start, end + 1):
            # Toggle selection state for range
            pass
```
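
The range logic can be exercised without a GUI by isolating it in a pure function. This is a sketch under assumptions: `select_range` is a hypothetical helper, a plain click toggles a single row, and a Shift-Click marks every row in the inclusive range as selected, which is only one reading of the "toggle" comment in the excerpt.

```python
def select_range(selected: set[int], last_index: int, clicked: int,
                 shift: bool) -> tuple[set[int], int]:
    """Return the updated selection and the new 'last selected' index."""
    updated = set(selected)
    if shift and last_index != -1:
        start, end = min(last_index, clicked), max(last_index, clicked)
        updated.update(range(start, end + 1))  # inclusive range
    else:
        # Plain click: toggle just the clicked row.
        updated.symmetric_difference_update({clicked})
    return updated, clicked


selection, last = select_range(set(), -1, 2, shift=False)       # click row 2
selection, last = select_range(selection, last, 5, shift=True)  # shift-click row 5
print(sorted(selection))  # [2, 3, 4, 5]
```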

### Batch Action Bar

Batch operations apply to all selected files:

| Button | Action |
|--------|--------|
| Full | Set `force_full=True` for all selected |
| Agg | Set `auto_aggregate=True` for all selected |
| Sig | Set `ast_signatures=True` for all selected |
| Def | Set `ast_definitions=True` for all selected |
| Remove | Remove selected files from context |
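
A minimal sketch of how the buttons might map onto file item flags follows. The `apply_batch` function and `BUTTON_FLAGS` table are hypothetical, and the sketch only sets the named flag; whether the real action bar also clears the other flags is not stated here.

```python
from dataclasses import dataclass


@dataclass
class FileItem:
    path: str
    force_full: bool = False
    auto_aggregate: bool = False
    ast_signatures: bool = False
    ast_definitions: bool = False


# Button label -> FileItem flag, following the batch action table.
BUTTON_FLAGS = {
    "Full": "force_full",
    "Agg": "auto_aggregate",
    "Sig": "ast_signatures",
    "Def": "ast_definitions",
}


def apply_batch(files: list[FileItem], selected: set[int], button: str) -> list[FileItem]:
    """Apply `button` to the selected indices and return the resulting file list."""
    if button == "Remove":
        return [f for i, f in enumerate(files) if i not in selected]
    flag = BUTTON_FLAGS[button]
    for i in selected:
        setattr(files[i], flag, True)
    return files


files = [FileItem("a.cpp"), FileItem("b.h"), FileItem("c.cpp")]
files = apply_batch(files, {0, 2}, "Sig")
files = apply_batch(files, {1}, "Remove")
print([(f.path, f.ast_signatures) for f in files])  # [('a.cpp', True), ('c.cpp', True)]
```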

---

## Context Snapshotting (Per-Take)

### Purpose

When switching between discussion "takes", the context panel state is snapshotted and restored.

### UISnapshot Structure

```python
from dataclasses import dataclass

@dataclass
class UISnapshot:
    ai_input: str
    project_system_prompt: str
    global_system_prompt: str
    base_system_prompt: str
    use_default_base_prompt: bool
    temperature: float
    top_p: float
    max_tokens: int
    auto_add_history: bool
    disc_entries: list[dict]
    files: list[dict]
    screenshots: list[str]
```

### HistoryManager Integration

```python
# Signatures are abbreviated; `...` in a parameter list marks elided arguments.
class HistoryManager:
    def push(self, state: Any, description: str) -> None: ...
    def undo(self, current_state: Any, ...) -> Optional[HistoryEntry]: ...
    def redo(self, current_state: Any, ...) -> Optional[HistoryEntry]: ...
    def jump_to_undo(self, index: int, current_state: Any, ...) -> Optional[HistoryEntry]: ...
```
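
The shape of these signatures suggests a two-stack undo/redo design in which the caller passes in the current state so it can be saved for the opposite operation. The sketch below illustrates that idea only; `MiniHistory` and this `HistoryEntry` definition are hypothetical stand-ins, and the real `HistoryManager` takes additional parameters that are elided in the excerpt.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class HistoryEntry:
    state: Any
    description: str


class MiniHistory:
    """Hypothetical stand-in for HistoryManager, reduced to two stacks."""

    def __init__(self) -> None:
        self._undo: list[HistoryEntry] = []
        self._redo: list[HistoryEntry] = []

    def push(self, state: Any, description: str) -> None:
        self._undo.append(HistoryEntry(state, description))
        self._redo.clear()  # a new action invalidates the redo branch

    def undo(self, current_state: Any) -> Optional[HistoryEntry]:
        if not self._undo:
            return None
        entry = self._undo.pop()
        # Keep the state being left so redo can return to it.
        self._redo.append(HistoryEntry(current_state, entry.description))
        return entry

    def redo(self, current_state: Any) -> Optional[HistoryEntry]:
        if not self._redo:
            return None
        entry = self._redo.pop()
        self._undo.append(HistoryEntry(current_state, entry.description))
        return entry


history = MiniHistory()
history.push({"files": ["a.cpp"]}, "add b.h")  # state before the change
current = {"files": ["a.cpp", "b.h"]}
previous = history.undo(current)
restored = history.redo(previous.state)
print(previous.state)  # {'files': ['a.cpp']}
print(restored.state)  # {'files': ['a.cpp', 'b.h']}
```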

---

## Aggregation Pipeline Integration

The context curation features integrate with the aggregation pipeline:

```python
# src/aggregate.py
def _build_file_item_context(self, f_item: FileItem, ...) -> str:
    if f_item.ast_mask:
        # Apply AST masking before aggregation
        masked_content = self._apply_ast_mask(content, f_item.ast_mask)
```

### Mask Application Order

1. Fetch file content
2. Parse AST if C/C++ file
3. Apply `ast_mask` per symbol
4. Run through aggregation strategy (full/agg/sig/def/hide)
5. Return masked, aggregated content
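
The ordering can be illustrated in miniature. In this sketch the parser is replaced by a hand-built list of hypothetical `Symbol` records, the aggregation strategy in step 4 is omitted, and unmasked symbols are assumed to keep their full definition. None of these names come from `src/aggregate.py`.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    """Hypothetical parsed symbol; a real parser would supply these spans."""
    path: str
    signature: str
    definition: str


def apply_ast_mask(symbols: list[Symbol], ast_mask: dict[str, str]) -> str:
    """Emit each symbol according to its mask state ('def', 'sig', or 'hide')."""
    parts = []
    for sym in symbols:
        state = ast_mask.get(sym.path, "def")  # assumed default: keep definition
        if state == "hide":
            continue
        parts.append(sym.signature if state == "sig" else sym.definition)
    return "\n".join(parts)


def build_context(path: str, symbols: list[Symbol], ast_mask: dict[str, str]) -> str:
    """Mask C/C++ files that have a mask; pass everything else through unmasked."""
    is_cpp = path.lower().endswith((".c", ".cpp", ".h", ".hpp", ".cxx", ".cc"))
    if is_cpp and ast_mask:
        return apply_ast_mask(symbols, ast_mask)
    return "\n".join(sym.definition for sym in symbols)


symbols = [
    Symbol("add", "int add(int a, int b);", "int add(int a, int b) { return a + b; }"),
    Symbol("log", "void log();", "void log() { /* ... */ }"),
]
masked = build_context("math.cpp", symbols, {"add": "sig", "log": "hide"})
print(masked)  # int add(int a, int b);
```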

---

## Testing

### Unit Tests

- `tests/test_fuzzy_anchor.py`: `FuzzyAnchor.create_slice`/`resolve_slice`
- `tests/test_history_manager.py`: `HistoryManager` undo/redo/snapshot
- `tests/test_ts_cpp_tools.py`: C++ skeleton/outline/definition tools
- `tests/test_ast_parser.py`: `ASTParser` for Python/C/C++

### Simulation Tests

- `tests/test_phase6_simulation.py`: GUI integration tests
  - Batch operations shift-click
  - AST Inspector modal
  - Slice editor

### Full Suite

```bash
uv run pytest tests/test_fuzzy_anchor.py tests/test_history_manager.py \
  tests/test_ts_cpp_tools.py tests/test_ast_parser.py \
  tests/test_phase6_simulation.py -v
```