Advanced Context Curation


Overview

Phase 6 introduced three advanced context curation features that give finer control over which parts of a file enter the context and keep selections valid as files change:

  1. Granular AST Control — Per-symbol toggling between Definition, Signature, and Hidden states for C/C++ files
  2. Fuzzy Anchor Slices — Text slice definitions that survive file modifications via anchor-based resolution
  3. Interactive AST Tree Masking — GUI modal for inspecting and masking AST nodes

Granular AST Control

Purpose

For C/C++ files, instead of a binary "include/exclude" choice, each symbol (class, function, struct) can be set to one of three states: def, sig, or hide. The table also lists full and agg, which apply to the whole file rather than to a single symbol:

| State | Description | Use Case |
|-------|-------------|----------|
| full | Include entire file content | Unknown structure, complex macros |
| def | Include function/class definitions only | Header inspection |
| sig | Include function signatures only | API surface review |
| agg | Auto-aggregate via summarization | Token budget management |
| hide | Exclude from context entirely | Irrelevant symbols |

Implementation

The ast_mask dictionary on file items tracks per-symbol state:

# src/gui_2.py:_render_context_composition_panel
if f_path.lower().endswith(('.c', '.cpp', '.h', '.hpp', '.cxx', '.cc')):
    if hasattr(f_item, 'ast_mask'):
        # Show AST state indicators
        pass

File items expose these properties:

  • force_full: Override aggregation with full content
  • auto_aggregate: Use summarization pipeline
  • ast_signatures: Include signatures only
  • ast_definitions: Include definitions only

Data Structure

from dataclasses import dataclass, field

@dataclass
class FileItem:
    path: str
    force_full: bool = False
    auto_aggregate: bool = False
    ast_signatures: bool = False
    ast_definitions: bool = False
    ast_mask: dict[str, str] = field(default_factory=dict)  # symbol_path -> state
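As a rough illustration of how the per-symbol mask might be populated, the sketch below repeats the FileItem fields shown above. The helpers set_symbol_state and effective_state, the symbol paths, and the fallback to 'def' for symbols missing from the mask are all assumptions made for the example, not documented behavior.

```python
from dataclasses import dataclass, field

# Per-symbol states taken from the guide; anything else is rejected.
VALID_SYMBOL_STATES = {"def", "sig", "hide"}


@dataclass
class FileItem:
    path: str
    force_full: bool = False
    auto_aggregate: bool = False
    ast_signatures: bool = False
    ast_definitions: bool = False
    ast_mask: dict[str, str] = field(default_factory=dict)  # symbol_path -> state


def set_symbol_state(item: FileItem, symbol_path: str, state: str) -> None:
    """Hypothetical helper: record a per-symbol state in the mask."""
    if state not in VALID_SYMBOL_STATES:
        raise ValueError(f"unknown symbol state: {state!r}")
    item.ast_mask[symbol_path] = state


def effective_state(item: FileItem, symbol_path: str, default: str = "def") -> str:
    """Hypothetical helper: look up a symbol, falling back to an assumed default."""
    return item.ast_mask.get(symbol_path, default)


item = FileItem(path="src/widget.hpp")
set_symbol_state(item, "MyClass", "sig")
set_symbol_state(item, "MyClass::init", "hide")
```

Keeping the mask as a plain dictionary keyed by symbol path means a file with no entries behaves exactly as it did before per-symbol control existed.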

Fuzzy Anchor Slices

Purpose

Text slices defined by line numbers become invalid when files are modified (lines inserted/deleted). Fuzzy Anchor slices use content hashing and anchor line matching to resolve the correct position after file changes.

Algorithm

  1. Create Slice: When the user defines a slice from start_line to end_line:
    • Capture a content hash of the region
    • Store the surrounding context lines (before/after) as anchors
  2. Resolve Slice: When the file is re-read after modification:
    • Search for the anchor content in the modified file
    • Calculate the offset from the anchor displacement
    • Return the new start_line, end_line

Implementation

# src/fuzzy_anchor.py
from typing import Optional, Tuple

class FuzzyAnchor:
    @classmethod
    def create_slice(cls, text: str, start_line: int, end_line: int) -> dict:
        """Returns slice_data with content_hash, anchor_lines, and positions."""

    @classmethod
    def resolve_slice(cls, text: str, slice_data: dict) -> Optional[Tuple[int, int]]:
        """Resolves slice position in modified text, returns (start, end) or None."""

Slice Data Structure

{
    "start_line": 10,      # 1-based original line
    "end_line": 25,        # 1-based original line
    "content_hash": "abc123...",  # SHA256 of region content
    "start_context": [...], # Lines before start for anchor matching
    "end_context": [...]   # Lines after end for anchor matching
}
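A minimal sketch of how a dictionary of this shape could be built is shown below. It is not the code in src/fuzzy_anchor.py: the standalone function, the context parameter (number of anchor lines kept on each side), and the choice to hash the region joined with newlines are assumptions.

```python
import hashlib


def create_slice(text: str, start_line: int, end_line: int, context: int = 2) -> dict:
    """Build slice data for the 1-based, inclusive range start_line..end_line."""
    lines = text.splitlines()
    if not (1 <= start_line <= end_line <= len(lines)):
        raise ValueError("slice bounds out of range")
    region = lines[start_line - 1:end_line]
    return {
        "start_line": start_line,
        "end_line": end_line,
        # SHA256 over the region text (assumed encoding: UTF-8, newline-joined)
        "content_hash": hashlib.sha256("\n".join(region).encode("utf-8")).hexdigest(),
        # Up to `context` lines immediately before the region
        "start_context": lines[max(0, start_line - 1 - context):start_line - 1],
        # Up to `context` lines immediately after the region
        "end_context": lines[end_line:end_line + context],
    }


sample = "\n".join(f"line {n}" for n in range(1, 11))
slice_data = create_slice(sample, 4, 6)
```

Here slice_data["start_context"] holds "line 2" and "line 3", and slice_data["end_context"] holds "line 7" and "line 8".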

Anchor Matching Strategy

  • Exact match: If the anchors are found at their original positions, return the original line range
  • Shift detection: If the anchors have moved, calculate the delta and apply it to the slice bounds
  • Mismatch: If the anchors cannot be found, return None (the slice definition is no longer valid)
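The three cases can be sketched as a single search for the leading anchor. This is a simplified illustration under stated assumptions, not the project's resolver: it uses only start_context, requires that list to be non-empty, and ignores end_context and content_hash, which a fuller version would presumably use to confirm the match. The helper _find_block is hypothetical.

```python
from typing import Optional, Tuple


def _find_block(lines: list, block: list) -> int:
    """Return the 0-based index of the first occurrence of block in lines, or -1."""
    if not block:
        return -1
    for i in range(len(lines) - len(block) + 1):
        if lines[i:i + len(block)] == block:
            return i
    return -1


def resolve_slice(text: str, slice_data: dict) -> Optional[Tuple[int, int]]:
    lines = text.splitlines()
    anchor = slice_data["start_context"]
    found = _find_block(lines, anchor)
    if found == -1:
        return None  # Mismatch: the anchor is gone
    # 0-based index where the anchor sat when the slice was created
    original = slice_data["start_line"] - 1 - len(anchor)
    delta = found - original  # 0 for an exact match, non-zero for a shift
    new_start = slice_data["start_line"] + delta
    new_end = slice_data["end_line"] + delta
    if new_end > len(lines):
        return None  # The shifted range would run past the end of the file
    return new_start, new_end


original_text = "\n".join(f"line {n}" for n in range(1, 11))
slice_data = {
    "start_line": 4,
    "end_line": 6,
    "start_context": ["line 2", "line 3"],
    "end_context": ["line 7", "line 8"],
}
shifted_text = "new header\nanother new line\n" + original_text

unchanged = resolve_slice(original_text, slice_data)
shifted = resolve_slice(shifted_text, slice_data)
missing = resolve_slice("unrelated\ntext", slice_data)
```

With two lines inserted at the top of the file, the slice that covered lines 4 to 6 resolves to lines 6 to 8.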

Interactive AST Tree Masking

Purpose

The AST Inspector modal allows visual inspection of a file's parsed structure and per-symbol state control.

Modal Flow

  1. The user right-clicks a C/C++ file in the Context Panel
  2. The user selects "Inspect AST" from the context menu
  3. A modal opens showing a hierarchical tree of all symbols
  4. Per-symbol radio buttons (Def/Sig/Hide) control state
  5. Changes persist to the ast_mask dictionary

Implementation

# src/gui_2.py:_render_ast_inspector_modal
def _render_ast_inspector_modal(self) -> None:
    expanded, opened = imgui.begin_popup_modal('AST Inspector', True, ...)
    if expanded:
        # Fetch outline via tree-sitter MCP tools
        outline = mcp_client.ts_cpp_get_code_outline(f_path)

        # Parse into hierarchical node list
        for node in parsed_nodes:
            # Render [Kind] Name with radio buttons
            if imgui.radio_button("Def", current_mode == 'def'):
                f_item.ast_mask[full_path] = 'def'

Node Display Format

[Struct] MyClass (Lines 10-50)
    [Field] member1 (Lines 12-14)
    [Method] init (Lines 20-30)

Radio buttons per node:

  • Def: Include this symbol's definition
  • Sig: Include this symbol's signature only
  • Hide: Exclude this symbol entirely
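To make the display format concrete, the following sketch renders a small symbol tree into the same indented text. The OutlineNode class and format_outline function are hypothetical stand-ins; the actual shape of the parsed outline returned by the tree-sitter tools is not shown in this guide.

```python
from dataclasses import dataclass, field


@dataclass
class OutlineNode:
    """Hypothetical node shape: kind, name, 1-based line range, children."""
    kind: str
    name: str
    start_line: int
    end_line: int
    children: list["OutlineNode"] = field(default_factory=list)


def format_outline(node: OutlineNode, depth: int = 0, indent: str = "    ") -> list:
    """Return one display line per node, indented by nesting depth."""
    label = f"[{node.kind}] {node.name} (Lines {node.start_line}-{node.end_line})"
    lines = [indent * depth + label]
    for child in node.children:
        lines.extend(format_outline(child, depth + 1, indent))
    return lines


root = OutlineNode("Struct", "MyClass", 10, 50, [
    OutlineNode("Field", "member1", 12, 14),
    OutlineNode("Method", "init", 20, 30),
])
rendered = format_outline(root)
print("\n".join(rendered))
```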

Batch Operations

Shift-Click Range Selection

The Context Panel supports Shift-Click for range selection:

# src/gui_2.py:_render_context_composition_panel
if changed_sel:
    if imgui.get_io().key_shift and self._last_selected_context_index != -1:
        start = min(self._last_selected_context_index, i)
        end = max(self._last_selected_context_index, i)
        for idx in range(start, end + 1):
            # Toggle selection state for range
            pass
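The range logic can be isolated from the GUI as a pure function, which makes it easy to test. The sketch below is an assumption about the intended behavior: a Shift-Click adds every index between the last selected row and the clicked row, and a plain click replaces the selection. The function name select_range is hypothetical.

```python
def select_range(selected: set, last_index: int, clicked_index: int, shift_held: bool) -> set:
    """Return the new selection after a click on clicked_index."""
    if shift_held and last_index != -1:
        start = min(last_index, clicked_index)
        end = max(last_index, clicked_index)
        # Inclusive range, matching range(start, end + 1) in the panel code
        return selected | set(range(start, end + 1))
    # No usable anchor row (or no Shift): select only the clicked row
    return {clicked_index}


after_shift = select_range({2}, last_index=2, clicked_index=5, shift_held=True)
after_plain = select_range({2}, last_index=2, clicked_index=5, shift_held=False)
no_anchor = select_range(set(), last_index=-1, clicked_index=3, shift_held=True)
```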

Batch Action Bar

Batch operations apply to all selected files:

| Button | Action |
|--------|--------|
| Full | Set force_full=True for all selected |
| Agg | Set auto_aggregate=True for all selected |
| Sig | Set ast_signatures=True for all selected |
| Def | Set ast_definitions=True for all selected |
| Remove | Remove selected files from context |
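One way to express the table above is a mapping from button label to flag name. The sketch below is illustrative only: apply_batch and BATCH_FLAGS are hypothetical names, and the code sets just the named flag because the guide does not say whether the other flags are cleared.

```python
from dataclasses import dataclass


@dataclass
class FileItem:
    path: str
    force_full: bool = False
    auto_aggregate: bool = False
    ast_signatures: bool = False
    ast_definitions: bool = False


# Button label -> FileItem attribute, as listed in the table
BATCH_FLAGS = {
    "Full": "force_full",
    "Agg": "auto_aggregate",
    "Sig": "ast_signatures",
    "Def": "ast_definitions",
}


def apply_batch(files: list, selected: set, action: str) -> list:
    """Apply a batch action to the selected indices and return the file list."""
    if action == "Remove":
        return [f for i, f in enumerate(files) if i not in selected]
    flag = BATCH_FLAGS[action]  # KeyError for an unknown button label
    for i in selected:
        setattr(files[i], flag, True)
    return files


files = [FileItem("a.h"), FileItem("b.cpp"), FileItem("c.c")]
files = apply_batch(files, {0, 2}, "Sig")
remaining = apply_batch(files, {1}, "Remove")
```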

Context Snapshotting (Per-Take)

Purpose

When switching between discussion "takes", the context panel state is snapshotted and restored.

UISnapshot Structure

from dataclasses import dataclass

@dataclass
class UISnapshot:
    ai_input: str
    project_system_prompt: str
    global_system_prompt: str
    base_system_prompt: str
    use_default_base_prompt: bool
    temperature: float
    top_p: float
    max_tokens: int
    auto_add_history: bool
    disc_entries: list[dict]
    files: list[dict]
    screenshots: list[str]

HistoryManager Integration

class HistoryManager:
    def push(self, state: Any, description: str) -> None: ...
    def undo(self, current_state: Any, ...) -> Optional[HistoryEntry]: ...
    def redo(self, current_state: Any, ...) -> Optional[HistoryEntry]: ...
    def jump_to_undo(self, index: int, current_state: Any, ...) -> Optional[HistoryEntry]: ...
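The signatures above elide some parameters, so the following is only a generic two-stack undo/redo sketch with simplified, assumed signatures. The class name SketchHistory, the HistoryEntry fields, and the convention that push receives the state before a change are all assumptions, not the project's HistoryManager.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class HistoryEntry:
    state: Any
    description: str


class SketchHistory:
    def __init__(self) -> None:
        self._undo = []
        self._redo = []

    def push(self, state: Any, description: str) -> None:
        """Record the state as it was before a change; a new change clears redo."""
        self._undo.append(HistoryEntry(state, description))
        self._redo.clear()

    def undo(self, current_state: Any) -> Optional[HistoryEntry]:
        if not self._undo:
            return None
        entry = self._undo.pop()
        # Keep the current state so redo can bring it back
        self._redo.append(HistoryEntry(current_state, entry.description))
        return entry

    def redo(self, current_state: Any) -> Optional[HistoryEntry]:
        if not self._redo:
            return None
        entry = self._redo.pop()
        self._undo.append(HistoryEntry(current_state, entry.description))
        return entry


history = SketchHistory()
before = {"files": []}
after = {"files": ["a.h"]}
history.push(before, "add a.h")
undone = history.undo(after)       # entry holding the state before the change
redone = history.redo(undone.state)  # entry holding the state after the change
```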

Aggregation Pipeline Integration

The context curation features integrate with the aggregation pipeline:

# src/aggregate.py
def _build_file_item_context(self, f_item: FileItem, ...) -> str:
    if f_item.ast_mask:
        # Apply AST masking before aggregation
        masked_content = self._apply_ast_mask(content, f_item.ast_mask)

Mask Application Order

  1. Fetch the file content
  2. Parse the AST if the file is C/C++
  3. Apply ast_mask per symbol
  4. Run the result through the aggregation strategy (full/agg/sig/def/hide)
  5. Return the masked, aggregated content
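Step 3 of the order above can be sketched as a per-symbol filter over line ranges. This is a simplified illustration, not the _apply_ast_mask method in src/aggregate.py: the Symbol record, the precomputed signature string, the 'def' default for unmasked symbols, and the decision to drop text outside any symbol are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    """Hypothetical parsed symbol: path, 1-based inclusive line range, signature."""
    path: str
    start_line: int
    end_line: int
    signature: str


def apply_ast_mask(content: str, symbols: list, ast_mask: dict, default: str = "def") -> str:
    lines = content.splitlines()
    out = []
    for sym in symbols:
        state = ast_mask.get(sym.path, default)
        if state == "hide":
            continue  # Excluded from context entirely
        if state == "sig":
            out.append(sym.signature)  # Signature only
        else:
            out.extend(lines[sym.start_line - 1:sym.end_line])  # Full definition
    return "\n".join(out)


source = "\n".join([
    "int add(int a, int b) {",
    "    return a + b;",
    "}",
    "static void log_debug() {",
    "}",
    "int mul(int a, int b) {",
    "    return a * b;",
    "}",
])
symbols = [
    Symbol("add", 1, 3, "int add(int a, int b);"),
    Symbol("log_debug", 4, 5, "static void log_debug();"),
    Symbol("mul", 6, 8, "int mul(int a, int b);"),
]
masked = apply_ast_mask(source, symbols, {"add": "sig", "log_debug": "hide"})
```

In this example add is reduced to its signature, log_debug disappears, and mul keeps its full definition because it has no mask entry.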

Testing

Unit Tests

  • tests/test_fuzzy_anchor.py — FuzzyAnchor.create_slice/resolve_slice
  • tests/test_history_manager.py — HistoryManager undo/redo/snapshot
  • tests/test_ts_cpp_tools.py — C++ skeleton/outline/definition tools
  • tests/test_ast_parser.py — ASTParser for Python/C/C++

Simulation Tests

  • tests/test_phase6_simulation.py — GUI integration tests
    • Batch operations shift-click
    • AST Inspector modal
    • Slice editor

Full Suite

uv run pytest tests/test_fuzzy_anchor.py tests/test_history_manager.py \
    tests/test_ts_cpp_tools.py tests/test_ast_parser.py \
    tests/test_phase6_simulation.py -v