# Specification: Smarter Aggregation with Sub-Agent Summarization

## 1. Overview

This track improves the context aggregation system to use sub-agent passes for intelligent summarization and hash-based caching to avoid redundant work.

**Current Problem:**
- Aggregation is a simple pass that either injects full file content or a basic skeleton
- No intelligence applied to determine what level of detail is needed
- Same files get re-summarized on every discussion start even if unchanged

**Goal:**
- Use a sub-agent during aggregation pass for high-tier agents to generate succinct summaries
- Cache summaries based on file hash - only re-summarize if file changed
- Smart outline generation for code files, summary for text files

## 2. Current State Audit

### Existing Aggregation Behavior
- `aggregate.py` handles context aggregation
- `file_cache.py` provides AST parsing and skeleton generation
- Per-file flags: `Auto-Aggregate` (summarize), `Force Full` (inject raw)
- No caching of summarization results

### Provider API Considerations
- Different providers have different prompt/caching mechanisms
- Need to verify how each provider handles system context and caching
- May need provider-specific aggregation strategies

## 3. Functional Requirements

### 3.1 Hash-Based Summary Cache
- Generate SHA256 hash of file content
- Store summaries in a cache (file-based or in project state)
- Before summarizing, check if file hash matches cached summary
- Cache invalidation when file content changes

### 3.2 Sub-Agent Summarization Pass
- During aggregation, optionally invoke sub-agent for summarization
- Sub-agent generates concise summary of file purpose and key points
- Different strategies for:
  - Code files: AST-based outline + key function signatures
  - Text files: Paragraph-level summary
  - Config files: Key-value extraction

### 3.3 Tiered Aggregation Strategy
- Tier 3/4 workers: Get skeleton outlines (fast, cheap)
- Tier 2 (Tech Lead): Get summaries with key details
- Tier 1 (Orchestrator): May get full content or enhanced summaries
- Configurable per-agent via Persona

### 3.4 Cache Persistence
- Summaries persist across sessions
- Stored in project directory or centralized cache location
- Manual cache clear option in UI

## 4. Data Model

### 4.1 Summary Cache Entry
```python
{
    "file_path": str,
    "file_hash": str,  # SHA256 of content
    "summary": str,
    "outline": str,  # For code files
    "generated_at": str,  # ISO timestamp
    "generator_tier": str,  # Which tier generated it
}
```

### 4.2 Aggregation Config
```toml
[aggregation]
default_mode = "summarize"  # "full", "summarize", "outline"
cache_enabled = true
cache_dir = ".slop_cache"
```

## 5. UI Changes

- Add "Clear Summary Cache" button in Files & Media or Context Composition
- Show cached status indicator on files (similar to AST cache indicator)
- Configuration in AI Settings or Project Settings

## 6. Acceptance Criteria

- [ ] File hash computed before summarization
- [ ] Summary cache persists across app restarts
- [ ] Sub-agent generates better summaries than basic skeleton
- [ ] Aggregation respects tier-level configuration
- [ ] Cache can be manually cleared
- [ ] Provider APIs handle aggregated context correctly

## 7. Out of Scope
- Changes to provider API internals
- Vector store / embeddings for RAG (separate track)
- Changes to Session Hub / Discussion Hub layout

## 8. Dependencies
- `aggregate.py` - main aggregation logic
- `file_cache.py` - AST parsing and caching
- `ai_client.py` - sub-agent invocation
- `models.py` - may need new config structures