# Specification: Smarter Aggregation with Sub-Agent Summarization

## 1. Overview

This track improves the context aggregation system to use sub-agent passes for intelligent summarization and hash-based caching to avoid redundant work.

**Current Problem:**

- Aggregation is a simple pass that either injects full file content or a basic skeleton
- No intelligence applied to determine what level of detail is needed
- Same files get re-summarized on every discussion start even if unchanged

**Goal:**

- Use a sub-agent during aggregation pass for high-tier agents to generate succinct summaries
- Cache summaries based on file hash - only re-summarize if file changed
- Smart outline generation for code files, summary for text files

## 2. Current State Audit

### Existing Aggregation Behavior

- `aggregate.py` handles context aggregation
- `file_cache.py` provides AST parsing and skeleton generation
- Per-file flags: `Auto-Aggregate` (summarize), `Force Full` (inject raw)
- No caching of summarization results

### Provider API Considerations

- Different providers have different prompt/caching mechanisms
- Need to verify how each provider handles system context and caching
- May need provider-specific aggregation strategies

## 3. Functional Requirements

### 3.1 Hash-Based Summary Cache

- Generate SHA256 hash of file content
- Store summaries in a cache (file-based or in project state)
- Before summarizing, check if file hash matches cached summary
- Cache invalidation when file content changes

### 3.2 Sub-Agent Summarization Pass

- During aggregation, optionally invoke sub-agent for summarization
- Sub-agent generates concise summary of file purpose and key points
- Different strategies for:
  - Code files: AST-based outline + key function signatures
  - Text files: Paragraph-level summary
  - Config files: Key-value extraction

### 3.3 Tiered Aggregation Strategy

- Tier 3/4 workers: Get skeleton outlines (fast, cheap)
- Tier 2 (Tech Lead): Get summaries with key details
- Tier 1 (Orchestrator): May get full content or enhanced summaries
- Configurable per-agent via Persona

### 3.4 Cache Persistence

- Summaries persist across sessions
- Stored in project directory or centralized cache location
- Manual cache clear option in UI

## 4. Data Model

### 4.1 Summary Cache Entry

```python
{
    "file_path": str,
    "file_hash": str,       # SHA256 of content
    "summary": str,
    "outline": str,         # For code files
    "generated_at": str,    # ISO timestamp
    "generator_tier": str,  # Which tier generated it
}
```

### 4.2 Aggregation Config

```toml
[aggregation]
default_mode = "summarize"  # "full", "summarize", "outline"
cache_enabled = true
cache_dir = ".slop_cache"
```

## 5. UI Changes

- Add "Clear Summary Cache" button in Files & Media or Context Composition
- Show cached status indicator on files (similar to AST cache indicator)
- Configuration in AI Settings or Project Settings

## 6. Acceptance Criteria

- [ ] File hash computed before summarization
- [ ] Summary cache persists across app restarts
- [ ] Sub-agent generates better summaries than basic skeleton
- [ ] Aggregation respects tier-level configuration
- [ ] Cache can be manually cleared
- [ ] Provider APIs handle aggregated context correctly

## 7. Out of Scope

- Changes to provider API internals
- Vector store / embeddings for RAG (separate track)
- Changes to Session Hub / Discussion Hub layout

## 8. Dependencies

- `aggregate.py` - main aggregation logic
- `file_cache.py` - AST parsing and caching
- `ai_client.py` - sub-agent invocation
- `models.py` - may need new config structures
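
## 9. Illustrative Sketch: Hash-Based Cache Lookup

The hash check in Section 3.1, the entry shape in Section 4.1, and the persistence and manual-clear requirements in Section 3.4 could fit together roughly as follows. This is a minimal sketch, not the planned implementation: the `SummaryCache` class, the `content_hash` helper, the `summaries.json` file name, and the `summarize` callback (standing in for the sub-agent call) are all hypothetical names chosen for illustration, and a file-based store is assumed even though the spec also allows project state.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable


def content_hash(text: str) -> str:
    """Return the SHA256 hex digest of the file content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class SummaryCache:
    """Hypothetical file-backed cache: one JSON index keyed by file path."""

    def __init__(self, cache_dir: Path) -> None:
        cache_dir.mkdir(parents=True, exist_ok=True)
        self.index_path = cache_dir / "summaries.json"  # assumed file name
        if self.index_path.exists():
            self.entries = json.loads(self.index_path.read_text(encoding="utf-8"))
        else:
            self.entries = {}

    def get_or_summarize(
        self,
        file_path: str,
        content: str,
        summarize: Callable[[str], str],  # stand-in for the sub-agent call
        tier: str,
    ) -> str:
        digest = content_hash(content)
        entry = self.entries.get(file_path)
        # Cache hit: the stored hash matches the current content.
        if entry is not None and entry["file_hash"] == digest:
            return entry["summary"]
        # Cache miss or stale entry: re-summarize and overwrite.
        summary = summarize(content)
        self.entries[file_path] = {
            "file_path": file_path,
            "file_hash": digest,
            "summary": summary,
            "outline": "",  # left empty here; code files would fill this in
            "generated_at": datetime.now(timezone.utc).isoformat(),
            "generator_tier": tier,
        }
        self.index_path.write_text(
            json.dumps(self.entries, indent=2), encoding="utf-8"
        )
        return summary

    def clear(self) -> None:
        """Drop all entries, as a 'Clear Summary Cache' button might."""
        self.entries = {}
        if self.index_path.exists():
            self.index_path.unlink()
```

Keying entries by path and comparing the stored hash means a changed file simply overwrites its old entry, so no separate invalidation step is needed in this sketch. Whether the real cache should instead be keyed by hash (so renamed files still hit) is a design question the spec leaves open.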