manual_slop/conductor/tracks/aggregation_smarter_summaries_20260322/spec.md
Ed_ abe1c660ea conductor(tracks): Add two deferred future tracks
- aggregation_smarter_summaries: Sub-agent summarization, hash-based caching
- system_context_exposure: Expose hidden _SYSTEM_PROMPT for user customization
2026-03-22 12:43:47 -04:00


Specification: Smarter Aggregation with Sub-Agent Summarization

1. Overview

This track improves the context aggregation system to use sub-agent passes for intelligent summarization and hash-based caching to avoid redundant work.

Current Problem:

  • Aggregation is a simple pass that injects either full file content or a basic skeleton
  • Nothing decides what level of detail each file actually needs
  • Unchanged files are re-summarized at the start of every discussion

Goal:

  • During the aggregation pass for high-tier agents, use a sub-agent to generate succinct summaries
  • Cache summaries keyed by file hash; re-summarize only when the file has changed
  • Generate outlines for code files and summaries for text files

2. Current State Audit

Existing Aggregation Behavior

  • aggregate.py handles context aggregation
  • file_cache.py provides AST parsing and skeleton generation
  • Per-file flags: Auto-Aggregate (summarize), Force Full (inject raw)
  • No caching of summarization results

Provider API Considerations

  • Different providers have different prompt/caching mechanisms
  • Need to verify how each provider handles system context and caching
  • May need provider-specific aggregation strategies

3. Functional Requirements

3.1 Hash-Based Summary Cache

  • Compute a SHA-256 hash of the file content
  • Store summaries in a cache (file-based or in project state)
  • Before summarizing, check whether the file's current hash matches the hash stored with the cached summary
  • Invalidate the cached entry when the file content changes
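
The lookup flow above can be sketched as follows. This is a minimal illustration and not the project's implementation: `SummaryCache`, `content_hash`, `get_summary`, and the `summarize` callback are hypothetical names, and the cache is assumed to be a plain in-memory dict.

```python
import hashlib
from typing import Callable, Dict, Tuple

# Hypothetical in-memory cache: file path -> (content hash, summary).
SummaryCache = Dict[str, Tuple[str, str]]


def content_hash(content: bytes) -> str:
    """Return the SHA-256 hex digest of the file content."""
    return hashlib.sha256(content).hexdigest()


def get_summary(
    path: str,
    content: bytes,
    cache: SummaryCache,
    summarize: Callable[[bytes], str],
) -> str:
    """Return the cached summary if the hash matches, else regenerate it."""
    digest = content_hash(content)
    cached = cache.get(path)
    if cached is not None and cached[0] == digest:
        return cached[1]  # Cache hit: the file is unchanged.
    summary = summarize(content)  # Cache miss or stale entry.
    cache[path] = (digest, summary)  # Overwriting drops the stale entry.
    return summary
```

Because the hash is taken over the content and not the modification time, touching a file without changing it would not trigger a new summarization.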

3.2 Sub-Agent Summarization Pass

  • During aggregation, optionally invoke a sub-agent for summarization
  • The sub-agent generates a concise summary of each file's purpose and key points
  • Different strategies for:
    • Code files: AST-based outline + key function signatures
    • Text files: Paragraph-level summary
    • Config files: Key-value extraction
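
As a rough sketch of the per-type strategies, the code below picks a strategy from the file extension and builds an outline for Python sources with the standard `ast` module. The extension sets and the names `pick_strategy` and `outline_python` are assumptions made for illustration; the spec does not say how file types are detected, and the existing skeleton logic in file_cache.py may work differently.

```python
import ast
from pathlib import Path

# Hypothetical extension sets; the spec does not define file-type detection.
CODE_EXTS = {".py"}
CONFIG_EXTS = {".toml", ".ini", ".json", ".yaml", ".yml"}


def pick_strategy(path: str) -> str:
    """Map a file path to one of the three strategies listed above."""
    suffix = Path(path).suffix.lower()
    if suffix in CODE_EXTS:
        return "code"
    if suffix in CONFIG_EXTS:
        return "config"
    return "text"


def _signature(node) -> str:
    # Positional parameters only; *args, **kwargs, and defaults are omitted.
    args = ", ".join(a.arg for a in node.args.args)
    return f"def {node.name}({args})"


def outline_python(source: str) -> str:
    """List top-level classes, their methods, and top-level functions."""
    funcs = (ast.FunctionDef, ast.AsyncFunctionDef)
    lines = []
    for node in ast.parse(source).body:
        if isinstance(node, funcs):
            lines.append(_signature(node))
        elif isinstance(node, ast.ClassDef):
            lines.append(f"class {node.name}")
            for item in node.body:
                if isinstance(item, funcs):
                    lines.append("    " + _signature(item))
    return "\n".join(lines)
```

An outline like this could be passed to the sub-agent alongside the source, or used alone when no sub-agent call is wanted.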

3.3 Tiered Aggregation Strategy

  • Tier 3/4 workers: Get skeleton outlines (fast, cheap)
  • Tier 2 (Tech Lead): Get summaries with key details
  • Tier 1 (Orchestrator): May get full content or enhanced summaries
  • Configurable per-agent via Persona
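
One way to express the tier defaults with a per-Persona override is shown below. The mapping is an assumption drawn from the list above (Tier 1 is set to "full" here, although the spec also allows enhanced summaries), and `resolve_mode` and `persona_override` are hypothetical names.

```python
from typing import Optional

# Hypothetical defaults derived from the tier list above.
DEFAULT_MODE_BY_TIER = {1: "full", 2: "summarize", 3: "outline", 4: "outline"}
VALID_MODES = {"full", "summarize", "outline"}


def resolve_mode(tier: int, persona_override: Optional[str] = None) -> str:
    """Pick the aggregation mode for an agent tier.

    A mode set on the agent's Persona wins over the tier default.
    Unknown tiers fall back to "summarize".
    """
    if persona_override is not None:
        if persona_override not in VALID_MODES:
            raise ValueError(f"unknown aggregation mode: {persona_override!r}")
        return persona_override
    return DEFAULT_MODE_BY_TIER.get(tier, "summarize")
```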

3.4 Cache Persistence

  • Summaries persist across sessions
  • Stored in project directory or centralized cache location
  • Manual cache clear option in UI
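
A minimal persistence sketch, assuming the cache is stored as a single JSON file inside the cache directory. The file name `summaries.json` and the functions `load_cache`, `save_cache`, and `clear_cache` are hypothetical; the spec leaves the storage format open.

```python
import json
from pathlib import Path

CACHE_FILE = "summaries.json"  # Hypothetical file name.


def load_cache(cache_dir: Path) -> dict:
    """Load the cache from disk; a missing or corrupt file yields {}."""
    path = cache_dir / CACHE_FILE
    if not path.exists():
        return {}
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return {}  # Treat a corrupt cache as empty and rebuild it.


def save_cache(cache_dir: Path, cache: dict) -> None:
    """Write the cache through a temporary file, then rename it into place."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    tmp = cache_dir / (CACHE_FILE + ".tmp")
    tmp.write_text(json.dumps(cache, indent=2), encoding="utf-8")
    tmp.replace(cache_dir / CACHE_FILE)


def clear_cache(cache_dir: Path) -> None:
    """Delete the cache file; this is what a UI clear action could call."""
    (cache_dir / CACHE_FILE).unlink(missing_ok=True)
```

Writing to a temporary file first means an interrupted save leaves the previous cache file intact.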

4. Data Model

4.1 Summary Cache Entry

{
    "file_path": str,
    "file_hash": str,  # SHA256 of content
    "summary": str,
    "outline": str,  # For code files
    "generated_at": str,  # ISO timestamp
    "generator_tier": str,  # Which tier generated it
}
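
The entry above could be modeled as a dataclass, as in this sketch. The class name `SummaryCacheEntry`, the helper `make_entry`, the `is_fresh` method, and the example tier label are illustrative assumptions; only the field names come from the spec.

```python
import hashlib
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class SummaryCacheEntry:
    file_path: str
    file_hash: str  # SHA-256 hex digest of the content
    summary: str
    outline: str  # Empty for non-code files
    generated_at: str  # ISO 8601 timestamp
    generator_tier: str

    def is_fresh(self, content: bytes) -> bool:
        """True if the stored hash still matches the given content."""
        return self.file_hash == hashlib.sha256(content).hexdigest()


def make_entry(
    file_path: str,
    content: bytes,
    summary: str,
    outline: str = "",
    generator_tier: str = "tier2",  # Hypothetical tier label.
) -> SummaryCacheEntry:
    """Build an entry stamped with the current UTC time."""
    return SummaryCacheEntry(
        file_path=file_path,
        file_hash=hashlib.sha256(content).hexdigest(),
        summary=summary,
        outline=outline,
        generated_at=datetime.now(timezone.utc).isoformat(),
        generator_tier=generator_tier,
    )
```

`asdict(entry)` returns a plain dict with the same keys as the structure above, which keeps the entry easy to serialize.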

4.2 Aggregation Config

[aggregation]
default_mode = "summarize"  # "full", "summarize", "outline"
cache_enabled = true
cache_dir = ".slop_cache"
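
A sketch of how the `[aggregation]` table might be validated once it has been parsed into a dict (for example by a TOML parser such as `tomllib` in Python 3.11+). `AggregationConfig` and `parse_aggregation_config` are hypothetical names; models.py may end up structuring this differently.

```python
from dataclasses import dataclass
from typing import Any, Mapping

VALID_MODES = ("full", "summarize", "outline")


@dataclass(frozen=True)
class AggregationConfig:
    default_mode: str = "summarize"
    cache_enabled: bool = True
    cache_dir: str = ".slop_cache"


def parse_aggregation_config(data: Mapping[str, Any]) -> AggregationConfig:
    """Build a config from parsed project settings, applying defaults."""
    section = data.get("aggregation", {})
    config = AggregationConfig(
        default_mode=section.get("default_mode", "summarize"),
        cache_enabled=bool(section.get("cache_enabled", True)),
        cache_dir=str(section.get("cache_dir", ".slop_cache")),
    )
    if config.default_mode not in VALID_MODES:
        raise ValueError(f"invalid default_mode: {config.default_mode!r}")
    return config
```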

5. UI Changes

  • Add "Clear Summary Cache" button in Files & Media or Context Composition
  • Show cached status indicator on files (similar to AST cache indicator)
  • Configuration in AI Settings or Project Settings

6. Acceptance Criteria

  • File hash computed before summarization
  • Summary cache persists across app restarts
  • Sub-agent generates better summaries than basic skeleton
  • Aggregation respects tier-level configuration
  • Cache can be manually cleared
  • Provider APIs handle aggregated context correctly

7. Out of Scope

  • Changes to provider API internals
  • Vector store / embeddings for RAG (separate track)
  • Changes to Session Hub / Discussion Hub layout

8. Dependencies

  • aggregate.py - main aggregation logic
  • file_cache.py - AST parsing and caching
  • ai_client.py - sub-agent invocation
  • models.py - may need new config structures