ed 35c6cca134 docs: agent workflow docs + regular docs (v2.3 surfacing)

Per user request 'use your remaining context to update agent workflow
docs and then regular docs based on what was discussed in this report',
this commit creates/updates 15 files derived from the v2.3 nagent
review (the 12 new nagent additions + the 4 memory dimensions
reframing + the cache strategy + the RAG discipline + the knowledge
harvest pattern).

Agent workflow docs (4 files):
- AGENTS.md (UPDATE): add @import line to canonical DOD + 'Code
  Styleguides' section pointing to the 6 new styleguides + new
  'Human-Facing Documentation' section pointing to ./docs/AGENTS.md
- conductor/workflow.md (UPDATE): new section 'Additions (2026-06-12)
  - the 12 patterns from the latest nagent corpus' with TDD
  protocols for knowledge harvest, cache ordering, compaction, RAG
  discipline
- conductor/product-guidelines.md (UPDATE): new sections 'Memory
  Dimensions (added 2026-06-12)' + 'See Also - Updated' with the
  6-styleguide catalog
- docs/AGENTS.md (NEW): the agent-facing mirror of docs/Readme.md
  (per the nagent CLAUDE.md pattern). 10 sections + the per-tier
  reading path + the 4 memory dimensions + the caching strategy +
  the knowledge harvest + the RAG discipline + the feature flags

Regular docs (11 files):
- 6 new styleguides (the convention catalog):
  * data_oriented_design.md: the canonical DOD reference (Tier
    0/1/2; 3 defaults to reject; 8 core defaults; 7-question
    simplification pass; 10-question self-check; 4 memory
    dimensions in Manual Slop context)
  * agent_memory_dimensions.md: the 4 memory dims (curation /
    discussion / RAG / knowledge) + when to use each + the
    boundaries
  * rag_integration_discipline.md: the conservative-RAG rule
    (opt-in, complement, provenance, no mutation, feature-gated,
    graceful failure)
  * cache_friendly_context.md: stable-to-volatile context
    ordering + the cache TTL GUI contract + the byte-comparison
    test
  * knowledge_artifacts.md: the knowledge harvest pattern
    (category files, provenance, sha256 ledger, digest
    regeneration, 'delete to turn off')
  * feature_flags.md: file presence vs config flags vs CLI flags
- 3 new project docs (the cross-cutting guides):
  * guide_agent_memory_dimensions.md: the cross-cutting guide on
    the 4 dims + the decision tree
  * guide_caching_strategy.md: caching across providers +
    stable-to-volatile ordering + cache TTL GUI + the byte-
    comparison test + the 5th provider (claude-code)
  * guide_knowledge_curation.md: the knowledge memory guide (4th
    dim) + the 5 category files + per-file notes + the digest +
    the ledger + the harvest workflow
- 2 existing doc updates:
  * guide_mma.md: new sections 'Delegation as context management'
    + 'The 4 memory dimensions (the MMA scope)'
  * guide_ai_client.md: new section 'Cache strategy and the 12-
    layer model' + the 5th provider (claude-code)

All files use the same style as the v2.3 review (the user's preferred
format): 7-column tables, no JSON, SSDL shape tags, forth/array
notation, file:line citations, ASCII sketches where useful. The
human Readme files (Readme.md, docs/Readme.md) are NOT modified
(per repeated user instruction).

The 5th provider (claude-code) is documented in guide_ai_client.md
+ the data_oriented_design.md references the nagent pattern as the
source of the canonical rules.

The cross-references are bidirectional: the 6 styleguides reference
the 3 project docs; the 3 project docs reference the 6 styleguides;
the 2 doc updates reference both; AGENTS.md + ./docs/AGENTS.md
provide the entry points.

2026-06-12 13:50:40 -04:00

16 KiB


Product Guidelines: Manual Slop

Documentation Style

Strict & In-Depth: Documentation must follow an old-school, highly detailed technical breakdown style (similar to VEFontCache-Odin). Focus on architectural design, state management, algorithmic details, and structural formats rather than just surface-level usage.

UX & UI Principles

USA Graphics Company Values: Embrace high information density and tactile interactions.
Professional Arcade Aesthetics: Balances high-energy "Arcade" feedback (blinking notifications, tactile updates) with a "Professional" visual discipline. Employs modern typography (Inter/Maple Mono), subtle rounded geometry, and soft shadows to ensure the tool feels like a sophisticated, expert utility. Includes a high-density NERV Technical Console theme option for maximum focus and CRT-inspired visual feedback.
Rich Text Readability: Prioritizes legibility of AI communications and technical logs by utilizing GitHub-Flavored Markdown and integrated syntax highlighting. This ensures that complex code fragments and structured data are immediately accessible and professionally presented.
Explicit Control & Expert Focus: The interface should not hold the user's hand. It must prioritize explicit manual confirmation for destructive actions while providing dense, unadulterated access to logs and context.
Multi-Viewport Capabilities: Leverage dockable, floatable panels to allow users to build custom workspaces suitable for multi-monitor setups.

Code Standards & Architecture

Data-Oriented & Immediate Mode Heuristics: Align with the architectural values of engineers like Casey Muratori and Mike Acton.
- The "Less Python Does, the Better" Rule: Python should act primarily as a procedural semantic definer (similar to how ImGui defines a UI DAG), delegating heavy lifting to efficient data structures, vectorized operations, or lower-level primitives.
- Minimize Python interpreter overhead by favoring bulk data processing over fine-grained object-oriented manipulation.
- The GUI (gui_2.py) must remain a pure visualization of application state. It should not own complex business logic or orchestrator hooks (strive to decouple the 'Application' controller from the 'View').
- Treat the UI as an immediate mode frame-by-frame projection of underlying data structures.
- Optimize for zero lag and never block the main render loop with heavy Python work.
- Utilize proper asynchronous batching and queue-based pipelines for background AI work, ensuring a data-oriented flow rather than tangled object-oriented state graphs.
Strict State Management: There must be a rigorous separation between the Main GUI rendering thread and daemon execution threads. The UI should never hang during AI communication or script execution. Use lock-protected queues and events for synchronization.
Comprehensive Logging: Aggressively log all actions, API payloads, tool calls, and executed scripts. Maintain timestamped JSON-L and markdown logs to ensure total transparency and debuggability.
Mandatory ImGui Verification: All changes to the GUI (gui_2.py) MUST be verified using the custom AST linter (scripts/check_imgui_scopes.py) to ensure all ImGui scopes (begin/end, push/pop) are properly matched. Developers should prioritize the use of src/imgui_scopes.py context managers (imscope) over manual push/pop calls.
Modular Controller Pattern: To prevent "God Object" bloat in core controllers (like AppController), all state-independent or utility logic must be moved to module-level functions. Functions requiring class state should accept the instance as an explicit dependency (def logic(controller: AppController, ...)). Massive if/elif dispatch blocks must be refactored into handler maps (dictionaries) of module-level functions.
UI Delegation for Hot-Reload: All complex ImGui rendering logic must be extracted from the App class into module-level functions named render_xxx(app: App). The App class should only contain thin delegation wrappers (def _render_xxx(self): render_xxx(self)). This architecture is mandatory for supporting state-preserving hot-reloads of the UI logic.
Dependency Minimalism: Limit external dependencies where possible. For instance, prefer standard library modules (like urllib and html.parser for web tools) over heavy third-party packages.
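The Modular Controller Pattern above can be illustrated with a small sketch. It assumes a stand-in `AppController` with two hypothetical fields and three hypothetical actions; none of these names come from the real codebase, and the formatting follows the compact style described in this document. The point is only the shape: module-level handlers that take the controller as an explicit dependency, and a dictionary in place of an if/elif chain.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class AppController:
 """Stand-in controller that only holds state (fields are hypothetical)."""
 log: list[str] = field(default_factory=list)
 paused: bool = False

def handle_pause(controller: AppController, payload: str) -> None:
 """Module-level handler; the controller is an explicit dependency."""
 controller.paused = True
 controller.log.append(f"pause:{payload}")

def handle_resume(controller: AppController, payload: str) -> None:
 controller.paused = False
 controller.log.append(f"resume:{payload}")

def handle_unknown(controller: AppController, payload: str) -> None:
 """Fallback so an unrecognized action is recorded as data, not raised."""
 controller.log.append(f"unknown:{payload}")

# The handler map replaces a long if/elif dispatch block.
HANDLERS: dict[str, Callable[[AppController, str], None]] = {
 "pause": handle_pause,
 "resume": handle_resume,
}

def dispatch(controller: AppController, action: str, payload: str = "") -> None:
 """Look up the handler by action name and call it with the controller."""
 HANDLERS.get(action, handle_unknown)(controller, payload)
```

Adding a new action in this sketch means adding one function and one dictionary entry; the controller class itself does not grow.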

Phase 5: Heavy Curation & Structural Integrity (MANDATORY)

Intensive System Analysis: Align with the standards of low-level systems engineers (Fleury, Acton, Muratori, Blow). Do not accept high-level abstractions as sufficient documentation.
Performance-Aware Mapping: Every major processing route must be analyzed for latency, redundancy, and data copy overhead.
Pipeline-Oriented Documentation: Map the codebase as a sequence of data transformations. Identify exactly where data enters, how it is mutated, and where it exits.
Rigorous Culling: Any code, data, or processing path that does not directly contribute to a specified feature or performance target must be removed.
Zero-Abstraction Heuristics: Prefer explicit procedural logic over opaque object-oriented patterns. Ensure state transitions are traceable and deterministic.

AI-Optimized Compact Style

Indentation: Exactly 1 space per level. This minimizes token usage in nested structures.
Newlines: Maximum one (1) blank line between top-level definitions. Zero (0) blank lines within function or method bodies.
Vertical Compaction: Use single-line if statements, semicolon-separated framework calls (imgui.same_line(); imgui.text(...)), and aligned assignments to aggressively minimize vertical line counts. Note: Function and method definition signatures (def ...:) must ALWAYS remain on their own isolated lines.
Region Blocks: Use #region: Name and #endregion: Name to logically organize massive files that cannot be easily broken apart without increasing context load.
Type Hinting: Mandatory, strict type hints for all parameters, return types, and global variables to ensure high-signal context for AI agents.
Structural Dependency Mapping (SDM): All major state variables, methods, and functions MUST include terse dependency tags at the end of their docstrings for AI-assisted impact analysis.
- Functions/Methods: [C: Caller1, Caller2] (Primary callers).
- State Variables: [M: File:Line, Method] (Mutation points) and [U: File] (Major use paths).
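A short sample may make the compact style concrete. This is a sketch under assumptions: the function names, the constant, and the caller named in the tags are invented for illustration, and only function-level `[C: ...]` tags are shown. It uses 1-space indentation, no blank lines inside bodies, single-line `if` statements, aligned assignments, a region block, and type hints on parameters, return types, and the global.

```python
from typing import Final

MAX_RETRIES: Final[int] = 3

#region: Retry helpers
def clamp_retries(requested: int) -> int:
 """Clamp a retry count to the range 0..MAX_RETRIES.
 [C: schedule_retries]"""
 if requested < 0: return 0
 if requested > MAX_RETRIES: return MAX_RETRIES
 return requested

def schedule_retries(requested: int, delay_s: float) -> list[float]:
 """Return one delay per allowed retry, doubling each time.
 [C: worker_loop (hypothetical caller)]"""
 count  = clamp_retries(requested)
 delays = [delay_s * (2 ** i) for i in range(count)]
 return delays
#endregion: Retry helpers
```

For example, `schedule_retries(5, 1.0)` is clamped to three retries and yields `[1.0, 2.0, 4.0]`.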

Data-Oriented Error Handling

The codebase follows the "errors are just cases" framework from Ryan Fleury's The Easiest Way To Handle Errors. The canonical reference (with code examples) is in conductor/code_styleguides/error_handling.md. Key principles:

Result dataclasses instead of Optional[T] or exception-based control flow.
Nil-sentinel dataclasses instead of None.
Zero-initialized fields via @dataclass defaults.
Fail early: validation at the entry point, not deep in the call stack.
AND over OR: return a struct with data + side-channel errors, not a sum type.
Exceptions reserved for the SDK boundary: SDK errors are caught and converted to ErrorInfo dataclasses; the rest of the application works with data, not control flow.
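The principles above can be sketched in a few lines. This is not the code from conductor/code_styleguides/error_handling.md; it is a minimal illustration in which the `FileText` payload, the field names, the error codes, and the non-generic `Result` are all assumptions, and plain file I/O stands in for the SDK boundary.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class ErrorInfo:
 """One side-channel error record (fields are hypothetical)."""
 code: str = ""
 message: str = ""

@dataclass(frozen=True)
class FileText:
 """Payload type; every field is zero-initialized via defaults."""
 path: str = ""
 text: str = ""

# Nil sentinel used instead of None.
NIL_FILE_TEXT: FileText = FileText()

@dataclass
class Result:
 """AND, not OR: the data and the errors travel together."""
 data: FileText = NIL_FILE_TEXT
 errors: list[ErrorInfo] = field(default_factory=list)
 @property
 def ok(self) -> bool: return not self.errors

def read_text(path: str) -> Result:
 """Validate at the entry point; convert boundary exceptions into data."""
 if not path: return Result(errors=[ErrorInfo("empty_path", "path must be non-empty")])
 try:
  with open(path, "r", encoding="utf-8") as fh: text = fh.read()
 except OSError as exc: return Result(errors=[ErrorInfo("io_error", str(exc))])
 return Result(data=FileText(path=path, text=text))
```

A caller in this sketch never checks for `None` and never wraps the call in `try`: it reads `result.data.text` (an empty string on failure) and inspects `result.errors` when it cares why.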

This convention is established incrementally. The 2026-06-11 data_oriented_error_handling_20260606 track applies it to src/mcp_client.py, src/ai_client.py, and src/rag_engine.py. Future tracks will apply it to the remaining src/ files (src/app_controller.py, src/models.py, src/project_manager.py, etc. — see conductor/tracks/data_oriented_error_handling_20260606/spec.md §12.2 for the prioritized list).

`Optional[T]` ban (return types only)

In the 3 refactored files (src/mcp_client.py, src/ai_client.py, src/rag_engine.py), Optional[T] return types are forbidden. Use Result[T] (with a NIL_T singleton if needed) instead. Argument types that may be None (e.g., rag_engine: Optional[Any] = None) remain allowed — they describe a caller choice, not a runtime failure of this function. The audit script scripts/audit_optional_in_3_files.py enforces this rule by failing CI on new Optional[X] return types in the 3 refactored files.
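One way such a check could work is sketched below with the standard `ast` module. This is not the contents of scripts/audit_optional_in_3_files.py; the function names are hypothetical, and the sketch only recognizes the `Optional[X]` and `typing.Optional[X]` spellings (not `X | None` or string annotations). It inspects return annotations only, so `Optional` arguments pass untouched, matching the rule as stated.

```python
import ast

def is_optional_annotation(node: ast.AST) -> bool:
 """True for an Optional[X] or typing.Optional[X] subscript."""
 if not isinstance(node, ast.Subscript): return False
 target = node.value
 if isinstance(target, ast.Name): return target.id == "Optional"
 if isinstance(target, ast.Attribute): return target.attr == "Optional"
 return False

def find_optional_returns(source: str) -> list[tuple[int, str]]:
 """Return (line, function name) for each def whose return type is Optional[...]."""
 hits: list[tuple[int, str]] = []
 for node in ast.walk(ast.parse(source)):
  if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)): continue
  if node.returns is not None and is_optional_annotation(node.returns): hits.append((node.lineno, node.name))
 return sorted(hits)
```

A CI gate built on this idea would read each audited file, call `find_optional_returns`, print the hits, and exit non-zero when the list is not empty.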

Public API deprecation: `ai_client.send()` → `ai_client.send_result()`

The public ai_client.send() is marked @deprecated (via typing_extensions.deprecated). It still works for backward compat but emits a DeprecationWarning at runtime. New code MUST use ai_client.send_result(), which returns Result[str, ErrorInfo] instead of str. Removal is planned in the follow-up public_api_migration_20260606 track.
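The runtime behavior described above can be approximated with the standard library alone. The sketch below does not use `typing_extensions.deprecated`; it swaps in a plain `warnings.warn` call to show the same effect, and `SendResult` is a simplified stand-in for `Result[str, ErrorInfo]`. The echo behavior and the error string are invented for illustration.

```python
import warnings
from dataclasses import dataclass, field

@dataclass
class SendResult:
 """Simplified stand-in for Result[str, ErrorInfo] (hypothetical shape)."""
 text: str = ""
 errors: list[str] = field(default_factory=list)

def send_result(prompt: str) -> SendResult:
 """New-style entry point: returns data plus side-channel errors."""
 if not prompt: return SendResult(errors=["empty prompt"])
 return SendResult(text=f"echo: {prompt}")

def send(prompt: str) -> str:
 """Deprecated shim: warns, then delegates to send_result()."""
 warnings.warn("send() is deprecated; use send_result()", DeprecationWarning, stacklevel=2)
 return send_result(prompt).text
```

`stacklevel=2` makes the warning point at the caller of `send()`, which is the line that needs migrating.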


Testing Requirements

These are the process standards the project's test infrastructure enforces. For the full implementation contract (fixture names, anti-patterns, audit scripts), see docs/guide_testing.md §Structural Testing Contract and the per-styleguide audit scripts in code_styleguides/.

Structural Testing Contract: Ban on arbitrary core mocking with unittest.mock.patch (unless explicitly authorized for a specific boundary test). All integration and end-to-end testing must use the live_gui fixture to interact with a real instance of the application via the Hook API. Bypassing the hook server to directly mutate GUI state in tests is prohibited. All test-generated artifacts (logs, temporary workspaces, mock outputs) MUST be written to tests/artifacts/ or tests/logs/ (gitignored).
Isolated-Pass Verification Fallacy (Added 2026-06-10): A test that "passes when run after test X but fails in isolation" is a fragile test, not a fragile fixture. The flip side is also true: a test that "passes in isolation but fails in batch" is a failing test whose failure is masked by isolation. The only verification that matters for live_gui tests (or any test that depends on shared subprocess state) is the batch run in the suite the test will ship in. Do NOT commit a fix that has only been verified in isolation. The 4-day test-hell saga of 2026-06-06 to 2026-06-10 was the result of agents committing fixes after isolated passes; the bisect had to run in both directions, and the issue was only caught at the suite-level batch green on 2026-06-10. See docs/reports/test_infrastructure_hardening_batch_green_20260610.md for the full incident.
Audit Scripts as CI Gates: The 4 audit scripts (check_test_toml_paths.py, audit_main_thread_imports.py, audit_weak_types.py, audit_no_models_config_io.py) enforce the conventions above. They run as pre-commit/CI gates and exit non-zero on regression. New conventions must be paired with a new audit script per conductor/workflow.md §Audit Script Policy.
Skip Markers Are Documentation, Not Avoidance: @pytest.mark.skip(reason=...) is a record of a known failure, not an escape from fixing the underlying bug. Skip markers are valid for opt-in integration tests (require external resources, env-var-gated) or features behind a feature flag. They are NOT valid for pre-existing failing tests, tests the agent doesn't understand, or racy assertions the agent doesn't want to debug. When you add a skip, document the underlying issue in reason= and commit with a follow-up note. See conductor/workflow.md §Skip-Marker Policy.

Memory Dimensions (added 2026-06-12)

The conversation data has 4 distinct memory dimensions. Features touch 1-2 typically; some touch 3. The dimensions are not interchangeable.

| Dim | Where | What it stores | User-editable | Status |
| --- | --- | --- | --- | --- |
| Curation | `FileItem` + `ContextPreset` | How to render a file | Structural File Editor | Existing, strong |
| Discussion | `disc_entries` + branching + UISnapshot | What was said | GUI `[Edit]` mode; undo/redo | Existing, strong |
| RAG | `src/rag_engine.py` (ChromaDB) | Semantic fingerprints | (opaque) | Opt-in |
| Knowledge | `~/.manual_slop/knowledge/*.md` + per-file + digest | Durable learnings | Plain markdown | Proposed (Candidate 8) |

The product decision. When scoping a new feature, identify which dimension(s) the feature touches. Pick the matching dimension; don't reach for the wrong shape. The full cross-cutting guide is docs/guide_agent_memory_dimensions.md. The canonical styleguide is conductor/code_styleguides/agent_memory_dimensions.md.

The 6 design rules (the product implications).

Curation is structural. Per-file schema; AST-aware; user-edited. Not conversational.
Discussion is conversational. Per-discussion, multi-turn. Not per-file. Not semantic.
RAG is opt-in, fuzzy, semantic. Default-off in new projects. Complements; never replaces. Provenance required. No mutation.
Knowledge is durable, user-editable, provenance-aware. The category files are the source of truth; the digest is a projection. "Delete to turn off": rm digest.md.
Cache hits only on the stable prefix (layers 1-7 of the 12-layer model). The volatile suffix (layers 8-12) is never cached.
Feature flags are data, not config. File presence ("delete to turn off") for side artifacts; config flags for persistent preferences; CLI flags for one-shot overrides.
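The caching rule above (stable prefix cached, volatile suffix not) lends itself to a byte-comparison check. The sketch below assumes the context arrives as an ordered list of 12 layer strings joined by newlines; what each layer contains, the join character, and both function names are assumptions made for illustration.

```python
STABLE_LAYER_COUNT: int = 7  # layers 1-7 form the cacheable prefix

def split_context(layers: list[str]) -> tuple[bytes, bytes]:
 """Return (stable_prefix, volatile_suffix) as UTF-8 bytes."""
 prefix = "\n".join(layers[:STABLE_LAYER_COUNT]).encode("utf-8")
 suffix = "\n".join(layers[STABLE_LAYER_COUNT:]).encode("utf-8")
 return prefix, suffix

def prefix_is_cache_stable(turn_a: list[str], turn_b: list[str]) -> bool:
 """Byte-compare the stable prefixes of two consecutive turns."""
 return split_context(turn_a)[0] == split_context(turn_b)[0]
```

If two turns differ only in layers 8-12, the check passes and a provider-side prefix cache can still hit; any edit that reaches into layers 1-7 makes the prefixes differ and the check fail.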


Product Guidelines: Manual Slop

Documentation Style

UX & UI Principles

Code Standards & Architecture

Phase 5: Heavy Curation & Structural Integrity (MANDATORY)

AI-Optimized Compact Style

Data-Oriented Error Handling

`Optional[T]` ban (return types only)

Public API deprecation: `ai_client.send()` → `ai_client.send_result()`

Testing Requirements

See Also — Applied Conventions

Memory Dimensions (added 2026-06-12)

See Also — Updated (2026-06-12)
