ed 734840375f docs(guidelines): add AI Agent Obligations section with 4 enforcement audit scripts

2026-06-16 10:35:55 -04:00

Product Guidelines: Manual Slop

Documentation Style

Strict & In-Depth: Documentation must follow an old-school, highly detailed technical breakdown style (similar to VEFontCache-Odin). Focus on architectural design, state management, algorithmic details, and structural formats rather than just surface-level usage.

UX & UI Principles

USA Graphics Company Values: Embrace high information density and tactile interactions.
Professional Arcade Aesthetics: Balances high-energy "Arcade" feedback (blinking notifications, tactile updates) with a "Professional" visual discipline. Employs modern typography (Inter/Maple Mono), subtle rounded geometry, and soft shadows to ensure the tool feels like a sophisticated, expert utility. Includes a high-density NERV Technical Console theme option for maximum focus and CRT-inspired visual feedback.
Rich Text Readability: Prioritizes legibility of AI communications and technical logs by utilizing GitHub-Flavored Markdown and integrated syntax highlighting. This ensures that complex code fragments and structured data are immediately accessible and professionally presented.
Explicit Control & Expert Focus: The interface should not hold the user's hand. It must prioritize explicit manual confirmation for destructive actions while providing dense, unadulterated access to logs and context.
Multi-Viewport Capabilities: Leverage dockable, floatable panels to allow users to build custom workspaces suitable for multi-monitor setups.

Code Standards & Architecture

Data-Oriented & Immediate Mode Heuristics: Align with the architectural values of engineers like Casey Muratori and Mike Acton.
- The "Less Python Does, the Better" Rule: Python should act primarily as a procedural semantic definer (similar to how ImGui defines a UI DAG), delegating heavy lifting to efficient data structures, vectorized operations, or lower-level primitives.
- Minimize Python interpreter overhead by favoring bulk data processing over fine-grained object-oriented manipulation.
- The GUI (gui_2.py) must remain a pure visualization of application state. It should not own complex business logic or orchestrator hooks (strive to decouple the 'Application' controller from the 'View').
- Treat the UI as an immediate mode frame-by-frame projection of underlying data structures.
- Optimize for zero lag and never block the main render loop with heavy Python-side work.
- Utilize proper asynchronous batching and queue-based pipelines for background AI work, ensuring a data-oriented flow rather than tangled object-oriented state graphs.
Strict State Management: There must be a rigorous separation between the Main GUI rendering thread and daemon execution threads. The UI should never hang during AI communication or script execution. Use lock-protected queues and events for synchronization.
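
A minimal sketch of the queue-and-event handoff described above, written in the compact style this guide mandates. The names `JobResult`, `worker_loop`, `drain_results`, and `pump_frames` are hypothetical and not taken from the codebase; `pump_frames` only stands in for the render loop. The `queue.Empty` handling is the standard library's own signalling at the queue boundary.

```python
import queue
import threading
import time
from dataclasses import dataclass

@dataclass
class JobResult:
 job_id: int = 0
 text: str = ""

def worker_loop(jobs: queue.Queue[int], results: queue.Queue[JobResult], stop: threading.Event) -> None:
 # Daemon side: wait on the job queue with a timeout so the stop event is re-checked.
 while not stop.is_set():
  try: job_id = jobs.get(timeout=0.05)
  except queue.Empty: continue
  results.put(JobResult(job_id=job_id, text=f"done {job_id}"))

def drain_results(results: queue.Queue[JobResult]) -> list[JobResult]:
 # GUI side: called once per frame, takes whatever is ready and never blocks.
 out: list[JobResult] = []
 while True:
  try: out.append(results.get_nowait())
  except queue.Empty: return out

def pump_frames(results: queue.Queue[JobResult], expected: int, timeout_s: float = 2.0) -> list[JobResult]:
 # Stand-in for the render loop: poll once per simulated frame until all results arrive.
 seen: list[JobResult] = []; deadline: float = time.monotonic() + timeout_s
 while len(seen) < expected and time.monotonic() < deadline:
  seen.extend(drain_results(results)); time.sleep(0.005)
 return seen
```

The render side only ever calls `get_nowait()`, so a slow job delays its result but not the frame.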
Comprehensive Logging: Aggressively log all actions, API payloads, tool calls, and executed scripts. Maintain timestamped JSON-L and markdown logs to ensure total transparency and debuggability.
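
As an illustration of a timestamped JSON-L log, the sketch below appends one JSON object per line. `LogRecord` and `append_record` are hypothetical names, and the field set (`ts`, `kind`, `detail`) is an assumption rather than the project's actual log schema.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from pathlib import Path

@dataclass
class LogRecord:
 ts: str = ""
 kind: str = ""
 detail: str = ""

def append_record(path: Path, kind: str, detail: str) -> LogRecord:
 # One JSON object per line, stamped with a UTC ISO-8601 timestamp.
 record: LogRecord = LogRecord(ts=datetime.now(timezone.utc).isoformat(), kind=kind, detail=detail)
 with path.open("a", encoding="utf-8") as f: f.write(json.dumps(asdict(record)) + "\n")
 return record
```

Because the file is opened in append mode and each record is a single line, a partially written session log stays readable line by line.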
Mandatory ImGui Verification: All changes to the GUI (gui_2.py) MUST be verified using the custom AST linter (scripts/check_imgui_scopes.py) to ensure all ImGui scopes (begin/end, push/pop) are properly matched. Developers should prioritize the use of src/imgui_scopes.py context managers (imscope) over manual push/pop calls.
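
The reason context managers are preferred over manual push/pop can be shown without ImGui at all. The sketch below is not the project's `imscope`; `paired_scope` is a hypothetical helper that takes the push and pop calls as plain callables, so it runs with any stand-in.

```python
from contextlib import contextmanager
from typing import Callable, Iterator

@contextmanager
def paired_scope(push: Callable[[], None], pop: Callable[[], None]) -> Iterator[None]:
 # Pair a push with its pop; the finally clause runs even if the body raises.
 push()
 try: yield
 finally: pop()

def demo() -> list[str]:
 # Record the call order with a list instead of a real ImGui backend.
 calls: list[str] = []
 with paired_scope(lambda: calls.append("push"), lambda: calls.append("pop")):
  calls.append("body")
 return calls
```

With this shape, an unmatched scope is impossible to write by hand, which is the property the AST linter otherwise has to check after the fact.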
Modular Controller Pattern: To prevent "God Object" bloat in core controllers (like AppController), all state-independent or utility logic must be moved to module-level functions. Functions requiring class state should accept the instance as an explicit dependency (def logic(controller: AppController, ...)). Massive if/elif dispatch blocks must be refactored into handler maps (dictionaries) of module-level functions.
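
A small sketch of the handler-map shape, assuming a stripped-down `AppController` that holds only a log list. The action names and handler functions are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class AppController:
 # Stand-in for the real controller; only the state this sketch needs.
 log: list[str] = field(default_factory=list)

def handle_open(controller: AppController, arg: str) -> None:
 controller.log.append(f"open:{arg}")

def handle_close(controller: AppController, arg: str) -> None:
 controller.log.append(f"close:{arg}")

def handle_unknown(controller: AppController, arg: str) -> None:
 controller.log.append(f"unknown:{arg}")

HANDLERS: dict[str, Callable[[AppController, str], None]] = {"open": handle_open, "close": handle_close}

def dispatch(controller: AppController, action: str, arg: str) -> None:
 # One dictionary lookup replaces the if/elif chain; an unknown action is an ordinary case.
 HANDLERS.get(action, handle_unknown)(controller, arg)
```

Adding an action means adding one function and one dictionary entry; the controller class itself does not grow.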
UI Delegation for Hot-Reload: All complex ImGui rendering logic must be extracted from the App class into module-level functions named render_xxx(app: App). The App class should only contain thin delegation wrappers (def _render_xxx(self): render_xxx(self)). This architecture is mandatory for supporting state-preserving hot-reloads of the UI logic.
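
Why thin wrappers preserve state across a reload can be sketched as follows. This is an illustration under assumptions: a `types.SimpleNamespace` named `ui` stands in for the reloadable UI module, and swapping its attribute stands in for the reload itself; the project's actual reload mechanism is not shown in this document.

```python
import types
from dataclasses import dataclass, field

ui: types.SimpleNamespace = types.SimpleNamespace()  # hypothetical stand-in for a reloadable UI module

@dataclass
class App:
 counter: int = 0
 frames: list[str] = field(default_factory=list)
 def _render_status(self) -> None:
  # Thin wrapper: the function is looked up on every call, so a swapped version applies next frame.
  ui.render_status(self)

def render_status_v1(app: App) -> None:
 app.counter += 1; app.frames.append(f"v1 count={app.counter}")

def render_status_v2(app: App) -> None:
 app.counter += 1; app.frames.append(f"v2 count={app.counter}")
```

The `App` instance and its `counter` live outside the swapped code, so replacing the render function changes behavior without resetting state.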
Dependency Minimalism: Limit external dependencies where possible. For instance, prefer standard library modules (like urllib and html.parser for web tools) over heavy third-party packages.
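
For instance, link extraction for a web tool needs nothing beyond `html.parser`. `LinkCollector` and `extract_links` are hypothetical names; fetching the page (for example with `urllib.request.urlopen`) is left out so the sketch runs offline.

```python
from html.parser import HTMLParser
from typing import Optional

class LinkCollector(HTMLParser):
 def __init__(self) -> None:
  super().__init__()
  self.links: list[str] = []
 def handle_starttag(self, tag: str, attrs: list[tuple[str, Optional[str]]]) -> None:
  # attrs is a list of (name, value) pairs; value is None for attributes without a value.
  if tag != "a": return
  for name, value in attrs:
   if name == "href" and value: self.links.append(value)

def extract_links(html: str) -> list[str]:
 parser: LinkCollector = LinkCollector()
 parser.feed(html); parser.close()
 return parser.links
```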

Phase 5: Heavy Curation & Structural Integrity (MANDATORY)

Intensive System Analysis: Align with the standards of low-level systems engineers (Fleury, Acton, Muratori, Blow). Do not accept high-level abstractions as sufficient documentation.
Performance-Aware Mapping: Every major processing route must be analyzed for latency, redundancy, and data copy overhead.
Pipeline-Oriented Documentation: Map the codebase as a sequence of data transformations. Identify exactly where data enters, how it is mutated, and where it exits.
Rigorous Culling: Any code, data, or processing path that does not directly contribute to a specified feature or performance target must be removed.
Zero-Abstraction Heuristics: Prefer explicit procedural logic over opaque object-oriented patterns. Ensure state transitions are traceable and deterministic.

AI-Optimized Compact Style

Indentation: Exactly 1 space per level. This minimizes token usage in nested structures.
Newlines: Maximum one (1) blank line between top-level definitions. Zero (0) blank lines within function or method bodies.
Vertical Compaction: Use single-line if statements, semicolon-separated framework calls (imgui.same_line(); imgui.text(...)), and aligned assignments to aggressively minimize vertical line counts. Note: Function and method definition signatures (def ...:) must ALWAYS remain on their own isolated lines.
Region Blocks: Use #region: Name and #endregion: Name to logically organize massive files that cannot be easily broken apart without increasing context load.
Type Hinting: Mandatory, strict type hints for all parameters, return types, and global variables to ensure high-signal context for AI agents.
Structural Dependency Mapping (SDM): All major state variables, methods, and functions MUST include terse dependency tags at the end of their docstrings for AI-assisted impact analysis.
- Functions/Methods: [C: Caller1, Caller2] (Primary callers).
- State Variables: [M: File:Line, Method] (Mutation points) and [U: File] (Major use paths).

Data-Oriented Error Handling

The codebase follows the "errors are just cases" framework from Ryan Fleury's The Easiest Way To Handle Errors. The canonical reference (with code examples) is in conductor/code_styleguides/error_handling.md. Key principles:

- Result dataclasses instead of Optional[T] or exception-based control flow.
- Nil-sentinel dataclasses instead of None.
- Zero-initialized fields via @dataclass defaults.
- Fail early: validation at the entry point, not deep in the call stack.
- AND over OR: return a struct with data + side-channel errors, not a sum type.
- Exceptions reserved for the SDK boundary: SDK errors are caught and converted to ErrorInfo dataclasses; the rest of the application works with data, not control flow.
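
The principles above can be sketched in a few lines. This is an illustration, not the styleguide's canonical code: `Config`, `NIL_CONFIG`, `ConfigResult`, and `parse_config` are hypothetical, the `ErrorInfo` fields are assumed, and a concrete result type is used in place of a generic one to keep the example short.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class ErrorInfo:
 code: str = ""
 message: str = ""

@dataclass(frozen=True)
class Config:
 name: str = ""
 port: int = 0

NIL_CONFIG: Config = Config()  # nil sentinel: safe to read, carries zero values

@dataclass
class ConfigResult:
 # AND over OR: data and errors travel together in one struct.
 data: Config = NIL_CONFIG
 errors: list[ErrorInfo] = field(default_factory=list)

def parse_config(raw: str) -> ConfigResult:
 # Fail early at the entry point; errors ride alongside the data instead of being raised.
 name, sep, port_text = raw.partition(":")
 if not sep: return ConfigResult(errors=[ErrorInfo("format", "expected name:port")])
 if not port_text.isdigit(): return ConfigResult(errors=[ErrorInfo("port", f"not a number: {port_text}")])
 return ConfigResult(data=Config(name=name, port=int(port_text)))
```

A caller can read `result.data.port` unconditionally: on failure it is the sentinel's zero value, and `result.errors` says why.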

This convention is established incrementally. The 2026-06-11 data_oriented_error_handling_20260606 track applies it to src/mcp_client.py, src/ai_client.py, and src/rag_engine.py. Future tracks will apply it to the remaining src/ files (src/app_controller.py, src/models.py, src/project_manager.py, etc. — see conductor/tracks/data_oriented_error_handling_20260606/spec.md §12.2 for the prioritized list).

Audit: the convention is enforced via scripts/audit_exception_handling.py (static analyzer; file-presence = enabled per feature_flags.md). Run uv run python scripts/audit_exception_handling.py for a human-readable report or --json for machine-readable output. The audit classifies each try/except/finally/raise site against 10 categories (5 compliant + 3 violation + 1 suspicious + 1 unclear); see the styleguide's "Audit Script" section for the full taxonomy.

AI Agent Obligations (Added 2026-06-16)

AI agents writing code in this codebase MUST follow the data-oriented convention. The convention is the OPPOSITE of idiomatic Python; LLMs are trained on idiomatic Python and will revert to it without explicit guidance. The project enforces the convention through 4 mechanisms:

1. conductor/code_styleguides/error_handling.md: the canonical styleguide. Has 5 patterns, 3 boundary types, 1 broad-except distinction rule, 1 constructor-raise rule, 1 re-raise rule, and the audit script reference. Read this before writing any code that can fail at runtime.
2. conductor/code_styleguides/error_handling.md "AI Agent Checklist": the explicit cheatsheet of 5 MUST-DO rules, 7 MUST-NOT-DO rules, and 3 boundary patterns. Run this checklist before claiming a task is done.
3. scripts/audit_exception_handling.py: the static analyzer that catches violations before commit. The script classifies try/except/finally/raise sites against 10 categories. Use it pre-commit.
4. scripts/audit_exception_handling.py --strict: the CI gate. Exits 1 on any violation. Wire this into pre-commit hooks and CI.

The 4 enforcement audit scripts (the project-level enforcement set):

| Script | Purpose | Default mode |
|---|---|---|
| `audit_exception_handling.py` | Classifies `try/except/finally/raise` sites per the data-oriented convention | Informational (exits 0) |
| `audit_exception_handling.py --strict` | CI gate: exits 1 on any violation | CI gate (exits 1) |
| `audit_weak_types.py` | Identifies `dict[str, Any]` / `list[dict[...]]` / `Optional[Tuple]` / etc. | Informational (exits 0) |
| `audit_weak_types.py --strict` | CI gate for the type-strengthening convention | CI gate (exits 1) |
| `audit_main_thread_imports.py` | Enforces the main-thread import graph purity invariant | Always strict (exits 1) |
| `audit_no_models_config_io.py` | Enforces config-I/O ownership (AppController is the single source of truth) | Always strict (exits 1) |

Pre-commit workflow (recommended):

# Run before claiming "done"
uv run python scripts/audit_exception_handling.py
uv run python scripts/audit_weak_types.py
uv run python scripts/audit_main_thread_imports.py
uv run python scripts/audit_no_models_config_io.py

# In CI / pre-commit hook (exits 1 on any violation)
uv run python scripts/audit_exception_handling.py --strict
uv run python scripts/audit_weak_types.py --strict
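
If the gates are wired into a single hook, one possible shape is a small runner that executes every command and reports a combined exit code. `run_gates` is a hypothetical helper, not a script that exists in the repository.

```python
import subprocess
import sys

def run_gates(commands: list[list[str]]) -> int:
 # Run every gate even after a failure so the report is complete; return 1 if any failed.
 failed: int = 0
 for cmd in commands:
  code: int = subprocess.run(cmd).returncode
  if code != 0: failed += 1; print(f"FAIL ({code}): {' '.join(cmd)}")
 return 1 if failed else 0

def demo_commands() -> list[list[str]]:
 # Stand-in commands so the sketch runs anywhere; one passes, one fails.
 return [[sys.executable, "-c", "raise SystemExit(0)"], [sys.executable, "-c", "raise SystemExit(1)"]]
```

In a real hook the command list would hold the gate invocations shown above, for example `["uv", "run", "python", "scripts/audit_weak_types.py", "--strict"]`.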

Why this is enforced: the convention prevents "tech rot with idiomatic Python." LLMs writing new code in this codebase will revert to idiomatic patterns (try/except, Optional[T], raise Exception) without explicit guidance. The 4 enforcement mechanisms (styleguide + checklist + audit script + CI gate) are the defense-in-depth. See docs/AGENTS.md §"Convention Enforcement" for the project-level rules and AGENTS.md "Critical Anti-Patterns" for the HARD BAN entries.

`Optional[T]` ban (return types only)

In the 3 refactored files (src/mcp_client.py, src/ai_client.py, src/rag_engine.py), Optional[T] return types are forbidden. Use Result[T] (with a NIL_T singleton if needed) instead. Argument types that may be None (e.g., rag_engine: Optional[Any] = None) remain allowed — they describe a caller choice, not a runtime failure of this function. The audit script scripts/audit_optional_in_3_files.py enforces this rule by failing CI on new Optional[X] return types in the 3 refactored files.
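
The distinction between return types and argument types is easy to check mechanically. The sketch below is not `scripts/audit_optional_in_3_files.py`; it is a hypothetical, deliberately crude check that walks the AST and flags any return annotation whose text contains `Optional`, while ignoring argument annotations.

```python
import ast

SAMPLE: str = (
 "from typing import Optional\n"
 "def a(x: Optional[int] = None) -> int:\n"
 " return 0\n"
 "def b() -> Optional[str]:\n"
 " return None\n"
)

def optional_return_sites(source: str) -> list[str]:
 # Walk every function definition and flag return annotations that mention Optional.
 hits: list[str] = []
 for node in ast.walk(ast.parse(source)):
  if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)): continue
  if node.returns is None: continue
  if "Optional" in ast.unparse(node.returns): hits.append(f"{node.name}:{node.lineno}")
 return hits
```

On `SAMPLE`, function `a` passes because its `Optional` is on an argument, and function `b` is reported because its `Optional` is on the return type. The substring match is an assumption made for brevity; a real audit would resolve the annotation properly.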

Public API: `ai_client.send_result()` (RESOLVED 2026-06-15)

The public ai_client.send_result() is the canonical public API. It returns Result[str, ErrorInfo]. The legacy ai_client.send() was removed in the public_api_migration_and_ui_polish_20260615 track on 2026-06-15 (see conductor/tracks/public_api_migration_and_ui_polish_20260615/spec.md). All production call sites and tests now use send_result().

Testing Requirements

These are the process standards the project's test infrastructure enforces. For the full implementation contract (fixture names, anti-patterns, audit scripts), see docs/guide_testing.md §Structural Testing Contract and the per-styleguide audit scripts in code_styleguides/.

Structural Testing Contract: Ban on arbitrary core mocking with unittest.mock.patch (unless explicitly authorized for a specific boundary test). All integration and end-to-end testing must use the live_gui fixture to interact with a real instance of the application via the Hook API. Bypassing the hook server to directly mutate GUI state in tests is prohibited. All test-generated artifacts (logs, temporary workspaces, mock outputs) MUST be written to tests/artifacts/ or tests/logs/ (gitignored).
Isolated-Pass Verification Fallacy (Added 2026-06-10): A test that "passes when run after test X but fails in isolation" is a fragile test, not a fragile fixture. The flip side is also true: a test that "passes in isolation but fails in batch" is failing; its failure is only masked by isolation. The only verification that matters for live_gui tests (or any test that depends on shared subprocess state) is the batch run in the suite the test will ship in. Do NOT commit a fix that has only been verified in isolation. The 4-day test-hell saga of 2026-06-06 to 2026-06-10 was the result of agents committing fixes after isolated passes; the bisect had to run in both directions, and the issue was only caught at the suite-level batch green on 2026-06-10. See docs/reports/test_infrastructure_hardening_batch_green_20260610.md for the full incident.
Audit Scripts as CI Gates: The 4 audit scripts (check_test_toml_paths.py, audit_main_thread_imports.py, audit_weak_types.py, audit_no_models_config_io.py) enforce the conventions above. They run as pre-commit/CI gates and exit non-zero on regression. New conventions must be paired with a new audit script per conductor/workflow.md §Audit Script Policy.
Skip Markers Are Documentation, Not Avoidance: @pytest.mark.skip(reason=...) is a record of a known failure, not an escape from fixing the underlying bug. Skip markers are valid for opt-in integration tests (require external resources, env-var-gated) or features behind a feature flag. They are NOT valid for pre-existing failing tests, tests the agent doesn't understand, or racy assertions the agent doesn't want to debug. When you add a skip, document the underlying issue in reason= and commit with a follow-up note. See conductor/workflow.md §Skip-Marker Policy.

Memory Dimensions (added 2026-06-12)

The conversation data has 4 distinct memory dimensions (curation / discussion / RAG / knowledge). Features touch 1-2 typically; some touch 3. The dimensions are not interchangeable.

The full canonical 4-dim table is in conductor/code_styleguides/agent_memory_dimensions.md §0 (with the SSDL shape tag per dim + per-dim deep-dives + the decision tree). This section is the product-level summary.

The one-line summary: curation is per-file structural; discussion is per-turn conversational; RAG is opt-in semantic; knowledge is per-project durable. Pick the matching dimension; don't reach for the wrong shape.

The cross-cutting guide is docs/guide_agent_memory_dimensions.md. The canonical styleguide is conductor/code_styleguides/agent_memory_dimensions.md.

The 6 design rules (the product implications).

1. Curation is structural. Per-file schema; AST-aware; user-edited. Not conversational.
2. Discussion is conversational. Per-discussion, multi-turn. Not per-file. Not semantic.
3. RAG is opt-in, fuzzy, semantic. Default-off in new projects. Complements; never replaces. Provenance required. No mutation.
4. Knowledge is durable, user-editable, provenance-aware. The category files are the source of truth; the digest is a projection. "Delete to turn off": rm digest.md.
5. Cache hits only on the stable prefix (layers 1-7 of the 12-layer model). The volatile suffix (layers 8-12) is never cached.
6. Feature flags are data, not config. File presence ("delete to turn off") for side artifacts; config flags for persistent preferences; CLI flags for one-shot overrides.
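
The "delete to turn off" rule can be sketched with nothing but a path check. The function names are hypothetical, and placing `digest.md` directly in the project directory is an assumption made for the example.

```python
from pathlib import Path

def digest_enabled(project_dir: Path) -> bool:
 # File presence is the flag: there is no separate config entry to keep in sync.
 return (project_dir / "digest.md").is_file()

def load_digest(project_dir: Path) -> str:
 # An absent file is an ordinary case: return the empty string rather than raising.
 path: Path = project_dir / "digest.md"
 if not path.is_file(): return ""
 return path.read_text(encoding="utf-8")
```

Removing the file is the whole "off" switch; the next call simply sees no digest.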

