conductor(track-update): data_oriented_error_handling - nagent_review + docs refresh
3 surgical additions to the spec, no task changes: 1. New ErrorKind: Added PROVIDER_HISTORY_DIVERGED_FROM_UI to the ErrorKind enum. Per nagent_review Pitfall #4 (provider history divergence: user edits disc_entries[i].content via the discussion UI but ai_client._<provider>_history still replays the original). The new kind makes the divergence *detectable* and *reportable* so the follow-up public_api_migration_20260606 track can collapse the two history layers. The Result pattern from this track is the natural carrier for the signal. 2. State-delegation regression tests: Added mandatory regression tests to the testing strategy in §6 for the ai_client refactor (highest-risk phase). The new tests exercise: - app.temperature = 0.5 round-trips through App.__getattr__/ __setattr__ delegation (per gui_2.py:666-675) - controller.disc_entries[i].content is reflected in the next send_result()'s messages parameter - The 3 per-provider history locks serialize correctly under concurrent send_result() calls The reason this is mandatory: per guide_state_lifecycle.md (added 2026-06-08), the App.__getattr__/__setattr__ pattern means a partial refactor manifests as silent AttributeError deep in test code, not at the refactor commit boundary. 3. See Also cross-references: Added 6 new entries to §12.3: - docs/guide_ai_client.md (per-provider history globals) - docs/guide_mcp_client.md (3-layer security model) - docs/guide_state_lifecycle.md (3 per-thread + 7-lock pattern) - docs/guide_discussions.md (23-operation matrix) - docs/guide_context_aggregation.md (build_discussion_section) - conductor/tracks/mcp_architecture_refactor_20260606/ - conductor/tracks/nagent_review_20260608/{report,takeaways}.md No plan.md changes. Plan tasks are task-level and will flow from the spec changes when the track is re-planned.
This commit is contained in:
@@ -115,6 +115,16 @@ class ErrorKind(str, Enum):
|
||||
UNKNOWN = "unknown"
|
||||
CONFIG = "config"
|
||||
INTERNAL = "internal"
|
||||
# Added 2026-06-08 per nagent_review Pitfall #4 (provider history divergence).
|
||||
# The Application edits the entry's content (e.g., user fixes a typo in an AI
|
||||
# response, or branches at a midpoint via guide_discussions.md §"Per-Entry
|
||||
# Operations" A1+A4) but the ai_client._<provider>_history (the bytes
|
||||
# actually replayed to the LLM) still contains the original. This is
|
||||
# silent corruption, not a thrown error. The PROVIDER_HISTORY_DIVERGED_FROM_UI
|
||||
# kind makes the divergence *detectable* and *reportable* so the follow-up
|
||||
# public_api_migration_20260606 track can collapse the two history layers
|
||||
# (see §12.1).
|
||||
PROVIDER_HISTORY_DIVERGED_FROM_UI = "provider_history_diverged_from_ui"
|
||||
|
||||
@dataclass(frozen=True)
|
||||
class ErrorInfo:
|
||||
@@ -479,7 +489,7 @@ All existing configs (`config.toml`, `credentials.toml`, per-project TOML) work
|
||||
|---|---|---|
|
||||
| `tests/test_result_types.py` | `Result`, `ErrorInfo`, nil-sentinel singletons. | 100% |
|
||||
| `tests/test_mcp_client_paths.py` | Verify `_resolve_and_check` returns `Result` (not tuple); verify `read_file` returns `Result[str]`. | 90% (covers the new code paths; existing tests still pass) |
|
||||
| `tests/test_ai_client_result.py` | Verify `_send_<vendor>_result()` returns `Result`; verify `send_result()` is the new public API; verify `send()` emits `DeprecationWarning`. | 90% |
|
||||
| `tests/test_ai_client_result.py` | Verify `_send_<vendor>_result()` returns `Result`; verify `send_result()` is the new public API; verify `send()` emits `DeprecationWarning`. **State-delegation regression tests (added 2026-06-08 per `docs/guide_state_lifecycle.md` and the 2026-06-08 docs refresh):** verify that `app.temperature = 0.5` round-trips through the `App.__getattr__`/`__setattr__` delegation (per `gui_2.py:666-675`) and is visible in the next `send_result()` call; verify that `controller.disc_entries[i].content = "..."` is reflected in the next `send_result()`'s `messages` parameter (this is the regression vector for nagent_review Pitfall #4, the provider-history divergence); verify that the 3 per-provider history locks (`_anthropic_history_lock`, `_deepseek_history_lock`, `_minimax_history_lock` per `ai_client.py:124,128,132`) serialize correctly under concurrent `send_result()` calls from different threads. These tests are *mandatory* for Phase 3 (the ai_client refactor) because the `App.__getattr__`/`__setattr__` delegation means a partial refactor would manifest as silent `AttributeError`s deep in the test, not at the refactor commit boundary. | 90% |
|
||||
| `tests/test_rag_engine_result.py` | Verify RAG methods return `Result`; verify `NilRAGState` is used. | 80% |
|
||||
| `tests/test_deprecation_warnings.py` | Verify `ai_client.send()` emits exactly one `DeprecationWarning` per call site (cached after first). | 100% |
|
||||
| `tests/test_mcp_client.py` (existing) | Verify no regressions; existing tests pass unchanged. | 100% (regression) |
|
||||
@@ -683,10 +693,15 @@ Plus ~50+ test files that call `send()` directly. The follow-up track's `rg "ai_
|
||||
|
||||
### 12.3 Project References
|
||||
|
||||
- `docs/guide_ai_client.md` — current provider architecture; will be updated in Phase 5.
|
||||
- `docs/guide_mcp_client.md` — current MCP client architecture; will be updated in Phase 5.
|
||||
- `conductor/product-guidelines.md` "Modular Controller Pattern" — the convention this track extends (Data-Oriented Error Handling is a new top-level convention in the same family).
|
||||
- `conductor/tracks/qwen_llama_grok_integration_20260606/` — the previous track that introduced the "data-oriented" framing; this track extends that philosophy to error handling.
|
||||
- `docs/guide_ai_client.md` — current provider architecture; will be updated in Phase 5. The per-provider history globals (`_anthropic_history`, `_deepseek_history`, `_minimax_history` at `ai_client.py:123-132`) are the **specific pattern** that the `ErrorKind.PROVIDER_HISTORY_DIVERGED_FROM_UI` new error kind (added 2026-06-08) is designed to surface. Per `guide_ai_client.md §"State"`, the per-provider-lock pattern is the established convention.
|
||||
- `docs/guide_mcp_client.md` — current MCP client architecture; will be updated in Phase 5. Per the 2026-06-08 docs refresh, `guide_mcp_client.md` documents the 3-layer security model (Allowlist Construction → Path Validation → Resolution Gate) that the mcp_client refactor must preserve. The new `Result` return type must not weaken the 3 layers.
|
||||
- `docs/guide_state_lifecycle.md` — added 2026-06-08. The 3 per-thread + 7-lock pattern documented in §4 ("State Synchronization Across Threads") is what the `ai_client` refactor's state-delegation regression tests must exercise.
|
||||
- `docs/guide_discussions.md` — added 2026-06-08. The 23-operation matrix (A1-A7 + B1-B11 + C1-C5) is the *user-facing* source of truth for what the per-entry edit operations do. The provider-history-divergence issue (Pitfall #4 from the nagent_review) is exactly that: user edits `disc_entries[i].content` via A1, but `ai_client._<provider>_history` is not updated. The follow-up `public_api_migration_20260606` is the natural moment to fix this.
|
||||
- `docs/guide_context_aggregation.md` — added 2026-06-08. The `aggregate.py:109 build_discussion_section` consumes the `disc_entries` list. If the entries are edited via A1, the section regenerates correctly. If the provider history is *not* updated, the next LLM call still sees the old history. The `Result` pattern from this track is the natural carrier for the "diverged" signal.
|
||||
- `conductor/tracks/qwen_llama_grok_integration_20260606/` — the previous track that introduced the "data-oriented" framing; this track extends that philosophy to error handling. The qwen track's `send_openai_compatible()` helper is *expected* to return `Result` from day 1 (per the coordination note in the qwen spec §3.1) — this is a real concrete dependency.
|
||||
- `conductor/tracks/mcp_architecture_refactor_20260606/` — the next major track (after this one). Each sub-MCP's `invoke()` returns `Result[str, ErrorInfo]` per the mcp spec; this track defines the `Result` type that the mcp refactor uses. Coordination: this track ships *before* the mcp refactor can ship Phase 4 (extract Python) onward.
|
||||
- `conductor/tracks/nagent_review_20260608/report.md` — added 2026-06-08. §15 Pitfalls #2 and #4 (per-provider history globals, stateful singleton) and Pitfall #9 (sub-conversations) inform this track's risk register. Pitfall #4 specifically motivates the new `ErrorKind.PROVIDER_HISTORY_DIVERGED_FROM_UI` kind.
|
||||
- `conductor/tracks/nagent_review_20260608/nagent_takeaways_20260608.md` — added 2026-06-08. §9 ("Edit-the-input, not the output") describes the same provider-history-divergence problem; the `Result` pattern + the new error kind are the data-oriented solution.
|
||||
- `conductor/tracks/test_batching_refactor_20260606/` — the previous track that established the "tier-based" pattern; this track uses the same convention format (spec + metadata + state + plan).
|
||||
|
||||
### 12.4 External References
|
||||
|
||||
Reference in New Issue
Block a user