Commit Graph

199 Commits

Author SHA1 Message Date
ed d86131d951 conductor(plan): Mark Task 5.2 + 5.3 complete (Phase 5 verification)
Final grep: 0 send_result in active code. 3 historical refs in
error_handling.md (intentional, in the 'Historical deprecation' note).

Test verification: 100/101 tests pass in the 26 files renamed by this
track. 1 pre-existing failure in test_headless_service.py due to
missing credentials.toml (verified against origin/master baseline
where it also fails - unrelated to the rename).
2026-06-17 01:14:24 -04:00
ed ea7d794a6b conductor(plan): Mark Task 5.2 + 5.3 complete (Phase 5 verification done)
Final grep: 0 send_result in active code. 3 historical refs in
error_handling.md (intentional, in the 'Historical deprecation' note).

Test verification: 100/101 tests pass in the 26 files renamed by this
track. 1 pre-existing failure in test_headless_service.py due to
missing credentials.toml (verified against origin/master baseline
where it also fails - unrelated to the rename).

7 broader suite failures all pre-existing (all FileNotFoundError on
credentials.toml, confirmed against origin/master baseline).

Track verification:
- git grep send_result: 0 in active code (3 historical intentional)
- Full test suite: matches pre-rename baseline (7 pre-existing failures
  unrelated to the rename, 0 new regressions)
2026-06-17 01:13:25 -04:00
ed 5cc422b34b conductor(plan): Mark Task 5.1 complete (Phase 5 docs done) 2026-06-17 00:51:07 -04:00
ed 9b5011231c docs(ai_client): rename send_result to send in 3 current docs
Doc consistency: guide_ai_client.md, guide_app_controller.md, and
the error_handling styleguide now reference the new symbol name.

Also fixes two consistency issues in error_handling.md introduced by
the mechanical rename:
1. The 'Deprecation: send -> send_result' section (lines 623-642) was
   rewritten as a 'Historical deprecation (added 2026-06-15, reverted
   2026-06-16)' note that points to the relevant track specs.
2. Line 204 (the 'Current State Audit' summary for src/ai_client.py)
   had a self-contradictory claim ('send() is the new public API;
   send() is @deprecated') after the rename. Updated to describe
   the canonical public API.

Historical archives (conductor/tracks/*/spec.md, conductor/tracks/*/plan.md,
docs/reports/*) are NOT modified - they document the 2026-06-15
public_api_migration decision and stay as historical record.
2026-06-17 00:50:36 -04:00
ed d17d8743dd conductor(plan): Mark Task 4.1 complete (Phase 4 done) 2026-06-17 00:45:44 -04:00
ed ada9617308 test(ai_client): rename send_result to send in 22 remaining test files
Batch rename of 22 test files. 62 references renamed total.

The full test suite is now GREEN again, matching the pre-rename baseline
from Task 1.1. Pure mechanical rename. No behavior change.

Files affected: test_ai_cache_tracking, test_ai_client_cli,
test_ai_client_result, test_api_events, test_context_pruner,
test_deepseek_provider, test_gemini_cli_* (3 files), test_gui2_mcp,
test_headless_* (2 files), test_live_gui_integration_v2,
test_orchestration_logic, test_phase6_engine, test_rag_integration,
test_run_worker_lifecycle_abort, test_spawn_interception_v2,
test_symbol_parsing, test_tier4_interceptor, test_tiered_aggregation,
test_token_usage.

Note: spec estimated 24 files; actual is 22 (test_deprecation_warnings
no longer exists, and 1 fewer file than spec's list).

Refs: conductor/tracks/send_result_to_send_20260616/
2026-06-17 00:38:29 -04:00
ed 2f45bc4d68 conductor(plan): Mark Task 3.5 + 3.6 complete (Phase 3 done) 2026-06-17 00:35:32 -04:00
ed 53b35de5c6 conductor(plan): Mark Task 3.4 complete 2026-06-17 00:34:00 -04:00
ed 58fe3a9cb5 conductor(plan): Mark Task 3.3 complete 2026-06-17 00:33:00 -04:00
ed 6dbba46a25 conductor(plan): Mark Task 3.2 complete 2026-06-17 00:31:33 -04:00
ed f0663fda6a conductor(plan): Mark Task 3.1 complete 2026-06-17 00:29:54 -04:00
ed 3e2b4f74ba test(ai_client): rename send_result to send in test_conductor_engine_v2
22 references renamed (mostly monkeypatch.setattr calls + comments).
Test file state: GREEN. All 10 tests in this file now pass.
2026-06-17 00:29:21 -04:00
ed d714d10fd4 conductor(plan): Mark Task 2.1 complete 2026-06-17 00:28:17 -04:00
ed d87d909f7b refactor(ai_client): rename send_result to send in 5 src/ call sites
Renames 10 references across app_controller, conductor_tech_lead,
mcp_client (docstring example), multi_agent_conductor, orchestrator_pm.

5 call sites: ai_client.send_result(...) -> ai_client.send(...)
3 print strings mentioning send_result
1 docstring comment (conductor_tech_lead)
1 docstring example (mcp_client) 'src.ai_client.send_result' -> 'src.ai_client.send'

Test suite state: still red, but all src/-level call sites are now
renamed. Remaining failures are in test files (mocks and patches
that still reference send_result).

Refs: conductor/tracks/send_result_to_send_20260616/
2026-06-17 00:27:47 -04:00
ed 4a59567939 conductor(plan): Mark Task 1.1 complete 2026-06-17 00:26:05 -04:00
ed 5351389fc0 refactor(ai_client): rename send_result to send (the impl, TDD red moment)
The TDD red moment. The implementation is renamed but the call sites
in src/, tests/, and docs still use send_result. Subsequent commits
rename the call sites and progressively move the test suite back to
green.

10 references renamed in src/ai_client.py:
- 4 'Called by: send_result' docstring tags in private provider helpers
- 1 function definition (def send_result -> def send)
- 1 [C: ...] SDM tag referencing test function names
- 2 monitor component names (start_component / end_component)
- 2 error source strings (CONFIG + INTERNAL)

Also adds scripts/tier2/apply_t1_1_edits.py - the helper script that
applied the 10 edits. Kept in scripts/tier2/ as a record of the
mechanical change pattern.

Refs: conductor/tracks/send_result_to_send_20260616/
2026-06-17 00:23:16 -04:00
ed cba5457b9d feat(tier2): add run_tier2_sandboxed.ps1 launcher with restricted token (skeleton) 2026-06-16 19:49:47 -04:00
ed a9be60ae50 feat(tier2): add setup_tier2_clone.ps1 bootstrap script with -WhatIf support 2026-06-16 19:47:06 -04:00
ed 796da0de60 feat(tier2): add run_track.py CLI with init/status/report modes + git fetch/switch 2026-06-16 19:27:08 -04:00
ed 73ab2778ca feat(report): implement write_failure_report + 8 tests, 100% coverage 2026-06-16 19:13:30 -04:00
ed 190766fe03 feat(failcount): add default failcount.toml thresholds 2026-06-16 19:01:31 -04:00
ed fc92e1aa74 feat(failcount): add FailcountState + FailcountConfig dataclasses + all stub functions 2026-06-16 18:59:38 -04:00
ed 9f2ff29c2e feat(tier2): create scripts/tier2/ package 2026-06-16 18:57:09 -04:00
ed b90d4bdd4e feat(scripts): add --ci alias for --strict + CI-gate doc updates 2026-06-16 10:40:21 -04:00
ed 4521a7df96 feat(scripts): add --summary and --by-size modes to exception_handling audit 2026-06-16 09:41:20 -04:00
ed 9a04153abd feat(scripts): add exception_handling audit script (10-category classification) 2026-06-16 09:06:25 -04:00
ed 125a226525 was called rest 2026-06-15 20:10:18 -04:00
ed 48b47d250c oops 2026-06-15 20:04:35 -04:00
ed 4419922bce review batch script 2026-06-15 20:02:36 -04:00
ed c00161a13d Adjust audi_line_count.py to take docstrings into account 2026-06-12 22:47:58 -04:00
ed 1577cca568 fix(audit): remove stale 'gemini_native' from deferred-vendors exclusion
The previous exclusion list had 'gemini_native' which is
NOT a real function name in src/ai_client.py. The actual
function is _send_gemini_cli (already migrated to
run_with_tool_loop via send_func + on_pre_dispatch in
commit 4748d134).

The current deferred vendors are now correctly:
  - anthropic (uses anthropic SDK)
  - gemini (uses google-genai streaming)
  - deepseek (uses requests.post)

These will be addressed in Phase 5 t5_6/7/8. When those
ship, the DEFERRED_VENDORS frozenset should be emptied
so the audit gates the migration.

Verified: script still passes; gemini_cli's run_with_tool_loop
usage is detected correctly.
2026-06-11 21:30:04 -04:00
ed be5056051a feat(audit): add scripts/audit_providers_source_of_truth.py
Phase 2 task 2.4 (the script part). The script enforces:
PROVIDERS is declared as a literal only in src/ai_client.py.
The __getattr__ re-export in src/models.py is allowed (it
lazy-imports, not a literal declaration).

Catches the literal pattern 'PROVIDERS: List[str] = ['
specifically, which the __getattr__ re-export does not
match.

OK: passes against current state where PROVIDERS is
declared only in src/ai_client.py:56.
2026-06-11 16:44:59 -04:00
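
The literal-pattern check described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the contents of scripts/audit_providers_source_of_truth.py: the function names, the in-memory `sources` mapping, and the exact regular expression are assumptions; only the pattern text 'PROVIDERS: List[str] = [' and the allowed path src/ai_client.py come from the commit message.

```python
import re

# Matches a literal declaration such as "PROVIDERS: List[str] = [".
# The regex is an assumed rendering of the pattern quoted in the commit.
PROVIDERS_LITERAL = re.compile(
    r"^\s*PROVIDERS\s*:\s*List\[str\]\s*=\s*\[", re.MULTILINE
)


def find_literal_declarations(sources: dict[str, str]) -> list[str]:
    """Return the paths whose text declares PROVIDERS as a list literal."""
    return sorted(
        path for path, text in sources.items() if PROVIDERS_LITERAL.search(text)
    )


def audit(sources: dict[str, str], allowed: str = "src/ai_client.py") -> list[str]:
    """Return every path other than `allowed` that declares the literal."""
    return [path for path in find_literal_declarations(sources) if path != allowed]
```

A lazy `__getattr__` re-export contains no annotated list literal, so it does not match the pattern and is not reported.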
ed 7e4503f4e8 feat(audit): add scripts/audit_no_inline_tool_loops.py + state.toml Phase 1 progress
Task 1.8 (the plan's numbering: 'Add audit script'). Audit checks
that no _send_<vendor> in src/ai_client.py contains an inline
'for round_idx in range(MAX_TOOL_ROUNDS' loop. The audit excludes
the 4 vendored-call-path vendors (anthropic, gemini, gemini_native,
deepseek) which are documented in state.toml's deferred_work
section as future work (they use their own SDKs and need
separate per-vendor conversion to OpenAICompatibleRequest).

state.toml:
- t1_7 (Apply to 4 inline-loop vendors): completed for
  _send_gemini_cli only. Anthropic + Gemini + DeepSeek deferred.
- t1_8 (Add audit script): in_progress.
- t1_7 reuses commit 4748d134 (the send_func + on_pre_dispatch
  refactor that introduced the new helper pattern for
  vendored call paths).

OK: audit passes against the current 4 OpenAI-compat vendors
(minimax, grok, llama, qwen; qwen still uses _dashscope_call but
has no inline loop) + gemini_cli.
2026-06-11 16:17:23 -04:00
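
One way to implement the inline-loop check described in the commit above is an AST walk. The sketch below is an assumption about the approach, not the contents of scripts/audit_no_inline_tool_loops.py: the function name and the `deferred` parameter are hypothetical, and the real script may use text matching instead of `ast`.

```python
import ast


def functions_with_inline_loops(source: str, deferred: frozenset[str]) -> list[str]:
    """Return names of _send_<vendor> functions that loop over range(MAX_TOOL_ROUNDS...).

    `deferred` holds vendor suffixes that are exempt from the check.
    """
    offenders = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if not node.name.startswith("_send_"):
            continue
        if node.name[len("_send_"):] in deferred:
            continue
        for inner in ast.walk(node):
            # Look for "for ... in range(<expression mentioning MAX_TOOL_ROUNDS>)".
            if (
                isinstance(inner, ast.For)
                and isinstance(inner.iter, ast.Call)
                and isinstance(inner.iter.func, ast.Name)
                and inner.iter.func.id == "range"
                and any(
                    isinstance(n, ast.Name) and n.id == "MAX_TOOL_ROUNDS"
                    for n in ast.walk(inner.iter)
                )
            ):
                offenders.append(node.name)
                break
    return sorted(offenders)
```

Passing the exclusion set as a parameter means it can be emptied once the deferred vendors are migrated, at which point the audit covers every `_send_<vendor>` function.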
ed 749120d239 feat(audit): flag hardcoded workspace and project-root paths in tests 2026-06-09 17:01:14 -04:00
ed 488ae04459 fix(run_tests_batched): detect batch failure from output when proc.returncode is wrong 2026-06-08 02:03:50 -04:00
ed 5c6eb620a1 fix(run_tests_batched): colorize non-xdist format (tests/... STATUS), filter 'Error during log pruning' noise 2026-06-08 01:54:56 -04:00
ed 272b7841ae fix(run_tests_batched): filter xdist scheduling queue output (test paths without status prefix) 2026-06-08 01:51:07 -04:00
ed a2d16541d0 fix(run_tests_batched): keep pytest's full -v output, only filter LogPruner/win errors, colorize per-test status 2026-06-08 01:49:39 -04:00
ed 21cb57b31d fix(run_tests_batched): graceful xdist fallback, live progress streaming, ANSI colors, absolute default paths 2026-06-08 01:28:53 -04:00
ed 50f26f0d5c chore: delete legacy run_tests_batched.py (was preserved for one cycle) 2026-06-08 01:15:12 -04:00
ed e6ad2ecda2 chore: preserve old run_tests_batched.py as .legacy for one cycle 2026-06-08 00:59:49 -04:00
ed 2c3a0512f2 feat(run_tests_batched): full CLI with --tiers, --durations, actual pytest execution 2026-06-08 00:58:53 -04:00
ed 57285d048b feat(run_tests_batched): add --plan and --audit modes (Phase 1 stub) 2026-06-08 00:50:37 -04:00
ed dd48c095b8 refactor(tests): move test_categorizer library from scripts/ to tests/ 2026-06-08 00:15:19 -04:00
ed 4d6464324f feat(scripts): add CategoryRecord data model for test categorization 2026-06-08 00:11:22 -04:00
ed 746dde8286 push latest related to default layout 2026-06-07 23:50:24 -04:00
ed 7bcb5a8c07 refactor(config): Route all config I/O through AppController
Eliminates 22 call sites that bypassed the AppController state owner
and read/wrote config.toml directly. AppController is now the single
source of truth for self.config; gui_2.py, commands.py, etc. go
through controller.save_config() / controller.load_config().

Production changes:
- src/models.py: rename load_config -> _load_config_from_disk,
  save_config -> _save_config_to_disk (private I/O primitives)
- src/app_controller.py: add public load_config()/save_config() methods
  that own the state. Update 3 internal call sites and 3 ConductorEngine
  call sites to pass max_workers from self.config
- src/multi_agent_conductor.py: ConductorEngine.__init__ now takes
  max_workers as a parameter (caller responsibility, not I/O primitive)
- src/external_editor.py: get_default_launcher() takes config as a
  parameter; gui_2.py:1311,4776 pass app.config
- src/gui_2.py: 17 sites of models.save_config(X.config) replaced with
  X.save_config() (delegates via __getattr__ to controller)
- src/commands.py: save_all() uses app.save_config()

Test changes (route through controller, not I/O primitive):
- tests/conftest.py: mock_app and app_instance fixtures now patch
  AppController.load_config/save_config instead of models I/O primitives
- 18 other test files: patches renamed from models._save_config_to_disk
  to AppController.save_config (and same for load_config)
- tests/test_app_controller_mcp.py: use SLOP_CONFIG env var instead of
  patching removed CONFIG_PATH module constant
- tests/test_parallel_execution.py: pass max_workers=2 explicitly to
  ConductorEngine (caller no longer reads config)
- tests/test_gui_paths.py: add save_config=MagicMock() to MockApp;
  assert on controller method, not I/O primitive
- tests/test_models_no_top_level_tomli_w.py: still calls private
  _save_config_to_disk directly (the only allowed exception; tests
  the lazy-load behavior of the primitive itself)

New files:
- scripts/audit_no_models_config_io.py: enforces the rule (--strict,
  --json modes; AST-based docstring detection to avoid false positives)
- conductor/code_styleguides/config_state_owner.md: documents the rule

Verification:
- 67 targeted tests pass
- scripts/audit_no_models_config_io.py --strict returns 0

This is the architectural cleanup that surfaced during the
audit_architectural_cheats_20260607 review. Closes the smoking-gun
CONFIG_PATH module constant (already done in 0c7ebf22) AND the
free-function models.load_config/save_config smell.

[conductor(checkpoint): config-iO-refactor-20260607]
2026-06-07 19:54:17 -04:00
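
The state-owner rule from the commit above can be illustrated with a small sketch. The class names `ControllerSketch` and `EngineSketch`, the injected I/O callables, and the top-level `max_workers` key are all hypothetical stand-ins; the real AppController and ConductorEngine are not reproduced here.

```python
from typing import Any, Callable, Dict

Config = Dict[str, Any]


class ControllerSketch:
    """Hypothetical state owner: the only object that touches the I/O primitives."""

    def __init__(
        self,
        load_from_disk: Callable[[], Config],
        save_to_disk: Callable[[Config], None],
    ) -> None:
        self._load_from_disk = load_from_disk  # private I/O primitive
        self._save_to_disk = save_to_disk      # private I/O primitive
        self.config: Config = {}

    def load_config(self) -> Config:
        """Refresh the owned state from disk and return it."""
        self.config = self._load_from_disk()
        return self.config

    def save_config(self) -> None:
        """Persist the owned state."""
        self._save_to_disk(self.config)


class EngineSketch:
    """Hypothetical consumer: receives max_workers from the caller, reads no config."""

    def __init__(self, max_workers: int) -> None:
        self.max_workers = max_workers


saved: list = []
controller = ControllerSketch(lambda: {"max_workers": 2}, saved.append)
controller.load_config()
engine = EngineSketch(max_workers=controller.config["max_workers"])
controller.config["max_workers"] = 3
controller.save_config()
```

Because the primitives are injected, a test can patch the controller's public methods or pass fakes instead of touching a real config.toml.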
ed a7ab994f30 chore(audit): add --strict mode + baseline file (CI gate)
scripts/audit_license_cve.baseline.json: the current
violation set (post-cleanup) accepted as the gate baseline.
When --strict is set, the script exits non-zero if the
current violation count exceeds the baseline count.

To regenerate the baseline after an intentional change
(e.g., adding a new dep with an acceptable license), run:
  uv run python -m scripts.audit_license_cve --dump-baseline

Also fixes the baseline path: it now lives next to the script
(Path(__file__).parent) instead of the wrong location under
docs/reports/scripts/. The script's --report-dir argument is
unaffected - the baseline lives at scripts/audit_license_cve.baseline.json
regardless of the report directory.

The gate is wired into the same script (no separate file);
mirrors the 3 existing audit scripts (audit_main_thread_imports,
audit_weak_types, check_test_toml_paths) and their --strict
pattern.

28 unit + integration tests passing.
2026-06-07 15:24:57 -04:00
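
The --strict gate described in the commit above reduces to a count comparison against the baseline. The sketch below assumes a baseline JSON object with a "violations" list; that schema and both function names are assumptions, not the actual layout of scripts/audit_license_cve.baseline.json.

```python
import json
from pathlib import Path


def load_baseline_count(baseline_path: Path) -> int:
    """Read the accepted violation count from a baseline file (assumed schema)."""
    baseline = json.loads(baseline_path.read_text(encoding="utf-8"))
    return len(baseline["violations"])


def strict_exit_code(current_count: int, baseline_count: int) -> int:
    """Return non-zero only when the current count exceeds the baseline count."""
    return 1 if current_count > baseline_count else 0
```

Under this rule a run that matches or improves on the baseline passes; only a net increase in violations fails the gate.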
ed 20fa355838 chore(deps): tilde-pin all deps; delete requirements.txt
Every direct dep in pyproject.toml now has a ~X.Y.Z bound
(patch-only). The 7 unconstrained deps (imgui-bundle,
anthropic, google-genai, openai, fastapi, mcp, uvicorn,
plus tomli-w) get explicit tilde bounds discovered from
uv.lock. The 6 >=X.Y.Z deps are normalized to tilde-style
(pinned to the current lock version).

The local-rag optional dep (sentence-transformers) is also
tilde-pinned.

requirements.txt is deleted (was redundant with uv.lock;
the uv project uses uv.lock as the canonical lock file,
which is regenerated locally and gitignored per project
policy at .gitignore:9).

Re-running the audit confirms 0 PIN_VIOLATION (was 7). The
final.md report records the post-cleanup state.

Also adds --report-name CLI flag to the audit script
(default 'initial') so the script can write either
initial.md (Phase 1) or final.md (Phase 2) into the same
report directory.
2026-06-07 15:15:30 -04:00
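
A pin check in the spirit of the commit above could look like the sketch below. It assumes the tilde bound is written as a PEP 440 compatible-release specifier (`name~=X.Y.Z`); the commit message writes it as ~X.Y.Z, so the exact syntax, the regular expression, and the function name are assumptions.

```python
import re

# name, optional [extras], then "~=" followed by a three-part version.
TILDE_PIN = re.compile(r"^[A-Za-z0-9_.\-]+(\[[^\]]+\])?\s*~=\s*\d+\.\d+\.\d+$")


def pin_violations(dependencies: list[str]) -> list[str]:
    """Return the dependency strings that are not tilde-pinned to X.Y.Z."""
    return [dep for dep in dependencies if not TILDE_PIN.match(dep.strip())]
```

With three version components, a `~=X.Y.Z` specifier permits only patch-level upgrades, which matches the "patch-only" bound the commit describes.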
ed a8ae11d3a8 chore(audit): add license_cve audit script + initial report
scripts/audit_license_cve.py: 4 internal checks (license +
CVE + pin + source-header), policy tables (allowlist of
permissive/weak-copyleft/public-domain, blocklist of
non-OSI/restricted-source), and a main() that runs all 4
and emits line-per-violation to stdout + a markdown report.

Tests (26 unit + integration) cover license classifier (16
variants across MIT, BSD, Apache, LGPL, MPL, CC0, WTFPL,
GPL, AGPL, SSPL, BSL, Commons Clause, Elastic, Anti-996,
Hippocratic, unknown), pin check (3), source-header check
(3), license check via importlib.metadata (1), CVE check
via subprocess pip-audit (2), and a smoke test of the main
loop (1).

No new pip deps in the project: pure stdlib
(importlib.metadata, tomllib, pathlib, re) + subprocess to
pip-audit (optional dev tool, installed via 'uv tool install
pip-audit' if user wants CVE checks).

Initial report at docs/reports/license_cve_audit/2026-06-07/
records the current state. The Phase 2 commit will apply
the fixes (tilde-pin, delete requirements.txt); the Phase 3
commit will add --strict mode + baseline file for CI.
2026-06-07 15:07:46 -04:00