manual_slop


Author	SHA1	Message	Date
ed	d17d8743dd	conductor(plan): Mark Task 4.1 complete (Phase 4 done)	2026-06-17 00:45:44 -04:00
ed	ada9617308	test(ai_client): rename send_result to send in 22 remaining test files Batch rename of 22 test files. 62 references renamed total. The full test suite is now GREEN again, matching the pre-rename baseline from Task 1.1. Pure mechanical rename. No behavior change. Files affected: test_ai_cache_tracking, test_ai_client_cli, test_ai_client_result, test_api_events, test_context_pruner, test_deepseek_provider, test_gemini_cli_* (3 files), test_gui2_mcp, test_headless_* (2 files), test_live_gui_integration_v2, test_orchestration_logic, test_phase6_engine, test_rag_integration, test_run_worker_lifecycle_abort, test_spawn_interception_v2, test_symbol_parsing, test_tier4_interceptor, test_tiered_aggregation, test_token_usage. Note: spec estimated 24 files; actual is 22 (test_deprecation_warnings no longer exists, and 1 fewer file than spec's list). Refs: conductor/tracks/send_result_to_send_20260616/	2026-06-17 00:38:29 -04:00
ed	2f45bc4d68	conductor(plan): Mark Task 3.5 + 3.6 complete (Phase 3 done)	2026-06-17 00:35:32 -04:00
ed	53b35de5c6	conductor(plan): Mark Task 3.4 complete	2026-06-17 00:34:00 -04:00
ed	58fe3a9cb5	conductor(plan): Mark Task 3.3 complete	2026-06-17 00:33:00 -04:00
ed	6dbba46a25	conductor(plan): Mark Task 3.2 complete	2026-06-17 00:31:33 -04:00
ed	f0663fda6a	conductor(plan): Mark Task 3.1 complete	2026-06-17 00:29:54 -04:00
ed	3e2b4f74ba	test(ai_client): rename send_result to send in test_conductor_engine_v2 22 references renamed (mostly monkeypatch.setattr calls + comments). Test file state: GREEN. All 10 tests in this file now pass.	2026-06-17 00:29:21 -04:00
ed	d714d10fd4	conductor(plan): Mark Task 2.1 complete	2026-06-17 00:28:17 -04:00
ed	d87d909f7b	refactor(ai_client): rename send_result to send in 5 src/ call sites Renames 10 references across app_controller, conductor_tech_lead, mcp_client (docstring example), multi_agent_conductor, orchestrator_pm. 5 call sites in ai_client.send_result(...) -> ai_client.send(...) 3 print strings mentioning send_result 1 docstring comment (conductor_tech_lead) 1 docstring example (mcp_client) 'src.ai_client.send_result' -> 'src.ai_client.send' Test suite state: still red, but all src/-level call sites are now renamed. Remaining failures are in test files (mocks and patches that still reference send_result). Refs: conductor/tracks/send_result_to_send_20260616/	2026-06-17 00:27:47 -04:00
ed	4a59567939	conductor(plan): Mark Task 1.1 complete	2026-06-17 00:26:05 -04:00
ed	5351389fc0	refactor(ai_client): rename send_result to send (the impl, TDD red moment) The TDD red moment. The implementation is renamed but the call sites in src/, tests/, and docs still use send_result. Subsequent commits rename the call sites and progressively move the test suite back to green. 10 references renamed in src/ai_client.py: - 4 'Called by: send_result' docstring tags in private provider helpers - 1 function definition (def send_result -> def send) - 1 [C: ...] SDM tag referencing test function names - 2 monitor component names (start_component / end_component) - 2 error source strings (CONFIG + INTERNAL) Also adds scripts/tier2/apply_t1_1_edits.py - the helper script that applied the 10 edits. Kept in scripts/tier2/ as a record of the mechanical change pattern. Refs: conductor/tracks/send_result_to_send_20260616/	2026-06-17 00:23:16 -04:00
ed	cba5457b9d	feat(tier2): add run_tier2_sandboxed.ps1 launcher with restricted token (skeleton)	2026-06-16 19:49:47 -04:00
ed	a9be60ae50	feat(tier2): add setup_tier2_clone.ps1 bootstrap script with -WhatIf support	2026-06-16 19:47:06 -04:00
ed	796da0de60	feat(tier2): add run_track.py CLI with init/status/report modes + git fetch/switch	2026-06-16 19:27:08 -04:00
ed	73ab2778ca	feat(report): implement write_failure_report + 8 tests, 100% coverage	2026-06-16 19:13:30 -04:00
ed	190766fe03	feat(failcount): add default failcount.toml thresholds	2026-06-16 19:01:31 -04:00
ed	fc92e1aa74	feat(failcount): add FailcountState + FailcountConfig dataclasses + all stub functions	2026-06-16 18:59:38 -04:00
ed	9f2ff29c2e	feat(tier2): create scripts/tier2/ package	2026-06-16 18:57:09 -04:00
ed	b90d4bdd4e	feat(scripts): add --ci alias for --strict + CI-gate doc updates	2026-06-16 10:40:21 -04:00
ed	4521a7df96	feat(scripts): add --summary and --by-size modes to exception_handling audit	2026-06-16 09:41:20 -04:00
ed	9a04153abd	feat(scripts): add exception_handling audit script (10-category classification)	2026-06-16 09:06:25 -04:00
ed	125a226525	was called rest	2026-06-15 20:10:18 -04:00
ed	48b47d250c	oops	2026-06-15 20:04:35 -04:00
ed	4419922bce	review batch script	2026-06-15 20:02:36 -04:00
ed	c00161a13d	Adjust audi_line_count.py to take into account doc strings	2026-06-12 22:47:58 -04:00
ed	1577cca568	fix(audit): remove stale 'gemini_native' from deferred-vendors exclusion The previous exclusion list had 'gemini_native' which is NOT a real function name in src/ai_client.py. The actual function is _send_gemini_cli (already migrated to run_with_tool_loop via send_func + on_pre_dispatch in commit `4748d134`). The current deferred vendors are now correctly: - anthropic (uses anthropic SDK) - gemini (uses google-genai streaming) - deepseek (uses requests.post) These will be addressed in Phase 5 t5_6/7/8. When those ship, the DEFERRED_VENDORS frozenset should be emptied so the audit gates the migration. Verified: script still passes; gemini_cli's run_with_tool_loop usage is detected correctly.	2026-06-11 21:30:04 -04:00
ed	be5056051a	feat(audit): add scripts/audit_providers_source_of_truth.py Phase 2 task 2.4 (the script part). The script enforces: PROVIDERS is declared as a literal only in src/ai_client.py. The __getattr__ re-export in src/models.py is allowed (it lazy-imports, not a literal declaration). Catches the literal pattern 'PROVIDERS: List[str] = [' specifically, which the __getattr__ re-export does not match. OK: passes against current state where PROVIDERS is declared only in src/ai_client.py:56.	2026-06-11 16:44:59 -04:00
ed	7e4503f4e8	feat(audit): add scripts/audit_no_inline_tool_loops.py + state.toml Phase 1 progress Task 1.8 (the plan's numbering: 'Add audit script'). Audit checks that no _send_<vendor> in src/ai_client.py contains an inline 'for round_idx in range(MAX_TOOL_ROUNDS' loop. The audit excludes the 4 vendored-call-path vendors (anthropic, gemini, gemini_native, deepseek) which are documented in state.toml's deferred_work section as future work (they use their own SDKs and need separate per-vendor conversion to OpenAICompatibleRequest). state.toml: - t1_7 (Apply to 4 inline-loop vendors): completed for _send_gemini_cli only. Anthropic + Gemini + DeepSeek deferred. - t1_8 (Add audit script): in_progress. - t1_7 reuses commit `4748d134` (the send_func + on_pre_dispatch refactor that introduced the new helper pattern for vendored call paths). OK: audit passes against the current 4 OpenAI-compat vendors (minimax, grok, llama, qwen still uses _dashscope_call but has no inline loop) + gemini_cli.	2026-06-11 16:17:23 -04:00
ed	749120d239	feat(audit): flag hardcoded workspace and project-root paths in tests	2026-06-09 17:01:14 -04:00
ed	488ae04459	fix(run_tests_batched): detect batch failure from output when proc.returncode is wrong	2026-06-08 02:03:50 -04:00
ed	5c6eb620a1	fix(run_tests_batched): colorize non-xdist format (tests/... STATUS), filter 'Error during log pruning' noise	2026-06-08 01:54:56 -04:00
ed	272b7841ae	fix(run_tests_batched): filter xdist scheduling queue output (test paths without status prefix)	2026-06-08 01:51:07 -04:00
ed	a2d16541d0	fix(run_tests_batched): keep pytest's full -v output, only filter LogPruner/win errors, colorize per-test status	2026-06-08 01:49:39 -04:00
ed	21cb57b31d	fix(run_tests_batched): graceful xdist fallback, live progress streaming, ANSI colors, absolute default paths	2026-06-08 01:28:53 -04:00
ed	50f26f0d5c	chore: delete legacy run_tests_batched.py (was preserved for one cycle)	2026-06-08 01:15:12 -04:00
ed	e6ad2ecda2	chore: preserve old run_tests_batched.py as .legacy for one cycle	2026-06-08 00:59:49 -04:00
ed	2c3a0512f2	feat(run_tests_batched): full CLI with --tiers, --durations, actual pytest execution	2026-06-08 00:58:53 -04:00
ed	57285d048b	feat(run_tests_batched): add --plan and --audit modes (Phase 1 stub)	2026-06-08 00:50:37 -04:00
ed	dd48c095b8	refactor(tests): move test_categorizer library from scripts/ to tests/	2026-06-08 00:15:19 -04:00
ed	4d6464324f	feat(scripts): add CategoryRecord data model for test categorization	2026-06-08 00:11:22 -04:00
ed	746dde8286	push latest related to default layout	2026-06-07 23:50:24 -04:00
ed	7bcb5a8c07	refactor(config): Route all config I/O through AppController Eliminates 22 call sites that bypassed the AppController state owner and read/wrote config.toml directly. AppController is now the single source of truth for self.config; gui_2.py, commands.py, etc. go through controller.save_config() / controller.load_config(). Production changes: - src/models.py: rename load_config -> _load_config_from_disk, save_config -> _save_config_to_disk (private I/O primitives) - src/app_controller.py: add public load_config()/save_config() methods that own the state. Update 3 internal call sites and 3 ConductorEngine call sites to pass max_workers from self.config - src/multi_agent_conductor.py: ConductorEngine.__init__ now takes max_workers as a parameter (caller responsibility, not I/O primitive) - src/external_editor.py: get_default_launcher() takes config as a parameter; gui_2.py:1311,4776 pass app.config - src/gui_2.py: 17 sites of models.save_config(X.config) replaced with X.save_config() (delegates via __getattr__ to controller) - src/commands.py: save_all() uses app.save_config() Test changes (route through controller, not I/O primitive): - tests/conftest.py: mock_app and app_instance fixtures now patch AppController.load_config/save_config instead of models I/O primitives - 18 other test files: patches renamed from models._save_config_to_disk to AppController.save_config (and same for load_config) - tests/test_app_controller_mcp.py: use SLOP_CONFIG env var instead of patching removed CONFIG_PATH module constant - tests/test_parallel_execution.py: pass max_workers=2 explicitly to ConductorEngine (caller no longer reads config) - tests/test_gui_paths.py: add save_config=MagicMock() to MockApp; assert on controller method, not I/O primitive - tests/test_models_no_top_level_tomli_w.py: still calls private _save_config_to_disk directly (the only allowed exception; tests the lazy-load behavior of the primitive itself) New files: - scripts/audit_no_models_config_io.py: enforces the rule (--strict, --json modes; AST-based docstring detection to avoid false positives) - conductor/code_styleguides/config_state_owner.md: documents the rule Verification: - 67 targeted tests pass - scripts/audit_no_models_config_io.py --strict returns 0 This is the architectural cleanup that surfaced during the audit_architectural_cheats_20260607 review. Closes the smoke-gun CONFIG_PATH module constant (already done in `0c7ebf22`) AND the free-function models.load_config/save_config smell. [conductor(checkpoint): config-iO-refactor-20260607]	2026-06-07 19:54:17 -04:00
ed	a7ab994f30	chore(audit): add --strict mode + baseline file (CI gate) scripts/audit_license_cve.baseline.json: the current violation set (post-cleanup) accepted as the gate baseline. When --strict is set, the script exits non-zero if the current violation count exceeds the baseline count. To regenerate the baseline after an intentional change (e.g., adding a new dep with an acceptable license), run: uv run python -m scripts.audit_license_cve --dump-baseline Also fixes the baseline path: it now lives next to the script (Path(__file__).parent) instead of the wrong location under docs/reports/scripts/. The script's --report-dir argument is unaffected - the baseline lives at scripts/audit_license_cve.baseline.json regardless of the report directory. The gate is wired into the same script (no separate file); mirrors the 3 existing audit scripts (audit_main_thread_imports, audit_weak_types, check_test_toml_paths) and their --strict pattern. 28 unit + integration tests passing.	2026-06-07 15:24:57 -04:00
ed	20fa355838	chore(deps): tilde-pin all deps; delete requirements.txt Every direct dep in pyproject.toml now has a ~X.Y.Z bound (patch-only). The 7 unconstrained deps (imgui-bundle, anthropic, google-genai, openai, fastapi, mcp, uvicorn, plus tomli-w) get explicit tilde bounds discovered from uv.lock. The 6 >=X.Y.Z deps are normalized to tilde-style (pinned to the current lock version). The local-rag optional dep (sentence-transformers) is also tilde-pinned. requirements.txt is deleted (was redundant with uv.lock; the uv project uses uv.lock as the canonical lock file, which is regenerated locally and gitignored per project policy at .gitignore:9). Re-running the audit confirms 0 PIN_VIOLATION (was 7). The final.md report records the post-cleanup state. Also adds --report-name CLI flag to the audit script (default 'initial') so the script can write either initial.md (Phase 1) or final.md (Phase 2) into the same report directory.	2026-06-07 15:15:30 -04:00
ed	a8ae11d3a8	chore(audit): add license_cve audit script + initial report scripts/audit_license_cve.py: 4 internal checks (license + CVE + pin + source-header), policy tables (allowlist of permissive/weak-copyleft/public-domain, blocklist of non-OSI/restricted-source), and a main() that runs all 4 and emits line-per-violation to stdout + a markdown report. Tests (26 unit + integration) cover license classifier (16 variants across MIT, BSD, Apache, LGPL, MPL, CC0, WTFPL, GPL, AGPL, SSPL, BSL, Commons Clause, Elastic, Anti-996, Hippocratic, unknown), pin check (3), source-header check (3), license check via importlib.metadata (1), CVE check via subprocess pip-audit (2), and a smoke test of the main loop (1). No new pip deps in the project: pure stdlib (importlib.metadata, tomllib, pathlib, re) + subprocess to pip-audit (optional dev tool, installed via 'uv tool install pip-audit' if user wants CVE checks). Initial report at docs/reports/license_cve_audit/2026-06-07/ records the current state. The Phase 2 commit will apply the fixes (tilde-pin, delete requirements.txt); the Phase 3 commit will add --strict mode + baseline file for CI.	2026-06-07 15:07:46 -04:00
ed	0d12396011	increase default test batch size	2026-06-07 13:57:39 -04:00
ed	955b61df78	fix(tests): revert watchdog to os._exit(0); runner uses subprocess timeout The os._exit(2) change in `719c5e27` introduced a regression: the watchdog's daemon thread continues running through pytest's interpreter shutdown. On EVERY batch (even ones that complete successfully in 17s), the watchdog's time.sleep(30.0) elapses during finalization and the thread calls os._exit(2) just as pytest is wrapping up. Result: every batch was reported as 'Batch N failed' by run_tests_batched.py, even ones with '126 passed in 17.14s'. Revert watchdog to os._exit(0) — its original purpose (force-exit any stuck pytest at 30s) doesn't need a non-zero code; it's a sledgehammer, not a signal. The runner does its own failure detection. Update scripts/run_tests_batched.py to: - Use subprocess.run(timeout=180) per batch - Catch TimeoutExpired as a batch failure (with elapsed time + reason printed) - Catch CalledProcessError as a batch failure (preserved from before) - Print elapsed time for every batch (pass or fail) so hang behavior is visible - Print a final summary that lists all FAILED FILES (not batches) for easy re-running - Add --batch-size and --timeout CLI flags - Add 1-space indentation + type hints per project style Verified: ast.parse OK; --help works; test_conftest_watchdog 3/3 pass.	2026-06-07 12:59:27 -04:00
ed	5e1867bb50	feat(scripts): add cleanup_orphaned_processes.py for sloppy.py leftover cleanup After test runs that use live_gui, dozens of sloppy.py --enable-test-hooks processes can leak (the watchdog `e1c8730f` bounds the hang but doesn't kill the spawned GUI subprocesses). This script: - Enumerates all python.exe / uv.exe processes via CIM - Categorizes each by command-line content: - sloppy.py --enable-test-hooks -> KILL (orphans) - scripts/mcp_server.py -> PRESERVE (manual_slop's MCP server, used by opencode) - minimax-coding-plan-mcp -> PRESERVE (opencode's MCP server, used by opencode) - pytest runner / stuck App() test -> PRESERVE by default, kill with --kill-tests - Defaults to DRY-RUN; pass --kill to terminate - --kill-tests: also kill stuck test subprocesses - --kill-mcp: also kill MCP servers (off by default; usually DON'T want this) - --json: machine-readable output for CI/scripting Verified after a 10-batch test run: 28 sloppy.py orphans identified, 21 MCP servers (9 manual_slop + 12 minimax) preserved correctly. The watchdog fix (`e1c8730f`) bounds the test hang; this script cleans up the leaked GUI subprocesses afterward. Usage: uv run python scripts/cleanup_orphaned_processes.py # dry-run uv run python scripts/cleanup_orphaned_processes.py --kill # kill sloppy.py orphans uv run python scripts/cleanup_orphaned_processes.py --kill --kill-tests	2026-06-07 12:11:01 -04:00
ed	b94d949b4d	fix formatting on scripts	2026-06-07 11:51:36 -04:00
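
Commit 955b61df78 in the log above describes the batch runner using a per-batch subprocess timeout and treating both TimeoutExpired and CalledProcessError as batch failures, with elapsed time printed for every batch. A minimal sketch of that pattern follows. It is not the contents of scripts/run_tests_batched.py: the function name run_batch, its return tuple, and the reason strings are hypothetical, chosen only for illustration. The 180-second default mirrors the value named in the commit message.

```python
import subprocess
import sys
import time


def run_batch(cmd, timeout_s=180.0):
    """Run one batch command (hypothetical helper, not the project's code).

    Returns (ok, elapsed_seconds, reason).
    """
    start = time.monotonic()
    try:
        # check=True raises CalledProcessError on a non-zero exit code;
        # timeout= kills the child and raises TimeoutExpired when exceeded.
        subprocess.run(cmd, check=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return False, time.monotonic() - start, f"timed out after {timeout_s:g}s"
    except subprocess.CalledProcessError as exc:
        return False, time.monotonic() - start, f"exit code {exc.returncode}"
    return True, time.monotonic() - start, "ok"


if __name__ == "__main__":
    # Print elapsed time for every batch, pass or fail, so hangs are visible.
    ok, elapsed, reason = run_batch([sys.executable, "-c", "pass"])
    print(f"batch {'passed' if ok else 'FAILED'} in {elapsed:.2f}s ({reason})")
```

Because the runner decides pass or fail from its own timeout and the child's exit code, a watchdog inside the child can exit with any code without being mistaken for a failure signal, which is the separation of concerns the commit message argues for.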
