manual_slop

Author	SHA1	Message	Date
ed	5c6eb620a1	fix(run_tests_batched): colorize non-xdist format (tests/... STATUS), filter 'Error during log pruning' noise	2026-06-08 01:54:56 -04:00
ed	272b7841ae	fix(run_tests_batched): filter xdist scheduling queue output (test paths without status prefix)	2026-06-08 01:51:07 -04:00
ed	a2d16541d0	fix(run_tests_batched): keep pytest's full -v output, only filter LogPruner/win errors, colorize per-test status	2026-06-08 01:49:39 -04:00
ed	21cb57b31d	fix(run_tests_batched): graceful xdist fallback, live progress streaming, ANSI colors, absolute default paths	2026-06-08 01:28:53 -04:00
ed	fb6b4bd3eb	conductor(tracks): mark test_batching_refactor_20260606 as completed	2026-06-08 01:18:20 -04:00
ed	50bd894f8d	conductor(archive): ship test_batching_refactor_20260606 to archive	2026-06-08 01:16:58 -04:00
ed	50f26f0d5c	chore: delete legacy run_tests_batched.py (was preserved for one cycle)	2026-06-08 01:15:12 -04:00
ed	ac7e638b23	chore: gitignore tests/.test_durations.json (developer-local cache)	2026-06-08 01:14:51 -04:00
ed	9eac02ddcb	feat(tests): populate test_categories.toml with cross-cutting entries	2026-06-08 01:14:12 -04:00
ed	796eec0058	conductor(plan): mark Phases 2,3 complete in test_batching_refactor_20260606	2026-06-08 01:09:02 -04:00
ed	5252b6d782	docs(testing): document new run_tests_batched.py in Running Tests section	2026-06-08 01:00:50 -04:00
ed	e6ad2ecda2	chore: preserve old run_tests_batched.py as .legacy for one cycle	2026-06-08 00:59:49 -04:00
ed	2c3a0512f2	feat(run_tests_batched): full CLI with --tiers, --durations, actual pytest execution	2026-06-08 00:58:53 -04:00
ed	7610c9c1dc	conductor(plan): mark Phase 1 complete in test_batching_refactor_20260606	2026-06-08 00:53:59 -04:00
ed	57285d048b	feat(run_tests_batched): add --plan and --audit modes (Phase 1 stub)	2026-06-08 00:50:37 -04:00
ed	29ac64adc6	test(conftest): register tests.pytest_collection_order as pytest plugin	2026-06-08 00:49:11 -04:00
ed	f240504f0e	feat(collection_order): implement opt-in per-test sort via conftest hook	2026-06-08 00:47:21 -04:00
ed	6287005ad1	test(collection_order): add red tests for opt-in sort_items_by_order	2026-06-08 00:47:03 -04:00
ed	e07036ad5d	feat(batcher): implement Batch dataclass and plan() function	2026-06-08 00:46:12 -04:00
ed	246f293c56	test(batcher): add red tests for plan() function	2026-06-08 00:41:20 -04:00
ed	9c5ad3fb8d	config	2026-06-08 00:40:33 -04:00
ed	f778ef509e	feat(categorizer): implement load_registry, merge_registry, categorize_all	2026-06-08 00:33:21 -04:00
ed	2b56ab3c5c	conductor(track): initialize test_batching_post_refactor_polish_20260607 spec/plan/state	2026-06-08 00:27:32 -04:00
ed	828050ae4f	test(categorizer): add red tests for registry merge and full classification	2026-06-08 00:27:04 -04:00
ed	9e5fed56a5	feat(categorizer): implement subsystem/speed/batch_group inference	2026-06-08 00:22:22 -04:00
ed	7aaac7d586	test(categorizer): add red tests for subsystem/speed/batch_group inference	2026-06-08 00:21:03 -04:00
ed	b2e8cce9f6	feat(categorizer): implement auto_classify using AST scan (no regex)	2026-06-08 00:19:43 -04:00
ed	fb54737f45	test(categorizer): add red tests for auto_classify fixture_class rules	2026-06-08 00:16:18 -04:00
ed	dd48c095b8	refactor(tests): move test_categorizer library from scripts/ to tests/	2026-06-08 00:15:19 -04:00
ed	4d6464324f	feat(scripts): add CategoryRecord data model for test categorization	2026-06-08 00:11:22 -04:00
ed	746dde8286	push latest related to default layout	2026-06-07 23:50:24 -04:00
ed	2db1436130	TEST LAYOUT	2026-06-07 23:33:13 -04:00
ed	818537b3dd	feat(gui): Add layout staleness diagnostic on startup Adds a one-shot `_diag_layout_state` method that runs in `_post_init` and prints three lines to stderr: 1. `[GUI] show_windows entries: N, visible by default: M` — how many windows are defined vs. visible with no layout file. 2. `[GUI] visible-by-default windows: ...` — the names of windows that will appear on a fresh launch. 3. `[GUI] WARNING: layout has N stale window name(s) that no longer exist: ...` — when the on-disk manualslop_layout.ini references window names that the current code has dropped (Projects/Files/ Screenshots/Provider/Discussion History/etc. — all replaced by the hub pattern in earlier refactors). This addresses the user's observation that: - "the diagnostics panel still only shows itself" - "I see a flicker as if the layout got reset but cannot retain permanence" Both symptoms are caused by the repo-root manualslop_layout.ini referencing pre-hub-refactor window names that HelloImGui silently drops on load. The diagnostic surfaces the root cause in the test log so the user can see exactly which stale names are present, without having to manually diff the .ini file. Verified: log appears in `logs/sloppy_py_test.log` on the next live_gui test run, including the 11 default-visible windows and the staleness check.	2026-06-07 22:36:19 -04:00
ed	7a4f71e78b	test(fix): Don't copy stale repo-root layout to live_gui workspace The repo-root manualslop_layout.ini references pre-hub-refactor window names that no longer exist in the current code (Projects/Files/Screenshots/Provider/System Prompts/etc.). HelloImGui silently drops unknown windows when loading the layout, causing "missing panels" in live_gui tests and in the user's interactive session. The previous "Preserve GUI layout for tests" block copied the stale repo-root layout into the live_gui workspace, infecting every live_gui test session with stale state. Fix: skip the copy. HelloImGui will generate a fresh layout in the test workspace on shutdown, which then lives in the session-scoped workspace and is cleaned up at teardown. The repo-root manualslop_layout.ini is still TRACKED (I did not delete it; that's the user's call). They can: - Delete it manually, or - Run the existing "Reset Layout" command from the Command Palette (which deletes both repo-root and live_gui_workspace paths and forces HelloImGui to regenerate with the current window catalog). Verified: 6/6 targeted tests pass.	2026-06-07 21:27:29 -04:00
ed	94cfb1b5ff	test(fix): Update tests to route config through AppController/env var Four test files had patches/monkeypatches that referenced the removed src.models.load_config or src.models.CONFIG_PATH module constant. These all stem from the config I/O refactor (commit `7bcb5a8c`) that renamed load_config/save_config to private I/O primitives. - tests/test_external_editor_gui.py: 2 sites changed from monkeypatch.setattr(models_module, 'load_config', ...) to monkeypatch.setattr('src.app_controller.AppController.load_config', ...) - tests/test_external_mcp_e2e.py: CONFIG_PATH monkeypatch changed to SLOP_CONFIG env var (the only supported override path) - tests/test_log_management_ui.py: same CONFIG_PATH -> SLOP_CONFIG fix - tests/test_gen_send_empty_context.py: _StubController now receives ui_selected_context_files and _pending_generation_action from the app_instance BEFORE being assigned as controller (App.__getattr__ delegates to controller, so attrs must be on the stub first) Also: deleted tests/artifacts/manualslop_layout.ini (gitignored stale file from March 4 referencing pre-refactor window names like "Projects"/"Files"/"Screenshots" that no longer exist in the code). Repo-root manualslop_layout.ini still references the same old window names; user should run the existing "Reset Layout" command (or delete it manually) to regenerate with the current window catalog (Context Hub / AI Settings Hub / Discussion Hub / etc.). Verified: 13 targeted tests pass: - test_external_editor_gui.py (5/5) - test_external_mcp_e2e.py (1/1) - test_log_management_ui.py (2/2) - test_gen_send_empty_context.py (5/5)	2026-06-07 21:21:38 -04:00
ed	7bcb5a8c07	refactor(config): Route all config I/O through AppController Eliminates 22 call sites that bypassed the AppController state owner and read/wrote config.toml directly. AppController is now the single source of truth for self.config; gui_2.py, commands.py, etc. go through controller.save_config() / controller.load_config(). Production changes: - src/models.py: rename load_config -> _load_config_from_disk, save_config -> _save_config_to_disk (private I/O primitives) - src/app_controller.py: add public load_config()/save_config() methods that own the state. Update 3 internal call sites and 3 ConductorEngine call sites to pass max_workers from self.config - src/multi_agent_conductor.py: ConductorEngine.__init__ now takes max_workers as a parameter (caller responsibility, not I/O primitive) - src/external_editor.py: get_default_launcher() takes config as a parameter; gui_2.py:1311,4776 pass app.config - src/gui_2.py: 17 sites of models.save_config(X.config) replaced with X.save_config() (delegates via __getattr__ to controller) - src/commands.py: save_all() uses app.save_config() Test changes (route through controller, not I/O primitive): - tests/conftest.py: mock_app and app_instance fixtures now patch AppController.load_config/save_config instead of models I/O primitives - 18 other test files: patches renamed from models._save_config_to_disk to AppController.save_config (and same for load_config) - tests/test_app_controller_mcp.py: use SLOP_CONFIG env var instead of patching removed CONFIG_PATH module constant - tests/test_parallel_execution.py: pass max_workers=2 explicitly to ConductorEngine (caller no longer reads config) - tests/test_gui_paths.py: add save_config=MagicMock() to MockApp; assert on controller method, not I/O primitive - tests/test_models_no_top_level_tomli_w.py: still calls private _save_config_to_disk directly (the only allowed exception; tests the lazy-load behavior of the primitive itself) New files: - scripts/audit_no_models_config_io.py: enforces the rule (--strict, --json modes; AST-based docstring detection to avoid false positives) - conductor/code_styleguides/config_state_owner.md: documents the rule Verification: - 67 targeted tests pass - scripts/audit_no_models_config_io.py --strict returns 0 This is the architectural cleanup that surfaced during the audit_architectural_cheats_20260607 review. Closes the smoking-gun CONFIG_PATH module constant (already done in `0c7ebf22`) AND the free-function models.load_config/save_config smell. [conductor(checkpoint): config-iO-refactor-20260607]	2026-06-07 19:54:17 -04:00
ed	5a1767e1d7	grammar	2026-06-07 18:17:26 -04:00
ed	bcca069c3b	t2 report	2026-06-07 18:08:04 -04:00
ed	0c7ebf2267	fix(models): remove module-level CONFIG_PATH; re-resolve on every call ROOT CAUSE: src/models.py had `CONFIG_PATH = get_config_path()` at module level. Every test that imported `src.models` and called `save_config()` or `load_config()` wrote/read the repo-root `config.toml` via this cached constant. The path was resolved once at import time, so the SLOP_CONFIG env var (or test fixtures) couldn't redirect reads/writes without reimporting the module. This silently corrupted the user's config.toml on every test run. The diff between runs showed: 'config.toml changed in working copy' — caused by tests, not the user. FIX: remove the module-level constant; call get_config_path() on every read/write call. SLOP_CONFIG (and any test-time set_config_path() helper) now works without reimport. Also: keep my prior commits to this file (reset_layout command in src/commands.py; the RUN_MMA_INTEGRATION skipif in test_mma_step_mode_sim.py) bundled here for a clean atomic fix-pack since the user just fixed the indentation issue I had. Verified: src.models imports cleanly; load_config/save_config work as expected. Tests that import these functions will use whatever SLOP_CONFIG points to (or the repo-root default).	2026-06-07 17:57:36 -04:00
ed	42071bd4f4	remove requirements.txt	2026-06-07 17:43:48 -04:00
ed	e7bfb94c05	fix(gui_2): coerce None → "" for input_text value in render_context_presets sloppy.py crashed in render_context_presets at line 3469 with TypeError: input_text(): incompatible function arguments. The second arg getattr(app, "ui_new_context_preset_name", "") returned None because the attribute EXISTS but is None — the default "" only fires for missing attributes. The App's __setattr__ delegates to the AppController when the controller has the attribute. The controller's init can leave ui_new_context_preset_name as None (via setattr from a plugin or a config flush). The defensive getattr doesn't help in that case. Fix: append `or ""` to coerce None and empty-string to "" so imgui.input_text always gets a valid str. Verified by the previously-failing batched tests (test_command_palette_sim, test_auto_switch_sim, test_live_warmup_canaries_endpoint, test_conductor_api_hook_integration): all 12 now pass.	2026-06-07 17:12:31 -04:00
ed	8130ae34d4	fix(gui_2): initialize ui_synthesis_prompt/selected_takes to prevent crash sloppy.py crashed on startup at gui_2.py:4006 with TypeError: input_text_multiline(): incompatible function arguments. The second positional arg (app.ui_synthesis_prompt) was None when it should be str. Root cause: the defensive guards if not hasattr(app, 'ui_synthesis_prompt'): app.ui_synthesis_prompt = "" only fire if the attribute is MISSING — if it's set to None elsewhere (e.g. via setattr from a config flush, or a plugin side-effect), hasattr returns True and the value stays None. Fix in 3 places: 1. App.__init__: initialize ui_synthesis_prompt = "" and ui_synthesis_selected_takes = {} at construction time alongside related context state (line 456). 2. render_synthesis_panel (line ~4002): harden the guard to check isinstance(getattr(...), str) — fixes the same pattern at its first call site. 3. render_takes_panel (line ~4139): same hardening at the second call site. Verified by constructing App() in a fresh subprocess and inspecting the attributes (ui_synthesis_prompt == "" and ui_synthesis_selected_takes == {} both before and after init_state()). Manual smoke test: previously the app crashed before any window was visible; now it renders the first frame.	2026-06-07 17:07:40 -04:00
ed	864957e8e9	docs(agents): reference skip-marker policy from workflow.md Cross-link the new Skip-Marker Policy section in conductor/workflow.md into AGENTS.md's "Critical Anti-Patterns" list. The pattern is: agent hits a pre-existing failure, marks it skip, moves on; suite rots; user has to track down each one later. The full policy lives in workflow.md (with the 4-question review checklist). AGENTS.md gets a one-line pointer so the rule is at the top of every agent's context. Rule applies in-session: when the fix is reachable within ~30 min of investigation, FIX IT INSTEAD of skipping.	2026-06-07 16:59:37 -04:00
ed	c9c5535889	docs(workflow): add Skip-Marker Policy section Per 2026-06-07 user feedback during test_suite cleanup: "if the intent is to annotate a known failure, fine. But that known failure must be addressed with priority." New section between "Per-Task Decision Protocol" and "Documentation Refresh Protocol" makes the policy explicit: - Skip markers are DOCUMENTATION, not avoidance - They're useful for opt-in integration tests, unimplemented features, or feature-flag-gated code - They're NOT useful for pre-existing failures, "I don't understand this" issues, or racy tests the agent doesn't want to debug - When adding a marker, MUST document the underlying issue AND what the fix would be - When the fix is in-session reachable, FIX IT INSTEAD of skipping — limited context is not an excuse Includes a 4-question review checklist before adding a skip. References the existing AGENTS.md "Use skip markers as excuse to AVOID" rule so the two policies don't drift.	2026-06-07 16:57:54 -04:00
ed	ff523f7e6e	fix(test_api_generate_blocked_while_stale): sleep in monkeypatches to keep switch in-flight The test had a pre-existing race: it monkeypatched _rebuild_rag_index and _flush_to_project to no-ops, which made _do_project_switch complete synchronously inside the io_pool worker. By the time the test's _api_generate call ran is_project_stale() was already False (the worker had cleared _project_switch_in_progress), so the 409 contract was never exercised. Fix: replace the no-op lambdas with `lambda: time.sleep(0.5)`. This keeps the worker busy for 500ms, which is more than enough window for the test to call _api_generate and observe the stale flag. _wait_for_switch then drains the rest of the work. Also: removed the @pytest.mark.skip marker; the underlying issue is now fixed in the test. Verified: 9/9 in tests/test_project_switch_persona_preset.py pass (previously 8 passed + 1 skipped).	2026-06-07 16:56:05 -04:00
ed	91b34ae81e	fix(hooks): handle dict-key bracket notation in set_value / get_value The Hook API previously rejected key strings like 'show_windows["Project Settings"]' (and silently returned None on get). The test_live_gui_filedialog_regression test exercises exactly this pattern to open the Project Settings window via the Hook API; it was previously marked skip with "hook server doesn't handle the dict-key bracket-notation syntax". Fix in three small places: 1. src/app_controller.py:_handle_set_value If `item` is not in _settable_fields, try parsing it as `dict_name[<key>]` notation. If dict_name IS in _settable_fields and the current attr is a dict, set the inner key. 2. src/api_hooks.py:/api/gui/value (POST get_val) Mirror the parsing for the field-based get endpoint. 3. src/api_hook_client.py:ApiHookClient.get_value Mirror the parsing in the client so the dict-key syntax works through the state endpoint as well (which is what get_value actually calls by default). Test fix: - tests/test_live_gui_filedialog_regression.py: removed the @pytest.mark.skip marker; the underlying issue is now fixed. Verified: 1/1 test passes (previously skipped).	2026-06-07 16:49:51 -04:00
ed	8d58d7fc46	fix(warmup): defer _done_event.set() until after callbacks fire WarmupManager._record_success and _record_failure used to set self._done_event.set() inside the with self._lock: block, BEFORE calling the user-registered on_complete callbacks. This created a race: a test thread calling mgr.wait() could observe mgr.is_done() == True and proceed before the worker thread had finished firing the callbacks. The mgr.on_complete caller would then assert on state that the callback was supposed to mutate (e.g. test_warmup_on_complete_callback_fires' `received` list). Fix: move self._done_event.set() to AFTER the for cb in callbacks: loop in both _record_success and _record_failure. The done event is now set last, so wait() cannot return until all callbacks have completed (or raised, which is swallowed by the try/except). ALSO fix the previously-corrupted state of warmup.py (the result of a misused set_file_slice edit that left orphaned code with no def line for _record_failure). _record_failure is now a proper class method with the def line restored. ALSO fix tests/test_warmup.py: - test_warmup_on_complete_callback_fires: the test body was missing the pool/mgr setup. Added the missing lines. - test_warmup_done_event_set_after_all_complete: removed the racy `assert not mgr.is_done()` assertion that fires immediately after submit. On a fast machine, os/sys warmup completes in microseconds, so is_done() is already True by the time the assertion runs. The remaining assertion (`assert mgr.is_done()` after wait) still tests the semantic that the done event is set after completion. - Removed both `@pytest.mark.skip` markers; the underlying issues are now fixed in production code AND the tests. Verified: 10/10 tests in tests/test_warmup.py pass (previously 2 skipped, 2 failed).	2026-06-07 16:02:30 -04:00
ed	a36aad5051	fix(test_gui_events_v2 + app_controller): patch correct target; init _project_switch_* test_gui_events_v2::test_handle_generate_send_pushes_event was patching 'threading.Thread' but production code in src/app_controller.py:_handle_generate_send uses self._io_pool.submit_io(worker) (an AppController method, NOT a method on the ThreadPoolExecutor). The test never got to its assertions because the patched attribute was never called. Fix: update the test to patch `mock_gui.controller.submit_io` (the AppController method). The `with patch.object(...)` block replaces submit_io with a MagicMock; calling _handle_generate_send now runs the worker synchronously (extracted via mock_submit.call_args[0][0]). ALSO: initialize _project_switch_in_progress and _project_switch_pending_path in AppController.__init__. They were previously set only inside _switch_project and _do_project_switch, so a fresh AppController() didn't have them and is_project_stale() would raise AttributeError. is_project_stale is also now getattr-based (defaulting to False) for additional safety. ALSO: remove the @pytest.mark.skip marker from the test since the underlying issue is now fixed. Verified: tests/test_gui_events_v2.py 3/3 pass (previously 1 skipped).	2026-06-07 15:38:11 -04:00
ed	0db5ec3eef	conductor(tracks): mark License CVE Audit track as complete Phase 4 verification complete: 4 atomic commits landed, 28 unit + integration tests passing, the audit script runs end-to-end against the post-cleanup repo, --strict mode + baseline file wired in as the CI gate. The 3 existing audit scripts are now joined by a 4th: scripts/audit_license_cve.py. Scope: third-party deps only. The project's own LICENSE file and SPDX headers are explicitly NOT touched (the user reserves all rights to the repo; no LICENSE file is created by this track). The audit reports third-party state only; it does not assert or imply a project license. Commits: `a8ae11d3` - chore(audit): add license_cve audit script + initial report `20fa3558` - chore(deps): tilde-pin all deps; delete requirements.txt `a7ab994f` - chore(audit): add --strict mode + baseline file (CI gate) (this) - conductor(tracks): mark track complete	2026-06-07 15:28:25 -04:00
ed	a7ab994f30	chore(audit): add --strict mode + baseline file (CI gate) scripts/audit_license_cve.baseline.json: the current violation set (post-cleanup) accepted as the gate baseline. When --strict is set, the script exits non-zero if the current violation count exceeds the baseline count. To regenerate the baseline after an intentional change (e.g., adding a new dep with an acceptable license), run: uv run python -m scripts.audit_license_cve --dump-baseline Also fixes the baseline path: it now lives next to the script (Path(__file__).parent) instead of the wrong location under docs/reports/scripts/. The script's --report-dir argument is unaffected - the baseline lives at scripts/audit_license_cve.baseline.json regardless of the report directory. The gate is wired into the same script (no separate file); mirrors the 3 existing audit scripts (audit_main_thread_imports, audit_weak_types, check_test_toml_paths) and their --strict pattern. 28 unit + integration tests passing.	2026-06-07 15:24:57 -04:00
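
Commits e7bfb94c05 and 8130ae34d4 in the list above both turn on the same Python behavior: the default passed to `getattr` (and a `hasattr` guard) only helps when the attribute is missing, not when it exists and holds `None`. The following is a minimal sketch of that pitfall and of the two guards the messages describe. The `App` class here is a hypothetical stand-in, not the project's real class.

```python
class App:
    """Hypothetical stand-in: the attribute exists but holds None."""

    def __init__(self):
        self.ui_new_context_preset_name = None


app = App()

# The default "" is ignored because the attribute exists, so this is still None.
naive = getattr(app, "ui_new_context_preset_name", "")

# Appending `or ""` coerces None (and any other falsy value) to "".
safe = getattr(app, "ui_new_context_preset_name", "") or ""

# An isinstance-based guard covers both the missing case and the None case.
if not isinstance(getattr(app, "ui_synthesis_prompt", None), str):
    app.ui_synthesis_prompt = ""
```

The `or ""` form is the shortest fix at a call site. The `isinstance` guard is more explicit when the value must be a `str` for a binding that rejects any other type.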

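Commit 8d58d7fc46 in the list above describes an ordering fix: the done event is set only after the completion callbacks have run, so a thread returning from `wait()` can rely on their side effects. The sketch below illustrates that ordering under stated assumptions. `MiniWarmup` is a hypothetical, much simplified stand-in for the real `WarmupManager`, whose actual structure is not shown in the commit message.

```python
import threading


class MiniWarmup:
    """Hypothetical, simplified stand-in for WarmupManager."""

    def __init__(self):
        self._lock = threading.Lock()
        self._done_event = threading.Event()
        self._callbacks = []

    def on_complete(self, cb):
        with self._lock:
            self._callbacks.append(cb)

    def _record_success(self, result):
        # Snapshot the callbacks under the lock, then call them outside it.
        with self._lock:
            callbacks = list(self._callbacks)
        for cb in callbacks:
            try:
                cb(result)
            except Exception:
                pass  # a failing callback must not block completion
        # Set the event last, so wait() cannot return before callbacks finish.
        self._done_event.set()

    def wait(self, timeout=None):
        return self._done_event.wait(timeout)

    def is_done(self):
        return self._done_event.is_set()


received = []
mgr = MiniWarmup()
mgr.on_complete(received.append)
worker = threading.Thread(target=mgr._record_success, args=("ok",))
worker.start()
finished = mgr.wait(timeout=5)
worker.join()
```

Because the event is set after the callback loop, `received` is already populated whenever `wait()` returns `True`. If the `set()` call came before the loop, a caller could wake up first and see an empty list.
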

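Commit 91b34ae81e in the list above adds support for keys such as `show_windows["Project Settings"]` in the Hook API's set and get paths. The following is a rough sketch of how such a key could be split and applied. The regular expression, the function names `split_dict_key` and `set_value`, and the plain-dict `state` are all assumptions for illustration. The project's actual parsing in `_handle_set_value` is not shown in the commit message and may differ.

```python
import re

# Assumed syntax: name["key"] or name['key'], with matching quotes.
_BRACKET_KEY = re.compile(r"""^(?P<name>\w+)\[(?P<q>["'])(?P<key>.*)(?P=q)\]$""")


def split_dict_key(item):
    """Return (dict_name, key) for bracket notation, else None."""
    m = _BRACKET_KEY.match(item)
    if m is None:
        return None
    return m.group("name"), m.group("key")


def set_value(state, settable_fields, item, value):
    """Set a plain field, or one key inside a settable dict field."""
    if item in settable_fields:
        state[item] = value
        return True
    parsed = split_dict_key(item)
    if parsed is None:
        return False
    name, key = parsed
    # Only write through when the outer name is settable and holds a dict.
    if name in settable_fields and isinstance(state.get(name), dict):
        state[name][key] = value
        return True
    return False


state = {"show_windows": {"Project Settings": False}}
ok = set_value(state, {"show_windows"}, 'show_windows["Project Settings"]', True)
rejected = set_value(state, {"show_windows"}, 'unknown["x"]', 1)
```

Checking the outer name against the settable set before writing keeps the bracket syntax from becoming a way to reach fields that were never exposed.
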
2754 Commits