Adds a one-shot `_diag_layout_state` method that runs in `_post_init`
and prints three lines to stderr:
1. `[GUI] show_windows entries: N, visible by default: M` — how many
windows are defined vs. visible with no layout file.
2. `[GUI] visible-by-default windows: ...` — the names of windows
that will appear on a fresh launch.
3. `[GUI] WARNING: layout has N stale window name(s) that no longer
exist: ...` — when the on-disk manualslop_layout.ini references
window names that the current code has dropped (Projects/Files/
Screenshots/Provider/Discussion History/etc. — all replaced by
the hub pattern in earlier refactors).
This addresses the user's observation that:
- "the diagnostics panel still only shows itself"
- "I see a flicker as if the layout got reset but cannot retain
permanence"
Both symptoms are caused by the repo-root manualslop_layout.ini
referencing pre-hub-refactor window names that HelloImGui silently
drops on load. The diagnostic surfaces the root cause in the test
log so the user can see exactly which stale names are present,
without having to manually diff the .ini file.
Verified: log appears in `logs/sloppy_py_test.log` on the next
live_gui test run, including the 11 default-visible windows and
the staleness check.
The repo-root manualslop_layout.ini references pre-hub-refactor
window names that no longer exist in the current code
(Projects/Files/Screenshots/Provider/System Prompts/etc.).
HelloImGui silently drops unknown windows when loading the
layout, causing "missing panels" in live_gui tests and in the
user's interactive session.
The previous "Preserve GUI layout for tests" block copied the
stale repo-root layout into the live_gui workspace, infecting
every live_gui test session with stale state.
Fix: skip the copy. HelloImui will generate a fresh layout in
the test workspace on shutdown, which then lives in the
session-scoped workspace and is cleaned up at teardown.
The repo-root manualslop_layout.ini is still TRACKED (I did
not delete it; that's the user's call). They can:
- Delete it manually, or
- Run the existing "Reset Layout" command from the Command Palette
(which deletes both repo-root and live_gui_workspace paths and
forces HelloImGui to regenerate with the current window catalog).
Verified: 6/6 targeted tests pass.
Four test files had patches/monkeypatches that referenced the
removed src.models.load_config or src.models.CONFIG_PATH module
constant. These all stem from the config I/O refactor (commit
7bcb5a8c) that renamed load_config/save_config to private I/O
primitives.
- tests/test_external_editor_gui.py: 2 sites changed from
monkeypatch.setattr(models_module, 'load_config', ...) to
monkeypatch.setattr('src.app_controller.AppController.load_config', ...)
- tests/test_external_mcp_e2e.py: CONFIG_PATH monkeypatch changed
to SLOP_CONFIG env var (the only supported override path)
- tests/test_log_management_ui.py: same CONFIG_PATH -> SLOP_CONFIG fix
- tests/test_gen_send_empty_context.py: _StubController now receives
ui_selected_context_files and _pending_generation_action from the
app_instance BEFORE being assigned as controller (App.__getattr__
delegates to controller, so attrs must be on the stub first)
Also: deleted tests/artifacts/manualslop_layout.ini (gitignored
stale file from March 4 referencing pre-refactor window names like
"Projects"/"Files"/"Screenshots" that no longer exist in the code).
Repo-root manualslop_layout.ini still references the same old
window names; user should run the existing "Reset Layout" command
(or delete it manually) to regenerate with the current window
catalog (Context Hub / AI Settings Hub / Discussion Hub / etc.).
Verified: 13 targeted tests pass:
- test_external_editor_gui.py (5/5)
- test_external_mcp_e2e.py (1/1)
- test_log_management_ui.py (2/2)
- test_gen_send_empty_context.py (5/5)
Eliminates 22 call sites that bypassed the AppController state owner
and read/wrote config.toml directly. AppController is now the single
source of truth for self.config; gui_2.py, commands.py, etc. go
through controller.save_config() / controller.load_config().
Production changes:
- src/models.py: rename load_config -> _load_config_from_disk,
save_config -> _save_config_to_disk (private I/O primitives)
- src/app_controller.py: add public load_config()/save_config() methods
that own the state. Update 3 internal call sites and 3 ConductorEngine
call sites to pass max_workers from self.config
- src/multi_agent_conductor.py: ConductorEngine.__init__ now takes
max_workers as a parameter (caller responsibility, not I/O primitive)
- src/external_editor.py: get_default_launcher() takes config as a
parameter; gui_2.py:1311,4776 pass app.config
- src/gui_2.py: 17 sites of models.save_config(X.config) replaced with
X.save_config() (delegates via __getattr__ to controller)
- src/commands.py: save_all() uses app.save_config()
Test changes (route through controller, not I/O primitive):
- tests/conftest.py: mock_app and app_instance fixtures now patch
AppController.load_config/save_config instead of models I/O primitives
- 18 other test files: patches renamed from models._save_config_to_disk
to AppController.save_config (and same for load_config)
- tests/test_app_controller_mcp.py: use SLOP_CONFIG env var instead of
patching removed CONFIG_PATH module constant
- tests/test_parallel_execution.py: pass max_workers=2 explicitly to
ConductorEngine (caller no longer reads config)
- tests/test_gui_paths.py: add save_config=MagicMock() to MockApp;
assert on controller method, not I/O primitive
- tests/test_models_no_top_level_tomli_w.py: still calls private
_save_config_to_disk directly (the only allowed exception; tests
the lazy-load behavior of the primitive itself)
New files:
- scripts/audit_no_models_config_io.py: enforces the rule (--strict,
--json modes; AST-based docstring detection to avoid false positives)
- conductor/code_styleguides/config_state_owner.md: documents the rule
Verification:
- 67 targeted tests pass
- scripts/audit_no_models_config_io.py --strict returns 0
This is the architectural cleanup that surfaced during the
audit_architectural_cheats_20260607 review. Closes the smoke-gun
CONFIG_PATH module constant (already done in 0c7ebf22) AND the
free-function models.load_config/save_config smell.
[conductor(checkpoint): config-iO-refactor-20260607]
ROOT CAUSE: src/models.py had `CONFIG_PATH = get_config_path()`
at module level. Every test that imported `src.models` and called
`save_config()` or `load_config()` wrote/read the repo-root
`config.toml` via this cached constant. The path was resolved
once at import time, so the SLOP_CONFIG env var (or test
fixtures) couldn't redirect reads/writes without reimporting the
module.
This silently corrupted the user's config.toml on every test
run. The diff between runs showed: 'config.toml changed in
working copy' — caused by tests, not the user.
FIX: remove the module-level constant; call get_config_path()
on every read/write call. SLOP_CONFIG (and any test-time
set_config_path() helper) now works without reimport.
Also: keep my prior commits to this file (reset_layout command
in src/commands.py; the RUN_MMA_INTEGRATION skipif in
test_mma_step_mode_sim.py) bundled here for a clean atomic
fix-pack since the user just fixed the indentation issue I had.
Verified: src.models imports cleanly; load_config/save_config
work as expected. Tests that import these functions will
use whatever SLOP_CONFIG points to (or the repo-root default).
sloppy.py crashed in render_context_presets at line 3469 with
TypeError: input_text(): incompatible function arguments.
The second arg getattr(app, "ui_new_context_preset_name", "")
returned None because the attribute EXISTS but is None — the
default "" only fires for missing attributes.
The App's __setattr__ delegates to the AppController when the
controller has the attribute. The controller's init can leave
ui_new_context_preset_name as None (via setattr from a plugin
or a config flush). The defensive getattr doesn't help in that
case.
Fix: append `or ""` to coerce None and empty-string to "" so
imgui.input_text always gets a valid str.
Verified by the previously-failing batched tests (test_command_palette_sim, test_auto_switch_sim, test_live_warmup_canaries_endpoint, test_conductor_api_hook_integration): all 12 now pass.
sloppy.py crashed on startup at gui_2.py:4006 with
TypeError: input_text_multiline(): incompatible function arguments.
The second positional arg (app.ui_synthesis_prompt) was None
when it should be str.
Root cause: the defensive guards
if not hasattr(app, 'ui_synthesis_prompt'):
app.ui_synthesis_prompt = ""
only fire if the attribute is MISSING — if it's set to None
elsewhere (e.g. via setattr from a config flush, or a plugin
side-effect), hasattr returns True and the value stays None.
Fix in 3 places:
1. App.__init__: initialize ui_synthesis_prompt = "" and
ui_synthesis_selected_takes = {} at construction time
alongside related context state (line 456).
2. render_synthesis_panel (line ~4002): harden the guard to
check isinstance(getattr(...), str) — fixes the same
pattern at its first call site.
3. render_takes_panel (line ~4139): same hardening at the
second call site.
Verified by constructing App() in a fresh subprocess and
inspecting the attributes (ui_synthesis_prompt == "" and
ui_synthesis_selected_takes == {} both before and after
init_state()).
Manual smoke test: previously the app crashed before any
window was visible; now it renders the first frame.
Cross-link the new Skip-Marker Policy section in
conductor/workflow.md into AGENTS.md's "Critical Anti-Patterns"
list. The pattern is: agent hits a pre-existing failure, marks
it skip, moves on; suite rots; user has to track down each one
later. The full policy lives in workflow.md (with the 4-question
review checklist). AGENTS.md gets a one-line pointer so the
rule is at the top of every agent's context.
Rule applies in-session: when the fix is reachable within
~30 min of investigation, FIX IT INSTEAD of skipping.
Per 2026-06-07 user feedback during test_suite cleanup:
"if the intent is to annotate a known failure, fine. But that
known failure must be addressed with priority."
New section between "Per-Task Decision Protocol" and
"Documentation Refresh Protocol" makes the policy explicit:
- Skip markers are DOCUMENTATION, not avoidance
- They're useful for opt-in integration tests, unimplemented
features, or feature-flag-gated code
- They're NOT useful for pre-existing failures, "I don't
understand this" issues, or racy tests the agent doesn't want
to debug
- When adding a marker, MUST document the underlying issue AND
what the fix would be
- When the fix is in-session reachable, FIX IT INSTEAD of
skipping — limited context is not an excuse
Includes a 4-question review checklist before adding a skip.
References the existing AGENTS.md "Use skip markers as excuse to
AVOID" rule so the two policies don't drift.
The test had a pre-existing race: it monkeypatched
_rebuild_rag_index and _flush_to_project to no-ops, which made
_do_project_switch complete synchronously inside the io_pool
worker. By the time the test's _api_generate call ran
is_project_stale() was already False (the worker had cleared
_project_switch_in_progress), so the 409 contract was never
exercised.
Fix: replace the no-op lambdas with `lambda: time.sleep(0.5)`.
This keeps the worker busy for 500ms, which is more than enough
window for the test to call _api_generate and observe the
stale flag. _wait_for_switch then drains the rest of the work.
Also: removed the @pytest.mark.skip marker; the underlying issue
is now fixed in the test.
Verified: 9/9 in tests/test_project_switch_persona_preset.py pass
(previously 8 passed + 1 skipped).
The Hook API previously rejected key strings like
'show_windows["Project Settings"]' (and silently returned None on
get). The test_live_gui_filedialog_regression test exercises exactly
this pattern to open the Project Settings window via the Hook API;
it was previously marked skip with "hook server doesn't handle the
dict-key bracket-notation syntax".
Fix in three small places:
1. src/app_controller.py:_handle_set_value
If `item` is not in _settable_fields, try parsing it as
`dict_name[<key>]` notation. If dict_name IS in _settable_fields
and the current attr is a dict, set the inner key.
2. src/api_hooks.py:/api/gui/value (POST get_val)
Mirror the parsing for the field-based get endpoint.
3. src/api_hook_client.py:ApiHookClient.get_value
Mirror the parsing in the client so the dict-key syntax works
through the state endpoint as well (which is what get_value
actually calls by default).
Test fix:
- tests/test_live_gui_filedialog_regression.py: removed the
@pytest.mark.skip marker; the underlying issue is now fixed.
Verified: 1/1 test passes (previously skipped).
WarmupManager._record_success and _record_failure used to set
self._done_event.set() inside the with self._lock: block, BEFORE
calling the user-registered on_complete callbacks. This created
a race: a test thread calling mgr.wait() could observe
mgr.is_done() == True and proceed before the worker thread had
finished firing the callbacks. The mgr.on_complete caller would
then assert on state that the callback was supposed to mutate
(e.g. test_warmup_on_complete_callback_fires' `received` list).
Fix: move self._done_event.set() to AFTER the for cb in callbacks:
loop in both _record_success and _record_failure. The done event
is now set last, so wait() cannot return until all callbacks
have completed (or raised, which is swallowed by the try/except).
ALSO fix the previously-corrupted state of warmup.py (the result
of a misused set_file_slice edit that left orphaned code with no
def line for _record_failure). _record_failure is now a proper
class method with the def line restored.
ALSO fix tests/test_warmup.py:
- test_warmup_on_complete_callback_fires: the test body was
missing the pool/mgr setup. Added the missing lines.
- test_warmup_done_event_set_after_all_complete: removed the
racy `assert not mgr.is_done()` assertion that fires
immediately after submit. On a fast machine, os/sys warmup
completes in microseconds, so is_done() is already True
by the time the assertion runs. The remaining assertion
(`assert mgr.is_done()` after wait) still tests the
semantic that the done event is set after completion.
- Removed both `@pytest.mark.skip` markers; the underlying
issues are now fixed in production code AND the tests.
Verified: 10/10 tests in tests/test_warmup.py pass (previously
2 skipped, 2 failed).
test_gui_events_v2::test_handle_generate_send_pushes_event was
patches 'threading.Thread' but production code in
src/app_controller.py:_handle_generate_send uses
self._io_pool.submit_io(worker) (an AppController method, NOT a
method on the ThreadPoolExecutor). The test never got to its
assertions because the patched attribute was never called.
Fix: update the test to patch `mock_gui.controller.submit_io`
(the AppController method). The `with patch.object(...)` block
replaces submit_io with a MagicMock; calling _handle_generate_send
now runs the worker synchronously (extracted via
mock_submit.call_args[0][0]).
ALSO: initialize _project_switch_in_progress and
_project_switch_pending_path in AppController.__init__. They were
previously set only inside _switch_project and _do_project_switch,
so a fresh AppController() didn't have them and is_project_stale()
would raise AttributeError. is_project_stale is also now
getattr-based (defaulting to False) for additional safety.
ALSO: remove the @pytest.mark.skip marker from the test since
the underlying issue is now fixed.
Verified: tests/test_gui_events_v2.py 3/3 pass (previously 1 skipped).
Phase 4 verification complete: 4 atomic commits landed, 28
unit + integration tests passing, the audit script runs
end-to-end against the post-cleanup repo, --strict mode
+ baseline file wired in as the CI gate. The 3 existing
audit scripts are now joined by a 4th: scripts/audit_license_cve.py.
Scope: third-party deps only. The project's own LICENSE
file and SPDX headers are explicitly NOT touched (the user
reserves all rights to the repo; no LICENSE file is
created by this track). The audit reports third-party state
only; it does not assert or imply a project license.
Commits:
a8ae11d3 - chore(audit): add license_cve audit script + initial report
20fa3558 - chore(deps): tilde-pin all deps; delete requirements.txt
a7ab994f - chore(audit): add --strict mode + baseline file (CI gate)
(this) - conductor(tracks): mark track complete
scripts/audit_license_cve.baseline.json: the current
violation set (post-cleanup) accepted as the gate baseline.
When --strict is set, the script exits non-zero if the
current violation count exceeds the baseline count.
To regenerate the baseline after an intentional change
(e.g., adding a new dep with an acceptable license), run:
uv run python -m scripts.audit_license_cve --dump-baseline
Also fixes the baseline path: it now lives next to the script
(Path(__file__).parent) instead of the wrong location under
docs/reports/scripts/. The script's --report-dir argument is
unaffected - the baseline lives at scripts/audit_license_cve.baseline.json
regardless of the report directory.
The gate is wired into the same script (no separate file);
mirrors the 3 existing audit scripts (audit_main_thread_imports,
audit_weak_types, check_test_toml_paths) and their --strict
pattern.
28 unit + integration tests passing.