undo_redo_lifecycle_fix_20260605 — Design

Date: 2026-06-05 Status: Draft Track: undo_redo_lifecycle_fix_20260605 (sub-project of v2)

Problem Statement

tests/test_undo_redo_sim.py::test_undo_redo_lifecycle failed in the 2026-06-05 second batched test run (after the first run had it passing). The test:

Sets temperature=0.5 and ai_input="Initial Input".
Modifies to temperature=1.5 and ai_input="Modified Input".
Asserts current state — passes.
Clicks btn_undo.
Asserts ai_input == "Initial Input" and temperature == 0.5.
Fails on the ai_input assertion: gets '' (empty string).

The undo restores temperature correctly but not ai_input. The other 2 tests in the same file (test_undo_redo_discussion_mutation, test_undo_redo_context_mutation) pass — they don't exercise ai_input.

Possible causes

App/Controller state sync bug for ai_input. The snapshot at src/gui_2.py:551 reads self.ui_ai_input (App), but set_value writes to controller.ui_ai_input. The snapshot captures the App's (stale) value. This should be fixed by the live_gui_state_sync_20260605 track (which adds an ui_ai_input property on the App that delegates to the Controller).
Snapshot doesn't include ai_input field at all. Check src/history.py:UISnapshot — if ai_input isn't a field, the snapshot stores nothing, and the apply can't restore.
Test flake. The test was passing in the first run, failing in the second. The live_gui fixture is session-scoped, and different test orders can produce different state. The test's time.sleep(2.0) after btn_undo may not be enough if the GUI is under load.
Recent user commit 873edf42 regression. The user's cleanup commit touched 53 files including src/gui_2.py:261 lines. If the cleanup accidentally changed the snapshot mechanism, this could break the test.

Design

Approach: Two-phase investigation

Phase 1: Re-run the test after the live_gui_state_sync_20260605 track lands.

If the state-sync property fix for ui_ai_input unblocks the test, the issue is resolved. No further work needed.

Phase 2: If the test still fails, deep-dive into the snapshot mechanism.

Investigate in this order:

Check src/history.py:UISnapshot to see if ai_input is a field. If not, add it.
Check src/gui_2.py:_apply_snapshot to see if it restores ai_input. If not, add the restore line.
Check if there's a per-tick snapshot filter that excludes certain fields.
Add a regression test that explicitly verifies the snapshot/undo round-trip for ai_input.

Phase 3: If still failing, test-ordering / flake investigation.

The test uses time.sleep(2.0) after btn_undo. Convert to polling (wait_for_load_completion from the wait_for_ready_test_pattern_20260605 track). If the test passes with polling, it was a flake.

Why this approach

Sequential investigation: cheapest fixes first. State-sync is the most likely cause (it just landed as a property fix). Snapshot mechanism is the second most likely. Flake is the third.
No speculative changes: don't add ai_input to the snapshot if it's already there. Don't change the undo mechanism if the state-sync fix is sufficient.

File Changes

Phase 1: None (state-sync fix is in a different track)

Phase 2 (if needed):

Modify: src/history.py (add ai_input field to UISnapshot if missing)
Modify: src/gui_2.py:_apply_snapshot (add ai_input restore line if missing)
Add: tests/test_undo_redo_ai_input_snapshot.py (regression test for the round-trip)

Phase 3 (if needed):

Modify: tests/test_undo_redo_sim.py (replace time.sleep(2.0) with wait_for_load_completion)

Risk Assessment

Risk	Likelihood	Impact	Mitigation
Phase 1 fixes the issue	High	None	Done
Phase 2 needed: snapshot already has ai_input but apply doesn't restore	Medium	Low	Check first, then add the restore line
Phase 2 needed: snapshot doesn't have ai_input	Low	Low	Add the field + apply line
Phase 3 needed: it's a flake	Low	None	Replace sleeps with polling

Out of Scope

State sync fix — separate track (live_gui_state_sync_20260605).
prior_session test — separate track (prior_session_test_harden_20260605).
wait_for_ready pattern — separate track (wait_for_ready_test_pattern_20260605).
General undo/redo system improvements — out of scope.

4.4 KiB Raw Permalink Blame History