Private
Public Access
0
0
Files
manual_slop/docs/superpowers/specs/2026-06-05-undo-redo-lifecycle-fix-design.md

4.4 KiB

undo_redo_lifecycle_fix_20260605 — Design

Date: 2026-06-05 Status: Draft Track: undo_redo_lifecycle_fix_20260605 (sub-project of v2)

Problem Statement

tests/test_undo_redo_sim.py::test_undo_redo_lifecycle failed in the 2026-06-05 second batched test run (after the first run had it passing). The test:

  1. Sets temperature=0.5 and ai_input="Initial Input".
  2. Modifies to temperature=1.5 and ai_input="Modified Input".
  3. Asserts current state — passes.
  4. Clicks btn_undo.
  5. Asserts ai_input == "Initial Input" and temperature == 0.5.
  6. Fails on the ai_input assertion: gets '' (empty string).

The undo restores temperature correctly but not ai_input. The other 2 tests in the same file (test_undo_redo_discussion_mutation, test_undo_redo_context_mutation) pass — they don't exercise ai_input.

Possible causes

  1. App/Controller state sync bug for ai_input. The snapshot at src/gui_2.py:551 reads self.ui_ai_input (App), but set_value writes to controller.ui_ai_input. The snapshot captures the App's (stale) value. This should be fixed by the live_gui_state_sync_20260605 track (which adds an ui_ai_input property on the App that delegates to the Controller).

  2. Snapshot doesn't include ai_input field at all. Check src/history.py:UISnapshot — if ai_input isn't a field, the snapshot stores nothing, and the apply can't restore.

  3. Test flake. The test was passing in the first run, failing in the second. The live_gui fixture is session-scoped, and different test orders can produce different state. The test's time.sleep(2.0) after btn_undo may not be enough if the GUI is under load.

  4. Recent user commit 873edf42 regression. The user's cleanup commit touched 53 files including src/gui_2.py:261 lines. If the cleanup accidentally changed the snapshot mechanism, this could break the test.

Design

Approach: Two-phase investigation

Phase 1: Re-run the test after the live_gui_state_sync_20260605 track lands.

If the state-sync property fix for ui_ai_input unblocks the test, the issue is resolved. No further work needed.

Phase 2: If the test still fails, deep-dive into the snapshot mechanism.

Investigate in this order:

  1. Check src/history.py:UISnapshot to see if ai_input is a field. If not, add it.
  2. Check src/gui_2.py:_apply_snapshot to see if it restores ai_input. If not, add the restore line.
  3. Check if there's a per-tick snapshot filter that excludes certain fields.
  4. Add a regression test that explicitly verifies the snapshot/undo round-trip for ai_input.

Phase 3: If still failing, test-ordering / flake investigation.

The test uses time.sleep(2.0) after btn_undo. Convert to polling (wait_for_load_completion from the wait_for_ready_test_pattern_20260605 track). If the test passes with polling, it was a flake.

Why this approach

  • Sequential investigation: cheapest fixes first. State-sync is the most likely cause (it just landed as a property fix). Snapshot mechanism is the second most likely. Flake is the third.
  • No speculative changes: don't add ai_input to the snapshot if it's already there. Don't change the undo mechanism if the state-sync fix is sufficient.

File Changes

Phase 1: None (state-sync fix is in a different track)

Phase 2 (if needed):

  • Modify: src/history.py (add ai_input field to UISnapshot if missing)
  • Modify: src/gui_2.py:_apply_snapshot (add ai_input restore line if missing)
  • Add: tests/test_undo_redo_ai_input_snapshot.py (regression test for the round-trip)

Phase 3 (if needed):

  • Modify: tests/test_undo_redo_sim.py (replace time.sleep(2.0) with wait_for_load_completion)

Risk Assessment

Risk Likelihood Impact Mitigation
Phase 1 fixes the issue High None Done
Phase 2 needed: snapshot already has ai_input but apply doesn't restore Medium Low Check first, then add the restore line
Phase 2 needed: snapshot doesn't have ai_input Low Low Add the field + apply line
Phase 3 needed: it's a flake Low None Replace sleeps with polling

Out of Scope

  • State sync fix — separate track (live_gui_state_sync_20260605).
  • prior_session test — separate track (prior_session_test_harden_20260605).
  • wait_for_ready pattern — separate track (wait_for_ready_test_pattern_20260605).
  • General undo/redo system improvements — out of scope.