From 85f8f08f42ddd3665733be335786b93f47d3f724 Mon Sep 17 00:00:00 2001
From: Ed_
Date: Mon, 23 Feb 2026 20:10:22 -0500
Subject: [PATCH] chore(conductor): Archive track 'live_ux_test_20260223'

---
 .../archive/live_ux_test_20260223/index.md    |  5 +++
 .../live_ux_test_20260223/metadata.json       |  8 ++++
 .../archive/live_ux_test_20260223/plan.md     | 40 +++++++++++++++++++
 .../archive/live_ux_test_20260223/spec.md     | 37 +++++++++++++++++
 conductor/tracks.md                           |  5 ---
 5 files changed, 90 insertions(+), 5 deletions(-)
 create mode 100644 conductor/archive/live_ux_test_20260223/index.md
 create mode 100644 conductor/archive/live_ux_test_20260223/metadata.json
 create mode 100644 conductor/archive/live_ux_test_20260223/plan.md
 create mode 100644 conductor/archive/live_ux_test_20260223/spec.md

diff --git a/conductor/archive/live_ux_test_20260223/index.md b/conductor/archive/live_ux_test_20260223/index.md
new file mode 100644
index 0000000..f01ee93
--- /dev/null
+++ b/conductor/archive/live_ux_test_20260223/index.md
@@ -0,0 +1,5 @@
+# Track live_ux_test_20260223 Context
+
+- [Specification](./spec.md)
+- [Implementation Plan](./plan.md)
+- [Metadata](./metadata.json)
diff --git a/conductor/archive/live_ux_test_20260223/metadata.json b/conductor/archive/live_ux_test_20260223/metadata.json
new file mode 100644
index 0000000..7fd40d7
--- /dev/null
+++ b/conductor/archive/live_ux_test_20260223/metadata.json
@@ -0,0 +1,8 @@
+{
+  "track_id": "live_ux_test_20260223",
+  "type": "feature",
+  "status": "new",
+  "created_at": "2026-02-23T19:14:00Z",
+  "updated_at": "2026-02-23T19:14:00Z",
+  "description": "Make a human-like test ux interaction where the AI creates a small python project, engages in a 5-turn discussion, and verifies history/session management features via API hooks."
+}
\ No newline at end of file
diff --git a/conductor/archive/live_ux_test_20260223/plan.md b/conductor/archive/live_ux_test_20260223/plan.md
new file mode 100644
index 0000000..55bca8a
--- /dev/null
+++ b/conductor/archive/live_ux_test_20260223/plan.md
@@ -0,0 +1,40 @@
+# Implementation Plan: Human-Like UX Interaction Test
+
+## Phase 1: Infrastructure & Automation Core [checkpoint: 7626531]
+Establish the foundation for driving the GUI via API hooks and simulation logic.
+
+- [x] Task: Extend `ApiHookClient` with methods for tab switching and listbox selection if missing. f36d539
+- [x] Task: Implement `TestUserAgent` class to manage dynamic response generation and action delays. d326242
+- [x] Task: Write Tests (Verify basic hook connectivity and simulated delays) f36d539
+- [x] Task: Implement basic 'ping-pong' interaction via hooks. bfe9ef0
+- [x] Task: Harden API hook thread-safety and simplify GUI state polling. 8bd280e
+- [x] Task: Conductor - User Manual Verification 'Phase 1: Infrastructure & Automation Core' (Protocol in workflow.md) 7626531
+
+## Phase 2: Workflow Simulation [checkpoint: 9c4a72c]
+Build the core interaction loop for project creation and AI discussion.
+
+- [x] Task: Implement 'New Project' scaffolding script (creating a tiny console program). bd5dc16
+- [x] Task: Implement 5-turn discussion loop logic with sub-agent responses. bd5dc16
+- [x] Task: Write Tests (Verify state changes in Discussion Hub during simulated chat) 6d16438
+- [x] Task: Implement 'Thinking' and 'Live' indicator verification logic. 6d16438
+- [x] Task: Conductor - User Manual Verification 'Phase 2: Workflow Simulation' (Protocol in workflow.md) 9c4a72c
+
+## Phase 3: History & Session Verification [checkpoint: 0f04e06]
+Simulate complex session management and historical audit features.
+
+- [x] Task: Implement discussion switching logic (creating/switching between named discussions). 5e1b965
+- [x] Task: Implement 'Load Prior Log' simulation and 'Tinted Mode' detection. 5e1b965
+- [x] Task: Write Tests (Verify log loading and tab navigation consistency) 5e1b965
+- [x] Task: Implement truncation limit verification (forcing a long history and checking bleed). 5e1b965
+- [x] Task: Conductor - User Manual Verification 'Phase 3: History & Session Verification' (Protocol in workflow.md) 0f04e06
+
+## Phase 4: Final Integration & Regression [checkpoint: 8e63b31]
+Consolidate the simulation into end-user artifacts and CI tests.
+
+- [x] Task: Create `live_walkthrough.py` with full visual feedback and manual sign-off. 8bd280e
+- [x] Task: Create `tests/test_live_workflow.py` for automated regression testing. 8bd280e
+- [x] Task: Perform a full visual walkthrough and verify 'human-readable' pace. 8e63b31
+- [x] Task: Conductor - User Manual Verification 'Phase 4: Final Integration & Regression' (Protocol in workflow.md) 8e63b31
+
+## Phase: Review Fixes
+- [x] Task: Apply review suggestions 064d7ba
diff --git a/conductor/archive/live_ux_test_20260223/spec.md b/conductor/archive/live_ux_test_20260223/spec.md
new file mode 100644
index 0000000..e4a7a96
--- /dev/null
+++ b/conductor/archive/live_ux_test_20260223/spec.md
@@ -0,0 +1,37 @@
+# Specification: Human-Like UX Interaction Test
+
+## Overview
+This track implements a robust, "human-like" interaction test suite for Manual Slop. The suite will simulate a real user's workflow—from project creation to complex AI discussions and history management—using the application's API hooks. It aims to verify the "Integrated Workspace" functionality, tool execution, and history persistence without requiring manual human input, while remaining slow enough for visual audit.
+
+## Scope
+- **Standalone Interactive Test**: A Python script (`live_walkthrough.py`) that drives the GUI through a full session, ending with an optional manual sign-off.
+- **Automated Regression Test**: A pytest integration (`tests/test_live_workflow.py`) that executes the same logic in a headless or automated fashion for CI.
+- **Target Model**: Google Gemini Flash 2.5.
+
+## Functional Requirements
+1. **User Simulation**:
+    - **Dynamic Messaging**: The test agent will generate responses based on the AI's output to simulate a multi-turn conversation.
+    - **Tactile Delays**: Short, random delays (minimum 0.5s) between actions to simulate reading and "typing" time.
+    - **Visual Feedback**: Automatic scrolling of the discussion history and comms logs to keep the "live" action in view.
+2. **Workflow Scenarios**:
+    - **Project Scaffolding**: Create a new project and initialize a tiny console-based Python program.
+    - **Discussion Loop**: Engage in a ~5-turn conversation with the AI to refine the code.
+    - **Context Management**: Verify that tool calls (filesystem, shell) are reflected correctly in the Comms and Tool Log tabs.
+    - **History Depth**: Verify truncation limits and switching between named discussions.
+3. **Session Management**:
+    - **Tab Interaction**: Programmatically switch between "Comms Log" and "Tool Log" tabs during operations.
+    - **Historical Audit**: Use the "Load Session Log" feature to load a prior log file and verify "Tinted Mode" visibility.
+
+## Non-Functional Requirements
+- **Efficiency**: Minimize token usage by using Gemini Flash and keeping the "User" prompts concise.
+- **Observability**: The standalone test must be clearly visible to a human observer, with state changes occurring at a "human-readable" pace.
+
+## Acceptance Criteria
+- `live_walkthrough.py` successfully completes a 5-turn discussion and signs off.
+- `tests/test_live_workflow.py` passes in CI environment.
+- Prior session logs are loaded and visualized without crashing.
+- Thinking and Live indicators trigger correctly during simulated API calls.
+
+## Out of Scope
+- Support for Anthropic API in this specific test track.
+- Stress testing high-concurrency tool calls.
diff --git a/conductor/tracks.md b/conductor/tracks.md
index ef17fc0..762cb38 100644
--- a/conductor/tracks.md
+++ b/conductor/tracks.md
@@ -7,11 +7,6 @@ This file tracks all major tracks for the project. Each track has its own detail
 - [x] **Track: Implement context visualization and memory management improvements**
 *Link: [./tracks/context_management_20260223/](./tracks/context_management_20260223/)*
 
----
-
-- [x] **Track: Make a human-like test ux interaction where the AI creates a small python project, engages in a 5-turn discussion, and verifies history/session management features via API hooks.**
-*Link: [./tracks/live_ux_test_20260223/](./tracks/live_ux_test_20260223/)*
-