docs: Reorder track queue and initialize final stabilization tracks

- Initialize asyncio_decoupling_refactor_20260306 track
- Initialize mock_provider_hardening_20260305 track
- Initialize simulation_fidelity_enhancement_20260305 track
- Update TASKS.md and tracks.md to reflect new strict execution queue
- Archive completed tracks and remove deprecated test performance track
@@ -8,34 +8,43 @@ This file tracks all major tracks for the project. Each track has its own detail
 
 *The following tracks MUST be executed in this exact order to safely resolve tech debt before feature development.*
 
-1. [x] **Track: Codebase Migration to `src` & Cleanup**
-*Link: [./tracks/codebase_migration_20260302/](./tracks/codebase_migration_20260302/)*
-
-2. [x] **Track: GUI Decoupling & Controller Architecture**
-*Link: [./tracks/gui_decoupling_controller_20260302/](./tracks/gui_decoupling_controller_20260302/)*
-
-3. [ ] **Track: Hook API UI State Verification**
+1. [ ] **Track: Hook API UI State Verification**
 *Link: [./tracks/hook_api_ui_state_verification_20260302/](./tracks/hook_api_ui_state_verification_20260302/)*
 
+2. [ ] **Track: Asyncio Decoupling & Queue Refactor**
+*Link: [./tracks/asyncio_decoupling_refactor_20260306/](./tracks/asyncio_decoupling_refactor_20260306/)*
+
+3. [ ] **Track: Mock Provider Hardening**
+*Link: [./tracks/mock_provider_hardening_20260305/](./tracks/mock_provider_hardening_20260305/)*
+
 4. [ ] **Track: Robust JSON Parsing for Tech Lead**
 *Link: [./tracks/robust_json_parsing_tech_lead_20260302/](./tracks/robust_json_parsing_tech_lead_20260302/)*
 
 5. [ ] **Track: Concurrent Tier Source Isolation**
 *Link: [./tracks/concurrent_tier_source_tier_20260302/](./tracks/concurrent_tier_source_tier_20260302/)*
 
-6. [ ] **Track: Test Suite Performance & Flakiness**
-*Link: [./tracks/test_suite_performance_and_flakiness_20260302/](./tracks/test_suite_performance_and_flakiness_20260302/)*
-
-7. [ ] **Track: Manual UX Validation & Polish**
+6. [ ] **Track: Manual UX Validation & Polish**
 *Link: [./tracks/manual_ux_validation_20260302/](./tracks/manual_ux_validation_20260302/)*
 
-8. [ ] **Track: Asynchronous Tool Execution Engine**
+7. [ ] **Track: Asynchronous Tool Execution Engine**
 *Link: [./tracks/async_tool_execution_20260303/](./tracks/async_tool_execution_20260303/)*
 
+8. [ ] **Track: Simulation Fidelity Enhancement**
+*Link: [./tracks/simulation_fidelity_enhancement_20260305/](./tracks/simulation_fidelity_enhancement_20260305/)*
+
 ---
 
 ## Completed / Archived
 
+- [x] **Track: Test Architecture Integrity Audit**
+*Link: [./archive/test_architecture_integrity_audit_20260304/](./archive/test_architecture_integrity_audit_20260304/)*
+
+- [x] **Track: Codebase Migration to `src` & Cleanup**
+*Link: [./archive/codebase_migration_20260302/](./archive/codebase_migration_20260302/)*
+
+- [x] **Track: GUI Decoupling & Controller Architecture**
+*Link: [./archive/gui_decoupling_controller_20260302/](./archive/gui_decoupling_controller_20260302/)*
+
 - [x] **Track: Strict Static Analysis & Type Safety**
 *Link: [./archive/strict_static_analysis_and_typing_20260302/](./archive/strict_static_analysis_and_typing_20260302/)*
 
@@ -73,4 +82,4 @@ This file tracks all major tracks for the project. Each track has its own detail
 *Link: [./archive/documentation_refresh_20260224/](./archive/documentation_refresh_20260224/)*
 
 - [x] **Track: Robust Live Simulation Verification**
-*Link: [./archive/robust_live_simulation_verification/](./archive/robust_live_simulation_verification/)*
+*Link: [./archive/robust_live_simulation_verification/](./archive/robust_live_simulation_verification/)*
@@ -0,0 +1,8 @@
{
  "id": "asyncio_decoupling_refactor_20260306",
  "title": "Asyncio Decoupling & Queue Refactor",
  "description": "Rip out asyncio from AppController to eliminate test deadlocks.",
  "status": "planned",
  "created_at": "2026-03-05T00:00:00Z",
  "updated_at": "2026-03-05T00:00:00Z"
}
@@ -0,0 +1,33 @@
# Implementation Plan: Asyncio Decoupling Refactor (asyncio_decoupling_refactor_20260306)

> **TEST DEBT FIX:** This track is responsible for permanently eliminating the `RuntimeError: Event loop is closed` test suite crashes by ripping out the conflict-prone asyncio loops from the AppController.

## Phase 1: Event System Migration
- [ ] Task: Initialize MMA Environment `activate_skill mma-orchestrator`
- [ ] Task: Refactor `events.py`
    - [ ] WHERE: `src/events.py`
    - [ ] WHAT: Replace `AsyncEventQueue` with `SyncEventQueue` using `import queue`.
    - [ ] HOW: Change `async def get()` to a blocking `def get()`. Remove `asyncio` imports.
    - [ ] SAFETY: Ensure thread-safety.
- [ ] Task: Conductor - User Manual Verification 'Phase 1: Event System'

## Phase 2: AppController Decoupling
- [ ] Task: Refactor `AppController` Event Loop
    - [ ] WHERE: `src/app_controller.py`
    - [ ] WHAT: Remove `self._loop` and `asyncio.new_event_loop()`.
    - [ ] HOW: Change `_run_event_loop` to just call `_process_event_queue` directly (which will now block on queue gets).
    - [ ] SAFETY: Ensure `shutdown()` properly signals the queue to unblock and join the thread.
- [ ] Task: Thread Task Dispatching
    - [ ] WHERE: `src/app_controller.py`
    - [ ] WHAT: Replace `asyncio.run_coroutine_threadsafe(self.event_queue.put(...))` with direct synchronous `.put()`. Replace `self._loop.run_in_executor` with `threading.Thread(target=self._handle_request_event)`.
    - [ ] HOW: Mechanical replacement of async primitives.
    - [ ] SAFETY: None.
- [ ] Task: Conductor - User Manual Verification 'Phase 2: Decoupling'

## Phase 3: Final Validation
- [ ] Task: Full Suite Validation
    - [ ] WHERE: Project root
    - [ ] WHAT: `uv run pytest`
    - [ ] HOW: Ensure 100% pass rate with no hanging threads or event loop errors.
    - [ ] SAFETY: None.
- [ ] Task: Conductor - User Manual Verification 'Phase 3: Final Validation'
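
The Phase 2 SAFETY item (have `shutdown()` unblock the queue and join the thread) could be sketched as follows. This is a minimal illustration, assuming a sentinel object is used to wake the blocked worker. `ControllerSketch`, `_SHUTDOWN`, and the `handled` list are hypothetical names; only `event_queue`, `_process_event_queue`, and `shutdown()` appear in the plan, and the real `AppController` may be structured differently.

```python
import queue
import threading

# Hypothetical sentinel: putting it on the queue wakes the blocked worker.
_SHUTDOWN = object()


class ControllerSketch:
    """Hypothetical stand-in for the AppController event thread."""

    def __init__(self) -> None:
        self.event_queue: queue.Queue = queue.Queue()
        self.handled: list = []  # records processed events for the demo
        self._thread = threading.Thread(
            target=self._process_event_queue, daemon=True
        )
        self._thread.start()

    def _process_event_queue(self) -> None:
        while True:
            event = self.event_queue.get()  # blocks until an event arrives
            if event is _SHUTDOWN:
                break  # leave the loop so join() can return
            self.handled.append(event)

    def shutdown(self, timeout: float = 2.0) -> bool:
        """Signal the worker to stop; return True if the thread exited."""
        self.event_queue.put(_SHUTDOWN)
        self._thread.join(timeout)
        return not self._thread.is_alive()


controller = ControllerSketch()
controller.event_queue.put("refresh")
controller.event_queue.put("response")
stopped = controller.shutdown()
```

Because `queue.Queue` is FIFO, events queued before the sentinel are still processed before the thread exits, so a test can assert on both the handled events and a clean join.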
@@ -0,0 +1,14 @@
# Specification: Asyncio Decoupling & Refactor

## Background
The `AppController` currently utilizes an internal `asyncio.Queue` and a dedicated `_loop_thread` to manage background tasks and GUI updates. As identified in the `test_architecture_integrity_audit_20260304`, this architecture leads to severe event loop exhaustion and `RuntimeError: Event loop is closed` deadlocks during full test suite runs due to conflicts with `pytest-asyncio`'s loop management.

## Objective
Remove all `asyncio` dependencies from `AppController` and `events.py`. Replace the asynchronous event queue with a standard, thread-safe `queue.Queue` from Python's standard library.

## Requirements
1. **Remove Asyncio:** Strip `import asyncio` from `app_controller.py` and `events.py`.
2. **Synchronous Queues:** Convert `events.AsyncEventQueue` to a standard synchronous wrapper around `queue.Queue`.
3. **Daemon Thread Processing:** Convert `AppController._process_event_queue` from an `async def` to a standard synchronous `def` that blocks on `self.event_queue.get()`.
4. **Thread Offloading:** Use `threading.Thread` or `concurrent.futures.ThreadPoolExecutor` to handle AI request dispatching (instead of `self._loop.run_in_executor`).
5. **No Regressions:** The application must remain responsive (60 FPS) and all unit/integration tests must pass cleanly.
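
Requirements 2 and 4 might look roughly like the sketch below. It assumes events are `(name, payload)` pairs, which the specification does not state, and `handle_request` is a hypothetical stand-in for the controller's request handler. `SyncEventQueue` is the name used in the implementation plan; its method signatures here are guesses.

```python
import queue
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Optional, Tuple


class SyncEventQueue:
    """Thin synchronous wrapper around the thread-safe queue.Queue."""

    def __init__(self) -> None:
        self._queue = queue.Queue()

    def put(self, name: str, payload: Any = None) -> None:
        # Assumed event shape: a (name, payload) tuple.
        self._queue.put((name, payload))

    def get(self, timeout: Optional[float] = None) -> Tuple[str, Any]:
        # Blocks until an event is available; with a timeout set,
        # raises queue.Empty if nothing arrives in time.
        return self._queue.get(timeout=timeout)


def handle_request(payload: Any) -> str:
    """Hypothetical handler standing in for AI request dispatching."""
    return f"handled:{payload}"


events = SyncEventQueue()
events.put("user_request", "hello")

# Offload the handler to a worker thread instead of run_in_executor.
with ThreadPoolExecutor(max_workers=2) as pool:
    name, payload = events.get(timeout=1.0)
    result = pool.submit(handle_request, payload).result(timeout=1.0)
```

A pool bounds the number of concurrent request threads, whereas a bare `threading.Thread` per request is simpler but unbounded; the specification allows either.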
@@ -0,0 +1,8 @@
{
  "id": "mock_provider_hardening_20260305",
  "title": "Mock Provider Hardening",
  "description": "Introduce negative testing paths (malformed JSON, timeouts) into the mock AI provider.",
  "status": "planned",
  "created_at": "2026-03-05T00:00:00Z",
  "updated_at": "2026-03-05T00:00:00Z"
}
26
conductor/tracks/mock_provider_hardening_20260305/plan.md
Normal file
@@ -0,0 +1,26 @@
# Implementation Plan: Mock Provider Hardening (mock_provider_hardening_20260305)

## Phase 1: Mock Script Extension
- [ ] Task: Initialize MMA Environment `activate_skill mma-orchestrator`
- [ ] Task: Add `MOCK_MODE` to `mock_gemini_cli.py`
    - [ ] WHERE: `tests/mock_gemini_cli.py`
    - [ ] WHAT: Implement conditional branches based on `MOCK_MODE` environment variable.
    - [ ] HOW: Support `success`, `malformed_json`, `error_result`, and `timeout`.
    - [ ] SAFETY: Ensure it still defaults to `success` to not break existing tests.
- [ ] Task: Conductor - User Manual Verification 'Phase 1: Mock Extension'

## Phase 2: Negative Path Testing
- [ ] Task: Write `test_negative_flows.py`
    - [ ] WHERE: `tests/test_negative_flows.py`
    - [ ] WHAT: Write tests that launch `live_gui`, inject `MOCK_MODE` via `ApiHookClient` custom callback or `env` dictionary, and assert the UI gracefully handles the failure.
    - [ ] HOW: Use `wait_for_event('response')` and check that the payload status is `"error"`.
    - [ ] SAFETY: Ensure `timeout` tests don't actually hang the test suite for 120s (configure the timeout shorter if possible in test setup).
- [ ] Task: Conductor - User Manual Verification 'Phase 2: Negative Tests'

## Phase 3: Final Validation
- [ ] Task: Full Suite Validation
    - [ ] WHERE: Project root
    - [ ] WHAT: `uv run pytest`
    - [ ] HOW: Ensure 100% pass rate.
    - [ ] SAFETY: None.
- [ ] Task: Conductor - User Manual Verification 'Phase 3: Final Validation'
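
The Phase 2 SAFETY concern (a `timeout` test must not stall the suite) can be illustrated without the GUI. The sketch below is an assumption-heavy stand-in: it does not use `live_gui` or `ApiHookClient`, whose interfaces are not shown in this commit. `MOCK_SOURCE` is a hypothetical inline substitute for `tests/mock_gemini_cli.py`, and `run_provider` is a hypothetical helper that injects `MOCK_MODE` through an `env` dictionary and applies a short parent-side timeout.

```python
import json
import os
import subprocess
import sys

# Hypothetical inline stand-in for tests/mock_gemini_cli.py.
MOCK_SOURCE = """
import json, os, time
mode = os.environ.get("MOCK_MODE", "success")
if mode == "timeout":
    time.sleep(30)
print(json.dumps({"status": "success"}))
"""


def run_provider(mode: str, timeout_s: float) -> dict:
    """Run the mock with MOCK_MODE set; map a timeout to an error payload."""
    env = dict(os.environ, MOCK_MODE=mode)
    try:
        proc = subprocess.run(
            [sys.executable, "-c", MOCK_SOURCE],
            env=env,
            capture_output=True,
            text=True,
            timeout=timeout_s,  # parent-side limit, far below the mock's sleep
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing lingers.
        return {"status": "error", "message": "provider timed out"}
    return json.loads(proc.stdout)


timeout_payload = run_provider("timeout", timeout_s=0.5)
success_payload = run_provider("success", timeout_s=30.0)
```

The point of the design is that the test controls the timeout from the parent side, so the mock's sleep duration never determines how long the suite runs.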
14
conductor/tracks/mock_provider_hardening_20260305/spec.md
Normal file
@@ -0,0 +1,14 @@
# Specification: Mock Provider Hardening

## Background
The current `mock_gemini_cli.py` provider only tests the "happy path". It always returns successfully parsed JSON-L responses, which masks potential error-handling bugs in `ai_client.py` and `AppController`. To properly verify the system's robustness, the mock must be capable of failing realistically.

## Objective
Extend `mock_gemini_cli.py` to support negative testing paths, controlled via an environment variable `MOCK_MODE`.

## Requirements
1. **MOCK_MODE parsing:** The mock script must read `os.environ.get("MOCK_MODE", "success")`.
2. **malformed_json:** If mode is `malformed_json`, the mock should print a truncated or syntactically invalid JSON string to `stdout` and exit.
3. **error_result:** If mode is `error_result`, the mock should print a valid JSON string but with `"status": "error"` and an error message payload.
4. **timeout:** If mode is `timeout`, the mock should `time.sleep(120)` to force the parent process to handle a subprocess timeout.
5. **Integration Tests:** New tests must be written to explicitly trigger these modes using `ApiHookClient` and verify that the GUI displays an error state rather than crashing.
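
Requirements 1 through 4 could be sketched as a single dispatch function. This is only an illustration: the real response schema of `mock_gemini_cli.py` is not shown in this commit, so the `content` and `message` fields, along with the `build_response` and `main` names, are hypothetical. The `sleep` parameter is injected so the `timeout` branch can be exercised without waiting.

```python
import json
import os
import time
from typing import Callable


def build_response(mode: str, sleep: Callable[[float], None] = time.sleep) -> str:
    """Return the text the mock would print for a given MOCK_MODE."""
    if mode == "malformed_json":
        # Deliberately truncated, so json.loads() fails in the caller.
        return '{"status": "success", "content": "trunc'
    if mode == "error_result":
        # Valid JSON that reports a failure.
        return json.dumps({"status": "error", "message": "simulated failure"})
    if mode == "timeout":
        # Long enough that the parent's subprocess timeout fires first.
        sleep(120)
    # Default path, also reached by unknown modes and after the sleep.
    return json.dumps({"status": "success", "content": "ok"})


def main() -> None:
    # Defaults to "success" so existing tests keep their behavior.
    print(build_response(os.environ.get("MOCK_MODE", "success")))
```

Keeping the branching in a pure function makes each mode unit-testable on its own, independent of the integration tests in requirement 5.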
@@ -0,0 +1,8 @@
{
  "id": "simulation_fidelity_enhancement_20260305",
  "title": "Simulation Fidelity Enhancement",
  "description": "Add human-like jitter, hesitation, and reading latency to the UserSimAgent.",
  "status": "planned",
  "created_at": "2026-03-05T00:00:00Z",
  "updated_at": "2026-03-05T00:00:00Z"
}
@@ -0,0 +1,26 @@
# Implementation Plan: Simulation Fidelity Enhancement (simulation_fidelity_enhancement_20260305)

## Phase 1: User Agent Modeling
- [ ] Task: Initialize MMA Environment `activate_skill mma-orchestrator`
- [ ] Task: Update `UserSimAgent`
    - [ ] WHERE: `simulation/user_agent.py`
    - [ ] WHAT: Add reading delay calculation (based on word count), typing jitter for input fields, and action hesitation probabilities.
    - [ ] HOW: Use Python's `random` module to introduce variance.
    - [ ] SAFETY: Ensure these delays are configurable so that fast test runs can disable them.
- [ ] Task: Conductor - User Manual Verification 'Phase 1: Agent Modeling'

## Phase 2: Application to Simulations
- [ ] Task: Update Simulator
    - [ ] WHERE: `simulation/workflow_sim.py`
    - [ ] WHAT: Inject the `UserSimAgent` into the standard workflow steps (e.g., waiting before approving a ticket).
    - [ ] HOW: Call the agent's delay methods before executing `ApiHookClient` commands.
    - [ ] SAFETY: None.
- [ ] Task: Conductor - User Manual Verification 'Phase 2: Simulator Integration'

## Phase 3: Final Validation
- [ ] Task: Watch Simulation
    - [ ] WHERE: Terminal
    - [ ] WHAT: Run `python simulation/sim_execution.py` locally and observe the pacing.
    - [ ] HOW: Verify it feels more human.
    - [ ] SAFETY: None.
- [ ] Task: Conductor - User Manual Verification 'Phase 3: Final Validation'
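
The Phase 1 items (a word-count-based reading delay, variance from `random`, and a switch that lets fast test runs disable delays) could be sketched like this. `DelayConfig`, its fields, and the default reading speed are hypothetical; the plan does not specify how `UserSimAgent` exposes its configuration.

```python
import random
from dataclasses import dataclass


@dataclass
class DelayConfig:
    """Hypothetical knobs; set enabled=False for fast test runs."""

    enabled: bool = True
    words_per_minute: float = 240.0  # assumed reading speed
    jitter: float = 0.2              # +/- 20% variance around the base delay


def reading_delay(text: str, cfg: DelayConfig, rng: random.Random) -> float:
    """Seconds a simulated user would spend reading `text`."""
    if not cfg.enabled:
        return 0.0
    base = len(text.split()) / cfg.words_per_minute * 60.0
    return base * rng.uniform(1.0 - cfg.jitter, 1.0 + cfg.jitter)


rng = random.Random(0)  # seeded so simulation runs are reproducible
fast = reading_delay("one two three four", DelayConfig(enabled=False), rng)
slow = reading_delay("one two three four", DelayConfig(), rng)
```

Passing the `random.Random` instance in explicitly, rather than using the module-level functions, keeps the variance reproducible when a seed is fixed.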
@@ -0,0 +1,12 @@
# Specification: Simulation Fidelity Enhancement

## Background
The `simulation/user_agent.py` currently relies on fixed random delays to simulate human typing. As identified in the architecture audit, this provides a low-fidelity simulation of actual user interactions, which may hide UI rendering glitches that only appear when ImGui is forced to render intermediate, hesitating states.

## Objective
Enhance the `UserSimAgent` to behave more like a human, introducing realistic jitter, hesitation, and reading delays.

## Requirements
1. **Variable Reading Latency:** Calculate artificial delays based on the length of the AI's response to simulate the user reading the text before clicking next.
2. **Typing Jitter:** Instead of just injecting text instantly, simulate keystrokes with slight random delays if testing input fields (optional, but good for stress testing the render loop).
3. **Hesitation Vectors:** Introduce a random chance for a longer "hesitation" delay (e.g., 2-5 seconds) before critical actions like "Approve Script".
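
Requirements 2 and 3 might be approached as in the sketch below. The function names, the per-keystroke timing values, and the 25% hesitation probability are all assumptions chosen for illustration; only the 2-5 second hesitation range comes from the specification.

```python
import random
from typing import Iterator, Tuple


def keystroke_delays(
    text: str,
    rng: random.Random,
    mean_s: float = 0.08,    # assumed average gap between keys
    spread_s: float = 0.04,  # assumed +/- variation around the mean
) -> Iterator[Tuple[str, float]]:
    """Yield (character, delay_in_seconds) pairs for simulated typing."""
    for ch in text:
        yield ch, max(0.0, rng.uniform(mean_s - spread_s, mean_s + spread_s))


def hesitation_delay(
    rng: random.Random,
    probability: float = 0.25,  # assumed chance of hesitating
    low_s: float = 2.0,
    high_s: float = 5.0,
) -> float:
    """Return a 2-5 s pause with the given probability, else 0.0."""
    if rng.random() < probability:
        return rng.uniform(low_s, high_s)
    return 0.0


rng = random.Random(42)
typed = list(keystroke_delays("approve", rng))
pause = hesitation_delay(rng, probability=1.0)
```

A caller would sleep for each yielded delay before sending the character, and apply `hesitation_delay` once before a critical action such as "Approve Script".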