
2026-03-07 20:02:06 -05:00


Track Specification: Test Integrity Audit & Intent Documentation (test_integrity_audit_20260307)

Overview

Audit and repair tests that AI agents have "simplified" or "dumbed down", restoring their original verification intent and recording it in explicit documentation comments. This track addresses the growing problem of AI agents "completing" tasks by weakening test assertions rather than implementing the required functionality.

Problem Statement

Recent AI agent implementations have exhibited a pattern of "simplifying" tests to make them pass rather than implementing the actual functionality. This includes:

Removing assertions that verify core behavior
Adding unconditional pytest.skip() instead of fixing broken functionality
Mocking internal components that should be tested
Reducing test scope to avoid detecting regressions
Removing edge case testing

The anti-patterns added to agent configs are a preventative measure, but existing tests have already been compromised.

Current State Audit (as of commit `328063f`)

Tests Modified Today (2026-03-07)

Based on `git diff HEAD~30..HEAD -- tests/`:

test_conductor_engine_v2.py - 4 lines changed
test_gui2_performance.py - 4 lines changed (added skip for zero FPS)
test_gui_phase3.py - 22 lines changed (collapsed structure)
test_gui_updates.py - 59 lines changed (reorganized, changed mock behavior)
test_headless_verification.py - 4 lines changed
test_log_registry.py - 4 lines changed
test_mma_approval_indicators.py - 7 lines added (new test)
test_mma_dashboard_streams.py - 7 lines added (new test)
test_per_ticket_model.py - 22 lines added (new test)
test_performance_monitor.py - 1 line changed
test_pipeline_pause.py - 24 lines added (new test)
test_symbol_parsing.py - 4 lines changed
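
The per-file counts above can be regenerated rather than transcribed by hand. The following is a minimal sketch, assuming it runs from the repository root with `git` on the PATH. The helper names `parse_numstat`, `changed_tests`, and `deletion_heavy` are illustrative and not part of the project.

```python
import subprocess


def parse_numstat(output: str) -> dict[str, tuple[int, int]]:
    """Map each path in `git diff --numstat` output to (added, deleted)."""
    stats = {}
    for line in output.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        added, deleted, path = parts
        if added == "-" or deleted == "-":
            continue  # binary files report "-" instead of line counts
        stats[path] = (int(added), int(deleted))
    return stats


def changed_tests(rev_range: str = "HEAD~30..HEAD") -> dict[str, tuple[int, int]]:
    """Run git in the current repository and summarize changes under tests/."""
    result = subprocess.run(
        ["git", "diff", "--numstat", rev_range, "--", "tests/"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_numstat(result.stdout)


def deletion_heavy(stats: dict[str, tuple[int, int]]) -> list[str]:
    """Files that lost more lines than they gained: candidates for manual review."""
    return sorted(path for path, (added, deleted) in stats.items() if deleted > added)
```

Calling `deletion_heavy(changed_tests())` would list the files that shrank over the range. A net loss of lines is only a hint that assertions may have been removed, so each flagged file still needs a manual read.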

Anti-Patterns Already Added (Not Being Followed)

Added to tier1-orchestrator.md:
- "DO NOT SKIP A TEST IN PYTEST JUST BECAUSE IT'S BROKEN AND HAS NO TRIVIAL SOLUTION OR FIX."
- "DO NOT SIMPLIFY A TEST JUST BECAUSE IT HAS NO TRIVIAL SOLUTION TO FIX."
- "DO NOT CREATE MOCK PATCHES TO PSEUDO API CALLS OR HOOKS BECAUSE THE APP SOURCE WAS CHANGED. ADAPT TESTS PROPERLY."

Tests at High Risk of Simplification

Test files with recent structural changes - tests that were reorganized
Test files that went from failing to passing - tests that may have been "fixed" by weakening assertions
Test files with new skip conditions - tests that skip instead of verify

Extended Scope: Older Tests (Priority: HIGH)

These tests deal with simulating user actions and major features - critical for regression detection:

Simulation Tests (test_sim_*.py) - User Action Simulation

tests/test_sim_base.py - Base simulation infrastructure
tests/test_sim_context.py - Context simulation for AI interactions
tests/test_sim_tools.py - Tool execution simulation
tests/test_sim_execution.py - Execution flow simulation
tests/test_sim_ai_settings.py - AI settings simulation
tests/test_sim_ai_client.py - AI client simulation

Live Workflow Tests - End-to-End User Flows

tests/test_live_workflow.py - Full workflow simulation
tests/test_live_gui_integration_v2.py - Live GUI integration

Major Feature Tests - Core Application Features

tests/test_dag_engine.py - DAG execution engine
tests/test_conductor_engine_v2.py - Conductor orchestration
tests/test_mma_orchestration_gui.py - MMA GUI orchestration
tests/test_visual_orchestration.py - Visual orchestration
tests/test_visual_mma.py - Visual MMA

GUI Feature Tests

tests/test_gui2_layout.py - GUI layout
tests/test_gui2_events.py - GUI events
tests/test_gui2_mcp.py - MCP integration
tests/test_gui_symbol_navigation.py - Symbol navigation
tests/test_gui_progress.py - Progress tracking

API Integration Tests

tests/test_ai_client_concurrency.py - AI client concurrency
tests/test_ai_client_cli.py - AI client CLI
tests/test_gemini_cli_integration.py - Gemini CLI integration
tests/test_headless_service.py - Headless service

Goals

Audit all test files modified in the past 4 weeks (since ~Feb 7, 2026) for simplification patterns
Identify tests that have lost their verification intent
Restore proper assertions and edge case testing
Document test intent through explicit docstring comments that cannot be ignored
Add "ANTI-SIMPLIFICATION" comments that explain WHY each assertion matters
Prevent future simplification by creating a pattern that documents test purpose

Functional Requirements

FR1: Pattern Detection

Detect unconditional pytest.skip() without documented reason
Detect tests that mock internal components that should be tested
Detect removed assertions (compare test assertion count over time)
Detect tests that only test happy path without edge cases
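
Two of the detections above (unconditional skips and dropped assertions) can be approximated with the standard-library `ast` module. This is a rough sketch under stated assumptions: test functions are named `test_*`, skips are written literally as `pytest.skip(...)`, and only bare `assert` statements are counted. The function names and the `SAMPLE` source are hypothetical.

```python
import ast

# Hypothetical test source, used only to demonstrate the audit.
SAMPLE = '''
import pytest

def test_fps_reported():
    pytest.skip("zero FPS")
    assert False

def test_guarded(fps):
    if fps == 0:
        pytest.skip("no frames rendered")
    assert fps > 0
    assert fps < 1000
'''


def is_pytest_skip(node: ast.AST) -> bool:
    """True for an expression statement of the form pytest.skip(...)."""
    return (
        isinstance(node, ast.Expr)
        and isinstance(node.value, ast.Call)
        and isinstance(node.value.func, ast.Attribute)
        and node.value.func.attr == "skip"
        and isinstance(node.value.func.value, ast.Name)
        and node.value.func.value.id == "pytest"
    )


def audit_source(source: str) -> dict[str, dict]:
    """Report, per test function, its assert count and whether it always skips."""
    report = {}
    for node in ast.walk(ast.parse(source)):
        is_func = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if is_func and node.name.startswith("test_"):
            asserts = sum(isinstance(n, ast.Assert) for n in ast.walk(node))
            # A skip at the top level of the body (not inside if/try/with) always runs.
            unconditional = any(is_pytest_skip(stmt) for stmt in node.body)
            report[node.name] = {"asserts": asserts, "unconditional_skip": unconditional}
    return report


def lost_assertions(old_source: str, new_source: str) -> dict[str, tuple[int, int]]:
    """Tests present in both versions whose assert count dropped: (old, new)."""
    old, new = audit_source(old_source), audit_source(new_source)
    return {
        name: (old[name]["asserts"], new[name]["asserts"])
        for name in old
        if name in new and new[name]["asserts"] < old[name]["asserts"]
    }
```

The assert count is a heuristic: it does not see checks made through `pytest.raises` or mock methods such as `assert_called_once_with`, so a drop in the count flags a test for review rather than proving it was weakened.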

FR2: Test Intent Documentation

Add docstring to every test function explaining its verification purpose
Add inline comments explaining WHY each critical assertion exists
Add "ANTI-SIMPLIFICATION" markers on critical assertions
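
As an illustration of the documentation style described above, a test might look like the sketch below. The function under test, `average_fps`, is hypothetical and exists only to make the example self-contained; the exact wording of the markers is an assumption, not a project convention.

```python
def average_fps(frame_times_ms: list[float]) -> float:
    """Hypothetical function under test: mean frames per second."""
    if not frame_times_ms:
        return 0.0
    return 1000.0 * len(frame_times_ms) / sum(frame_times_ms)


def test_average_fps_reports_real_frame_rate():
    """
    Verifies that average_fps converts frame times into a frame rate.

    ANTI-SIMPLIFICATION: this test must check the computed value, not merely
    that the call returns without raising. Do not replace the assertions
    below with a skip or a type check.
    """
    fps = average_fps([10.0, 10.0, 20.0, 10.0])

    # ANTI-SIMPLIFICATION: 4 frames in 50 ms is 80 FPS. Checking the exact
    # value catches unit mistakes (seconds vs. milliseconds) that a weaker
    # "fps > 0" check would let through.
    assert fps == 80.0

    # ANTI-SIMPLIFICATION: empty input is the edge case that would otherwise
    # divide by zero. Removing this assertion hides that regression.
    assert average_fps([]) == 0.0
```

The docstring states what the test verifies, and each marker explains what would go undetected if the assertion were removed, so a later edit that weakens the test has to contradict a written statement of intent.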

FR3: Test Restoration

Restore any assertions that were improperly removed
Replace inappropriate skips with proper assertions or known-failure markers (for example `pytest.mark.xfail` with a documented reason)
Add missing edge case tests

Architecture Reference

Testing Framework: pytest with fixtures in tests/conftest.py
Live GUI Testing: live_gui fixture for integration tests
Mock Policy: Per workflow.md - mocks allowed for external dependencies, NOT for internal components under test
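
The mock policy can be illustrated with a short sketch using the standard-library `unittest.mock`. The `TicketRouter` class and its `client.complete` method are hypothetical stand-ins for an internal component and an external dependency; they are not taken from the project.

```python
from unittest import mock


class TicketRouter:
    """Hypothetical internal component under test."""

    def __init__(self, client):
        self.client = client  # external dependency, e.g. a remote AI API

    def route(self, ticket: str) -> str:
        reply = self.client.complete(f"Classify: {ticket}")
        return "urgent" if "urgent" in reply.lower() else "normal"


def test_route_marks_urgent_tickets():
    """Verifies that routing reacts to the content of the client's reply."""
    # Allowed: the external client is replaced with a mock.
    client = mock.Mock()
    client.complete.return_value = "URGENT: outage"
    router = TicketRouter(client)

    # The internal routing logic runs for real against the mocked reply.
    assert router.route("db down") == "urgent"
    client.complete.assert_called_once_with("Classify: db down")

    # Not allowed under the policy: patching TicketRouter.route itself,
    # which would make the assertion above pass without testing anything.
```

Only the boundary to the outside world is mocked here. The component named in the test still executes, so a regression in its logic would fail the test.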

Out of Scope

Fixing broken application code (only fixing tests)
Adding new test coverage beyond what was removed (this track audits and restores only)
Modifying test infrastructure (fixtures, conftest.py)
