d20e1c2e78
metadata.json: standard track metadata (15 fields per the live_gui_test_fixes_20260618 precedent; includes scope, depends_on, blocks, out_of_scope, tolerated_at_run_time, test_summary, verification_criteria, 10 risks).

state.toml: initial state (status=active, current_phase=0; 14 phases pending; 19 verification flags all false).

TIER2_STARTUP.md: the per-track readme for the Tier 2 agent. Track-specific supplement to conductor/tier2/agents/tier2-autonomous.md. Covers: what to load (plan_v2.md first, spec_v2.md second; do NOT load v1 spec/plan), hard bans (3-layer), conventions, TDD protocol, per-task commit protocol, pre-delegation checkpoint, failcount contract, 8 known gotchas, verification protocol, end-of-track handoff, out-of-scope restatement.

EXPLICITLY NOTES:
- any_type_componentization_20260621 + phase2_4_5_call_site_completion_20260621 are NOT on master (merged f914b2bc, reverted 751b94d4). The v2 audit is tolerant of their absence.
- The 3 candidate aggregates (ToolSpec, ChatMessage, ProviderHistory) are forward-compat placeholders with is_candidate: True. The integration tests verify the placeholder format (synthesize_aggregate_profile() in Phase 9 Task 9.2 has the template hard-coded).
- The 1-line extension to scripts/audit_optional_in_3_files.py is the audit gate; skipping Phase 12 Task 12.2 leaves the new file uncovered by the Optional[T] ban.

Total v2 artifacts (committed):
- spec_v2.md (460 lines)
- plan_v2.md (5006 lines)
- metadata.json
- state.toml
- TIER2_STARTUP.md
201 lines
12 KiB
JSON
{
  "id": "code_path_audit_20260607",
  "title": "Code Path & Data Pipeline Audit v2",
  "type": "tooling",
  "status": "active",
  "priority": "A",
  "created": "2026-06-07",
  "last_revised": "2026-06-22",
  "owner": "tier2-tech-lead",
  "parent_umbrella": null,
  "spec": "conductor/tracks/code_path_audit_20260607/spec_v2.md",
  "plan": "conductor/tracks/code_path_audit_20260607/plan_v2.md",
  "spec_v1_preserved": "conductor/tracks/code_path_audit_20260607/spec.md (v1, never executed; preserved unchanged)",
  "plan_v1_preserved": "conductor/tracks/code_path_audit_20260607/plan.md (v1, never executed; preserved unchanged)",
  "v2_revision_rationale": "v1 was authored 2026-06-07 before the 4 foundational tracks shipped; v1 framing is now stale. v2 re-scopes the audit from 'expensive operations per action' to 'data pipelines per aggregate' + a decomposition-cost heuristic (componentize vs unify) per aggregate. v2 also cross-validates data_structure_strengthening + data_oriented_error_handling directly (the 2 foundational tracks didn't exist on 2026-06-07).",
  "scope": {
    "files_created": 17,
    "files_created_paths": [
      "src/code_path_audit.py",
      "tests/test_code_path_audit.py",
      "tests/test_code_path_audit_live_gui.py",
      "tests/fixtures/synthetic_src/__init__.py",
      "tests/fixtures/synthetic_src/type_aliases.py",
      "tests/fixtures/synthetic_src/ai_client.py",
      "tests/fixtures/synthetic_src/aggregate.py",
      "tests/fixtures/synthetic_src/gui_2.py",
      "tests/fixtures/synthetic_src/cleanup.py",
      "tests/fixtures/synthetic_src/overrides.toml",
      "tests/fixtures/audit_inputs/audit_weak_types.json",
      "tests/fixtures/audit_inputs/audit_exception_handling.json",
      "tests/fixtures/audit_inputs/audit_optional_in_3_files.json",
      "tests/fixtures/audit_inputs/audit_no_models_config_io.json",
      "tests/fixtures/audit_inputs/audit_main_thread_imports.json",
      "tests/fixtures/audit_inputs/type_registry.json",
      "scripts/audit_code_path_audit_coverage.py",
      "conductor/code_styleguides/code_path_audit.md"
    ],
    "files_modified": 1,
    "files_modified_paths": [
      "scripts/audit_optional_in_3_files.py (+1 line: add src/code_path_audit.py to the baseline list)"
    ],
    "files_preserved_v1": [
      "conductor/tracks/code_path_audit_20260607/spec.md (v1)",
      "conductor/tracks/code_path_audit_20260607/plan.md (v1)"
    ],
    "phases": 14,
    "tasks": 85,
    "tests_total": 91,
    "tests_unit": 84,
    "tests_integration": 7,
    "tests_live_gui_opt_in": 2,
    "aggregates_total": 13,
    "aggregates_real": 10,
    "aggregates_candidate": 3,
    "rollups": 4,
    "follow_up_tracks": 5
  },
  "depends_on": [
    "data_oriented_error_handling_20260606 (SHIPPED; the v2 audit's result_coverage cross-checks this)",
    "data_structure_strengthening_20260606 (SHIPPED; the v2 audit's type_alias_coverage cross-checks this)",
    "mcp_architecture_refactor_20260606 (SHIPPED; provides the 6 input audit scripts' baselines)",
    "qwen_llama_grok_integration_20260606 (SHIPPED; the v2 audit covers the 8 _send_<vendor> functions)",
    "result_migration_20260616 (100% complete as of 2026-06-21; the v2 audit runs against the post-migration src/)"
  ],
  "blocks": [
    "pipeline_runtime_profiling_20260607 (preserved from v1; calibrates v2's heuristic cost constants against real measurements)",
    "data_pipelines_inventory_<date> (per-pipeline vs per-aggregate reports for the top 5 pipelines)",
    "code_path_audit_in_ci_<date> (run v2 in CI on every PR)",
    "code_path_audit_data_oriented_refactor_<date> (implement the 3 high-priority componentize candidates)",
    "code_path_audit_v2_5_followup_<date> (re-run v2 after any_type_componentization_20260621 merges)"
  ],
  "out_of_scope": [
    "No modifications to existing src/*.py files (read-only on the 65 existing files; the v2 audit doesn't change them).",
    "No modifications to the 5 existing audit scripts (consume their JSON; don't change them).",
    "No runtime profiling (deferred to pipeline_runtime_profiling_20260607).",
    "No new pip dependencies (stdlib only: ast, pathlib, json, dataclasses, tomllib, re).",
    "No changes to data_structure_strengthening or data_oriented_error_handling styleguides.",
    "No changes to v1 spec.md or plan.md (v1 preserved unchanged).",
    "No MMA worker spawn action (preserved from v1; user directive 2026-06-07: cold until 1:1 discussion UX is dogfooded).",
    "No new src/<thing>.py files (per AGENTS.md file size + naming convention: helpers and sub-systems go in the parent module).",
    "The 23 lower-impact files (1-9 weak-type sites each; deferred to a follow-up track).",
    "The 3 candidate aggregates' 'real' analysis (deferred to code_path_audit_v2_5_followup_<date>).",
    "The v1-style per-action output is preserved for backward compat but downgraded to cross-references."
  ],
  "tolerated_at_run_time": [
    "any_type_componentization_20260621 is NOT on master (merged f914b2bc, reverted 751b94d4); the v2 audit produces placeholders for the 3 candidate aggregates with is_candidate: True.",
    "phase2_4_5_call_site_completion_20260621 is NOT on master (same merge+revert history).",
    "Missing input JSONs in tests/artifacts/audit_inputs/ are tolerated (the corresponding cross_audit_findings field is empty; the markdown notes the absence).",
    "Malformed input JSONs are tolerated (the read_input_json() returns Result with errors; the v2 audit continues with empty data)."
  ],
  "test_summary": {
    "tests_total": 91,
    "tests_unit": 84,
    "tests_integration": 7,
    "tests_live_gui_opt_in": 2,
    "test_tier_count": 11,
    "test_pass_count_target": "All 91 tests PASS; the 2 live_gui are opt-in (CODE_PATH_AUDIT_LIVE_GUI=1)"
  },
  "verification_criteria": [
    "FR-1: src/code_path_audit.py is created with the 11 public functions + 4 static analyzers (PCG, MemoryDim, APD, CFE) + 4 renderers (to_dsl_v2, to_markdown, to_tree, parse_dsl_v2) + run_audit() main entry + CLI + MCP tool wrapper",
    "FR-2: All 11 public functions return Result[T] per error_handling.md (or return a deterministic T when no runtime failure is possible)",
    "FR-3: The 4 audit gates pass in --strict mode (audit_exception_handling, audit_weak_types, audit_main_thread_imports, audit_no_models_config_io)",
    "FR-4: The meta-audit (scripts/audit_code_path_audit_coverage.py) passes on the real audit output (0 schema violations)",
    "FR-5: The type registry is in sync with src/type_aliases.py (scripts/generate_type_registry.py --check exits 0)",
    "FR-6: 91 tests pass (84 unit + 7 integration; 2 live_gui are opt-in)",
    "FR-7: The audit output (13 per-aggregate .dsl + .md + .tree files + 4 rollups) is committed to docs/reports/code_path_audit/2026-06-22/",
    "FR-8: The TRACK_COMPLETION report is written to docs/reports/TRACK_COMPLETION_code_path_audit_20260622.md",
    "FR-9: conductor/tracks.md is updated with the v2 track entry (the checkpoint SHA from the TRACK_COMPLETION report commit)",
    "FR-10: The 1-line extension to scripts/audit_optional_in_3_files.py is committed; the extended audit passes in --strict mode",
    "FR-11: conductor/code_styleguides/code_path_audit.md is written (the 5-convention styleguide)",
    "Atomic per-task commits with git notes per conductor/workflow.md step 9.1-9.3",
    "No day estimates, no T-shirt sizes in any artifact"
  ],
  "risks": [
    {
      "id": "R1",
      "description": "The decomposition-cost heuristic is inaccurate (componentize_savings overestimate or underestimate)",
      "mitigation": "The runtime-profiling follow-up recalibrates. The override file (scripts/code_path_audit_overrides.toml) lets the user adjust per-aggregate. The summary.md and decomposition_matrix.md headers caveat: 'Savings estimates are heuristic; use as ranking input, not as actual savings.'"
    },
    {
      "id": "R2",
      "description": "The PCG misses dynamic patterns (eval, getattr, decorator-driven dispatch like @imscope)",
      "mitigation": "The override file lists the known passthroughs. The runtime-profiling follow-up catches the unresolved. The v1 spec's 'unresolved_calls' pattern is preserved."
    },
    {
      "id": "R3",
      "description": "The 6 input JSON contracts drift (the existing audit scripts evolve without bumping the v2 audit's contract)",
      "mitigation": "The scripts/audit_code_path_audit_coverage.py meta-audit runs in CI; fails on schema drift. The v2 audit tolerates missing fields (returns empty cross_audit_findings; markdown notes the absence)."
    },
    {
      "id": "R4",
      "description": "The candidate aggregates don't merge (any_type_componentization_20260621 is delayed)",
      "mitigation": "The v2 audit is forward-compatible. The is_candidate: bool flag handles the absence gracefully. The candidates.md rollup explains the placeholder status."
    },
    {
      "id": "R5",
      "description": "The v1 .dsl files don't round-trip (the v2 parser is more strict than v1)",
      "mitigation": "The v2 parser is a superset of v1; the v1 action reports still parse. The test_v2_dsl_backward_compat_v1 test verifies."
    },
    {
      "id": "R6",
      "description": "The synthetic src/ fixture diverges from real src/ (the test expectations don't generalize)",
      "mitigation": "The integration test layer runs against real src/ as well as the synthetic fixture. The 2 are decoupled."
    },
    {
      "id": "R7",
      "description": "The 4 audit gates regress during implementation (Tier 3 worker adds a try/except violation, Optional[T] return, etc.)",
      "mitigation": "Run the 4 audit gates in --strict mode after every commit. If a gate fails, fix before continuing. The audit scripts are the 'laws of physics' for the new file."
    },
    {
      "id": "R8",
      "description": "The 85+ tasks exceed Tier 2's per-task context window (the model runs out of memory mid-track)",
      "mitigation": "Per-task commits are atomic; the failcount state file persists progress. The per-task commit discipline means each commit is a safe rollback point. If a task fails 3 times, escalate to the user (don't keep retrying)."
    },
    {
      "id": "R9",
      "description": "The 91 tests are too long-running for the per-PR CI gate (the user expects <2 min for unit tests)",
      "mitigation": "The unit + integration tests run in <30s. The live_gui tests are opt-in via the CODE_PATH_AUDIT_LIVE_GUI env var. The 2 opt-in tests are not in the default run."
    },
    {
      "id": "R10",
      "description": "The Tier 2 agent uses a git command that is hard-banned (git restore, git checkout, git reset, git push)",
      "mitigation": "The 3-layer hard ban enforcement (OpenCode permission + Windows restricted token + git hooks) catches the violation. The TIER2_STARTUP.md restates the hard bans. If a task requires one, escalate to the user."
    }
  ],
  "out_of_scope": [
    "Modifications to existing src/*.py files (read-only on the 65 existing files)",
    "Modifications to the 5 existing audit scripts (consume their JSON; don't change them)",
    "Runtime profiling (deferred to pipeline_runtime_profiling_20260607)",
    "New pip dependencies (stdlib only)",
    "Changes to data_structure_strengthening or data_oriented_error_handling styleguides",
    "Changes to v1 spec.md or plan.md (v1 preserved)",
    "MMA worker spawn action (cold per user)",
    "New src/<thing>.py files (per AGENTS.md file size + naming convention)",
    "The 23 lower-impact files (deferred)",
    "The 3 candidate aggregates' real analysis (deferred to v2.5 follow-up)"
  ],
  "follow_up_tracks": [
    {
      "id": "pipeline_runtime_profiling_20260607",
      "purpose": "Calibrate v2's heuristic cost constants against real measurements. Uses src/performance_monitor.py."
    },
    {
      "id": "data_pipelines_inventory_<date>",
      "purpose": "Per-pipeline (vs per-aggregate) reports for the top 5 pipelines."
    },
    {
      "id": "code_path_audit_in_ci_<date>",
      "purpose": "Run v2 in CI on every PR; fail on new untyped sites or decomposition-matrix regression."
    },
    {
      "id": "code_path_audit_data_oriented_refactor_<date>",
      "purpose": "Implement the 3 high-priority componentize candidates (FileItems, History, Metadata)."
    },
    {
      "id": "code_path_audit_v2_5_followup_<date>",
      "purpose": "Re-run v2 after any_type_componentization_20260621 merges; the 3 placeholders become real profiles."
    }
  ]
}
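
Several fields in this metadata restate what other fields already carry: `files_created` next to `files_created_paths`, `tests_total` next to `tests_unit` and `tests_integration`, and the numeric `scope.follow_up_tracks` next to the top-level `follow_up_tracks` list. The following is a minimal sketch of how such a file could be sanity-checked with the standard library only. The helper names `find_duplicate_keys` and `check_counts` and the `SAMPLE` document are hypothetical and are not part of the track's tooling. The sketch also flags keys repeated within one object, since Python's `json` module silently keeps the last value in that case.

```python
import json
from collections import Counter


def find_duplicate_keys(text: str) -> list[str]:
    """Return every key that appears more than once within a single JSON object."""
    duplicates: list[str] = []

    def record_pairs(pairs):
        # Called once per decoded object with its (key, value) pairs in order.
        counts = Counter(key for key, _ in pairs)
        duplicates.extend(key for key, n in counts.items() if n > 1)
        return dict(pairs)  # last value wins, same as json's default behaviour

    json.loads(text, object_pairs_hook=record_pairs)
    return duplicates


def check_counts(meta: dict) -> list[str]:
    """Compare declared counts with what the lists and sums imply."""
    scope = meta.get("scope", {})
    problems: list[str] = []

    def expect(label, declared, actual):
        if declared != actual:
            problems.append(f"{label}: declared {declared}, found {actual}")

    expect("files_created", scope.get("files_created"),
           len(scope.get("files_created_paths", [])))
    expect("tests_total", scope.get("tests_total"),
           scope.get("tests_unit", 0) + scope.get("tests_integration", 0))
    expect("aggregates_total", scope.get("aggregates_total"),
           scope.get("aggregates_real", 0) + scope.get("aggregates_candidate", 0))
    expect("follow_up_tracks", scope.get("follow_up_tracks"),
           len(meta.get("follow_up_tracks", [])))
    return problems


# Hypothetical miniature document, not the real metadata.json.
SAMPLE = """
{
  "scope": {"files_created": 2, "files_created_paths": ["a.py", "b.py", "c.py"],
            "tests_total": 3, "tests_unit": 2, "tests_integration": 1,
            "aggregates_total": 2, "aggregates_real": 1, "aggregates_candidate": 1,
            "follow_up_tracks": 1},
  "out_of_scope": ["first list"],
  "out_of_scope": ["second list"],
  "follow_up_tracks": [{"id": "x"}]
}
"""

duplicates = find_duplicate_keys(SAMPLE)
problems = check_counts(json.loads(SAMPLE))
print(duplicates)  # ['out_of_scope']
print(problems)    # ['files_created: declared 2, found 3']
```

Keys are compared per object, so a name that appears once inside `scope` and once at the top level (as `follow_up_tracks` does in the sample) is not reported. A real check would read the file from disk and might return the track's own `Result` type instead of plain lists.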