ed/manual_slop

ed 88e44d1c0e docs(report): add session report (audit + migration plan + tech-rot prevention)

2026-06-16 10:48:15 -04:00

Session Report: Exception Handling Audit + Migration Planning + Tech-Rot Prevention

Date: 2026-06-16
Total commits: 17 (1 pre-existing todo + 16 new)
Tracks shipped: 2 (rag_test_failures_20260615 Tier 1 review; exception_handling_audit_20260616 full execution)
Tracks planned: 1 umbrella (result_migration_20260616, with 5 sub-tracks)
Doc updates: 5 (styleguide + product-guidelines + docs/AGENTS + tracks.md + AGENTS.md)
Process rules added: 1 (HARD BAN on day estimates in track artifacts)

Scope executed

This session executed 4 distinct work-streams:

Tier 1 review of rag_test_failures_20260615: verified the 2-line fix in src/rag_engine.py, validated the docs update in docs/guide_rag.md, and confirmed the test pass count (1288 + 4 + 0, the first fully green baseline since 2026-06-12). Found 1 minor metadata inaccuracy: the metadata listed src/app_controller.py in modified_files, but no production change occurred there; the change was in src/rag_engine.py only.
exception_handling_audit_20260616 track: built a 792-line AST-based static analyzer (scripts/audit_exception_handling.py) that classifies every try/except/finally/raise site in 65 src/ files against a 10-category taxonomy. It identified 268 "bad" sites (211 violations + 25 suspicious + 32 unclear) across 42 files. The 3 fully-refactored files (mcp_client.py, ai_client.py, rag_engine.py) are the convention baseline; the other 62 files are the migration target. The track also closed 5 doc gaps the audit revealed.
5-track migration plan — estimated that 5 sub-tracks are needed to eliminate all 268 "bad" sites, organized under a result_migration_20260616 umbrella with the consistent result_migration_* prefix. Each sub-track sized by scope + T-shirt size (not day estimates, per the new Tier 1 rule added this session).
Tech-rot prevention — added 4 enforcement mechanisms (styleguide checklist + product-guidelines obligations + docs/AGENTS.md enforcement section + audit script --ci flag) so future AI agents writing new code don't revert to idiomatic Python patterns.

What was built

Static analyzer: `scripts/audit_exception_handling.py` (792 lines)

AST-based, not regex. 10-category classification:

6 compliant: BOUNDARY_SDK, BOUNDARY_IO, BOUNDARY_CONVERSION, BOUNDARY_FASTAPI, INTERNAL_PROGRAMMER_RAISE, INTERNAL_COMPLIANT
3 violation: INTERNAL_SILENT_SWALLOW, INTERNAL_BROAD_CATCH, INTERNAL_OPTIONAL_RETURN
1 suspicious: INTERNAL_RETHROW
1 unclear: UNCLEAR
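
To illustrate what "AST-based, not regex" means in practice, here is a minimal sketch of classifying `except` clauses with Python's `ast` module. The rules below are assumptions made for illustration; the real heuristics in scripts/audit_exception_handling.py are not reproduced in this report, and the function names `classify_handler` and `audit` are hypothetical.

```python
import ast

# Sample input for the sketch; not taken from the project's sources.
SOURCE = '''
def load(path):
    try:
        return open(path).read()
    except Exception:
        pass

def parse(text):
    try:
        return int(text)
    except ValueError:
        raise
'''


def _is_noop(stmt: ast.stmt) -> bool:
    """True for `pass` or a bare `...` statement."""
    if isinstance(stmt, ast.Pass):
        return True
    return (
        isinstance(stmt, ast.Expr)
        and isinstance(stmt.value, ast.Constant)
        and stmt.value.value is Ellipsis
    )


def classify_handler(handler: ast.ExceptHandler) -> str:
    """Return a coarse category for one except clause (illustrative rules only)."""
    body = handler.body
    # A body made only of no-ops discards the error entirely.
    if all(_is_noop(stmt) for stmt in body):
        return "INTERNAL_SILENT_SWALLOW"
    # Bare `except:` or `except Exception:` catches more than intended.
    if handler.type is None or (
        isinstance(handler.type, ast.Name)
        and handler.type.id in {"Exception", "BaseException"}
    ):
        return "INTERNAL_BROAD_CATCH"
    # A lone bare `raise` re-raises without adding anything.
    if len(body) == 1 and isinstance(body[0], ast.Raise) and body[0].exc is None:
        return "INTERNAL_RETHROW"
    return "UNCLEAR"


def audit(source: str) -> list[tuple[int, str]]:
    """Return (line number, category) for every except clause, in source order."""
    tree = ast.parse(source)
    sites = [
        (node.lineno, classify_handler(node))
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler)
    ]
    return sorted(sites)


if __name__ == "__main__":
    for lineno, category in audit(SOURCE):
        print(lineno, category)
```

Working on the syntax tree lets a rule ask structural questions (is the handler body empty, is the exception type a bare name) that a regex cannot answer reliably across formatting variations.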

Output modes: default human-readable, --json, --summary (per-file table), --by-size (migration-effort buckets), and --strict/--ci (CI gate). Additional flags: --include-tests, --include-baseline, --exclude.

The audit report: `docs/reports/EXCEPTION_HANDLING_AUDIT_20260616.md` (370 lines)

9 sections. The headline: 348 total sites / 80 compliant (23%) / 25 suspicious (7%) / 211 violations (61%) / 32 unclear (9%). Baseline (3 refactored files) has 112 sites / 77 violations. Migration target (62 other files) has 236 sites / 134 violations.
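
The headline figures can be cross-checked with a few lines of arithmetic. This sketch uses only the counts quoted in this report; the variable names are illustrative.

```python
# Site counts as quoted in the audit headline.
counts = {"compliant": 80, "suspicious": 25, "violations": 211, "unclear": 32}

total = sum(counts.values())            # 348
bad = total - counts["compliant"]       # 268 "bad" sites
shares = {name: round(100 * n / total) for name, n in counts.items()}
# shares == {"compliant": 23, "suspicious": 7, "violations": 61, "unclear": 9}

# Baseline + migration target should add up to the same totals.
baseline_sites, baseline_violations = 112, 77
target_sites, target_violations = 236, 134
sites_match = baseline_sites + target_sites == total
violations_match = baseline_violations + target_violations == counts["violations"]

if __name__ == "__main__":
    print(total, bad, shares, sites_match, violations_match)
```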

5 doc updates (the tech-rot prevention)

| File | What was added |
|---|---|
| `conductor/code_styleguides/error_handling.md` | "AI Agent Checklist": 5 MUST-DO + 7 MUST-NOT-DO + 3 boundary patterns + pre-commit gate |
| `conductor/product-guidelines.md` | "AI Agent Obligations": 4 enforcement mechanisms + 4 audit scripts table + pre-commit workflow |
| `docs/AGENTS.md` | "Convention Enforcement" section AT THE TOP of the file, so it is the first thing AIs see |
| `conductor/tracks.md` | Registered `result_migration_20260616` umbrella (row 6d) + detail section |
| `scripts/audit_exception_handling.py` | Added `--ci` alias for `--strict`; updated docstring to explain CI-gate mode |
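
A flag alias such as `--ci` for `--strict` can be declared in `argparse` by passing both option strings to a single `add_argument` call. The sketch below assumes the script uses `argparse` and that strict mode exits non-zero when violations remain; neither detail is confirmed by this report, and `build_parser` and `exit_code` are hypothetical names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser where --ci is an alias for --strict (illustrative)."""
    parser = argparse.ArgumentParser(description="Exception-handling audit (sketch)")
    # Both option strings write to the same destination, so either spelling works.
    parser.add_argument(
        "--strict",
        "--ci",
        dest="strict",
        action="store_true",
        help="exit non-zero when any violation is found (CI gate)",
    )
    return parser


def exit_code(violations: int, strict: bool) -> int:
    """Assumed gate behaviour: fail only in strict mode with violations present."""
    return 1 if strict and violations > 0 else 0


if __name__ == "__main__":
    args = build_parser().parse_args()
    # 211 is the violation count quoted in this report, used here as sample input.
    raise SystemExit(exit_code(violations=211, strict=args.strict))
```

Under these assumptions the gate only becomes useful once the violation count reaches zero; until then a strict run would fail on every commit.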

The 5-track migration plan (`result_migration_20260616` umbrella)

Consistent result_migration_* prefix for all 5 sub-tracks:

| # | Sub-track | T-shirt | Scope |
|---|---|---|---|
| 1 | `result_migration_review_pass` | S | 57 sites (32 UNCLEAR + 25 INTERNAL_RETHROW) across 15 files |
| 2 | `result_migration_small_files` | L | 37 files (35 SMALL + 2 MEDIUM); 72 V+S sites |
| 3 | `result_migration_app_controller` | XL | 56 sites in 1 file (166KB) |
| 4 | `result_migration_gui_2` | XL | 54 sites in 1 file (260KB) |
| 5 | `result_migration_baseline_cleanup` | L | 112 sites in 3 refactored files |

Total: 5 sub-tracks, 268 sites across 42 files, ~2100 lines changed.

Sequence: 1 (review) → 2 (small files) → 3 (app_controller) → 4 (gui_2) → 5 (baseline cleanup). Tracks 2 + 5 can run in parallel; tracks 3 + 4 must be sequential (the GUI calls controller methods).

Process rule: HARD BAN on day estimates

Codified in AGENTS.md (Critical Anti-Patterns, HARD BAN entry) and conductor/workflow.md (new "Tier 1 Track Initialization Rules" section, 113 lines).

Why this matters: day estimates are inaccurate noise. Tier 2 capacity is bounded by attention, not time. The user called this out explicitly: "Day estimates are inaccurate. Tier-2s can only do so much in a single track and there is no way in hell its going to be 'DAYS'."

The rule: measure effort by scope (N files, M sites, N tasks) and T-shirt size (S/M/L/XL). The user / Tier 2 agent decides the actual pacing.

Cleanup applied retroactively: stripped day estimates from the 2 previously-shipped tracks (rag_test_failures_20260615 and exception_handling_audit_20260616).

Critical findings (the audit's most important discoveries)

test_rag_visual_sim.py::test_rag_full_lifecycle_sim was already passing at track execution time, contrary to the spec's claim. The parent track's incidental fixes had already resolved it.
src/app_controller.py has 13 FastAPI boundary sites that are LEGITIMATE (per the new "Boundary Types" section in the styleguide), not migration-target. The 22 remaining sites ARE migration-target.
The convention is partially applied even in the 3 refactored files: 77 violations remain in mcp_client.py (44), ai_client.py (27), rag_engine.py (6). These are the parent's "Path C deferred work" + the SDK-exception-classification helpers in ai_client.py + the non-*_result methods in rag_engine.py. Sub-track 5 (baseline_cleanup) closes these.
The 268-site inventory is the canonical migration target. Per-file breakdown (top 5):
- src/gui_2.py: 54 sites (37 V + 2 S + 13 ?)
- src/app_controller.py: 56 sites (35 V + 3 S + 2 ? + 16 C; 13 FastAPI boundary)
- src/session_logger.py: 8 sites (8 V)
- src/warmup.py: 7 sites (6 V + 1 S)
- src/mcp_client.py: 53 sites (44 V; BASELINE)
The audit's heuristics had bugs that the Tier 1 review caught: raise HTTPException(...) was misclassified as INTERNAL_RETHROW because ast.unparse(node.exc) returns the full call expression, not just the class name. Fixed in the audit script.
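
The `ast.unparse` pitfall described in the last finding can be reproduced in a few lines. This is a sketch of the failure mode and one possible fix, not the code actually committed to the audit script; the function names are hypothetical.

```python
import ast
from typing import Optional


def raised_name_buggy(node: ast.Raise) -> Optional[str]:
    """Unparse the whole raised expression, including any call arguments."""
    if node.exc is None:  # bare `raise`
        return None
    return ast.unparse(node.exc)


def raised_name_fixed(node: ast.Raise) -> Optional[str]:
    """Unparse only the callee, so `raise Foo(...)` yields `Foo`."""
    exc = node.exc
    if exc is None:  # bare `raise`
        return None
    if isinstance(exc, ast.Call):
        exc = exc.func  # drop the argument list, keep the class reference
    return ast.unparse(exc)


raise_node = ast.parse(
    "raise HTTPException(status_code=404, detail='missing')"
).body[0]

if __name__ == "__main__":
    print(raised_name_buggy(raise_node))  # full call text, arguments included
    print(raised_name_fixed(raise_node))  # HTTPException
```

Any comparison of the buggy result against a set of known class names fails, because the string carries the argument list; unwrapping `ast.Call` first restores the match.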

State

Branch: master (16 new commits, all atomic, all with git notes)
Test pass count: 1288 + 4 + 0 (unchanged from rag_test_failures_20260615; this session was informational + planning + docs)
Convention status: 3 of 65 src/ files have been refactored to the convention (the baseline, which still carries 77 violations); 62 are migration-target. After all 5 result_migration_* sub-tracks ship, the convention will be applied to all 65 files.
Pre-existing modified files (NOT touched this session): config.toml, manualslop_layout.ini, project_history.toml — same 3 files mentioned in the rag_test_failures_20260615 completion report as out of scope.

Followup recommendations (for the next session / Tier 2)

Start sub-track 1 (result_migration_review_pass): a small (S) informational sub-track that reviews the 32 UNCLEAR + 25 INTERNAL_RETHROW sites, updates the audit's heuristics, and produces a per-site decision table. T-shirt size S, no day estimate. No production code change. This is the natural first sub-track to execute.
Then sub-tracks 2-5 in sequence (small files → app_controller → gui_2 → baseline cleanup). Each is a refactor with tests; all have the convention's 4 enforcement mechanisms to prevent new violations.
After sub-track 5 ships: wire audit_exception_handling.py --strict (or --ci) into pre-commit hooks + CI. At that point the project should have 0 violations, so the script exits with 0 and --strict mode becomes a meaningful CI gate.
Then the user's stated manual refactor: send_result → send mass rename. Mechanical find-replace; no behavior change.
Then data_structure_strengthening_20260606 (the TypeAlias / NamedTuple track, parallel to result_migration; uses the cleaner Result API from this phase).
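
For orientation, the kind of change the result_migration_* sub-tracks aim at can be sketched as replacing an optional return with an explicit result value. The `Result` class and both functions below are hypothetical; the project's actual Result API is not shown in this report and may differ.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Result(Generic[T]):
    """Hypothetical result container: either a value or an error message."""

    value: Optional[T] = None
    error: Optional[str] = None

    @property
    def ok(self) -> bool:
        return self.error is None


def parse_port_optional(text: str) -> Optional[int]:
    """Optional-return style: the failure reason is lost."""
    try:
        return int(text)
    except ValueError:
        return None


def parse_port_result(text: str) -> Result[int]:
    """Result style: the narrow except converts the error into data."""
    try:
        return Result(value=int(text))
    except ValueError as exc:
        return Result(error=f"invalid port {text!r}: {exc}")


if __name__ == "__main__":
    print(parse_port_optional("http"))
    print(parse_port_result("http").error)
    print(parse_port_result("8080").value)
```

In this sketch the caller of `parse_port_result` can branch on `ok` and report `error`, whereas the optional-return version forces every caller to guess why `None` came back.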

Closing note

The session started with a Tier 1 review (verify someone else's work). It grew into: a new track (audit + 5 doc gaps), an umbrella track for the migration phase (5 sub-tracks), a process rule (no day estimates), and 5 doc updates to prevent tech rot. 17 commits, 4 lifecycle stages, 0 test regressions. The project is now at a fully green baseline (1288 + 4 + 0) and the convention has 4 enforcement mechanisms to keep it that way.

The next Tier 1 session should start with sub-track 1 (result_migration_review_pass); everything else is in place.
