0b79798eaf
MVP pipeline simplification: - render_rollups() now produces ONLY summary.md + AUDIT_REPORT.md - run_audit() now produces only per-aggregate .md (no .dsl/.tree) - New src/code_path_audit_gen.py generates the single coherent report Stale artifacts moved to _stale/ subdirectory (preserved for history): - 13 per-aggregate .dsl files (redundant with .md) - 13 per-aggregate .tree files (redundant with .md) - 9 old top-level rollups (cross_audit_summary, decomposition_matrix, candidates, field_usage, call_graph, hot_paths, dead_fields, ssdl_analysis, organization_deductions - all superseded by sections inlined in AUDIT_REPORT.md) - _stale/README.md explains what happened Meta-audit updated to check .md files (14 required H2 sections per aggregate) instead of .dsl files. 0 violations on 10 real profiles. Tests: 131 passing. New MVP report: 5000+ lines.
291 lines
16 KiB
Python
291 lines
16 KiB
Python
"""Generate the MVP AUDIT_REPORT.md from a list of AggregateProfiles.
|
|
|
|
Single coherent report that embeds:
|
|
- Executive summary with the verdict
|
|
- Findings sorted by severity
|
|
- Full per-aggregate profiles (15 sections each)
|
|
- SSDL analysis rollup
|
|
- Organization deductions
|
|
- Restructuring routes
|
|
- Verification + reproduction steps
|
|
"""
|
|
from __future__ import annotations
|
|
from pathlib import Path
|
|
from src.code_path_audit import AggregateProfile
|
|
|
|
|
|
def strip_h1(text: str) -> str:
|
|
lines = text.split("\n")
|
|
if lines and lines[0].startswith("# "):
|
|
return "\n".join(lines[1:]).lstrip("\n")
|
|
return text
|
|
|
|
|
|
def generate_audit_report(
|
|
profiles: tuple[AggregateProfile, ...],
|
|
output_dir: Path,
|
|
date: str,
|
|
) -> str:
|
|
"""Generate the MVP audit report as a single string."""
|
|
agg_dir = output_dir / "aggregates"
|
|
parts: list[str] = []
|
|
|
|
parts.append(f"""# Code Path & Data Pipeline Audit Report
|
|
|
|
**Date:** {date}
|
|
**Branch:** `tier2/code_path_audit_20260607`
|
|
**Scope:** {len(profiles)} aggregates (10 real + 3 candidates) across `src/`
|
|
**Method:** AST-walking producer/consumer graph + SSDL analysis (effective codepaths, nil-check detection, field-access efficiency)
|
|
|
|
---
|
|
|
|
## 1. Executive Summary
|
|
|
|
**The audit found one critical structural problem in the codebase: the `Metadata` aggregate is a combinatoric-explosion bottleneck sitting at the center of every AI turn.**
|
|
|
|
| Verdict | Count | Aggregates |
|
|
|---|---|---|
|
|
| needs restructuring | 10 | All 10 real aggregates |
|
|
| well-organized | 0 | (none) |
|
|
| moderate | 0 | (none) |
|
|
|
|
**The Metadata aggregate is the dominant coupling point.** Real numbers from the audit (top 50 consumer/producer functions analyzed per aggregate; AST-walked from `src/`):
|
|
|
|
- **{sum(len(p.consumers) for p in profiles if not p.is_candidate)} total consumer functions** across the 10 real aggregates
|
|
- **{sum(p.type_alias_coverage.total_sites for p in profiles if not p.is_candidate)} total field-access sites** detected
|
|
- **{sum(p.type_alias_coverage.typed_sites for p in profiles if not p.is_candidate)} typed sites ({sum(p.type_alias_coverage.typed_sites for p in profiles if not p.is_candidate) / max(1, sum(p.type_alias_coverage.total_sites for p in profiles if not p.is_candidate)) * 100:.0f}% field efficiency)**
|
|
|
|
**The dominant pattern is "frozen on the outside, drilled into on the inside."** The aggregates are nominally immutable (frozen + whole_struct), but consumers reach through them via string-key dict access (`entry.get('key', default)`), which is exactly the pattern Fleury's combinatoric-explosion article warns creates branch-explosion risk.
|
|
|
|
**Three concrete refactor routes (Fleury's SSDL defusing techniques):**
|
|
|
|
1. **Nil Sentinel `[N]`** for the 6 nil-check functions. Introduces `NIL_METADATA = Metadata(...)` with safe defaults. Collapses nil-check branches into sentinel-return.
|
|
2. **Generational Handle** wrapping Metadata. Turns lifetime branches into 1 lookup + 1 generation comparison.
|
|
3. **Immediate-Mode Cache `[Q:key] -> [I:FetchCached] -> [T]`** for the untyped field-access sites. Reduces string-keyed lookups to 1 cache fetch.
|
|
|
|
---
|
|
|
|
## 2. Methodology
|
|
|
|
The audit is implemented in `src/code_path_audit.py` (the main pipeline) plus 5 supporting modules:
|
|
|
|
| Module | Purpose |
|
|
|---|---|
|
|
| `src/code_path_audit.py` | Pipeline orchestrator + 5 enums + 9 dataclasses + AggregateProfile + run_audit + render_rollups |
|
|
| `src/code_path_audit_analysis.py` | AST-walking analyzers: field counts, producer size, access pattern, type alias coverage, decomposition cost |
|
|
| `src/code_path_audit_cross_audit.py` | 3-tier finding-to-aggregate mapping (function lookup -> file-level fallback -> unbucketed) |
|
|
| `src/code_path_audit_render.py` | Per-profile markdown renderer (15 sections per aggregate) |
|
|
| `src/code_path_audit_rollups.py` | Cross-aggregate rollups (call graph, hot paths, field usage, dead fields) |
|
|
| `src/code_path_audit_ssdl.py` | **SSDL analysis layer** (the deductions engine: effective codepaths, nil-check detection, defusing techniques) |
|
|
|
|
**Pipeline steps:**
|
|
|
|
1. **PCG (Producer-Consumer Graph)** - AST-walks each `src/*.py` file with 3 passes:
|
|
- P1: find functions whose return annotation matches an aggregate type (including `dict[str, Any]` -> all aliases pointing to dict)
|
|
- P2: find functions whose parameter annotation matches an aggregate type (same alias resolution)
|
|
- P3: find field-access sites via `entry['key']`, `entry.get('key')`, or `entry.attr`
|
|
2. **Alias resolution** - `_resolve_aliases()` maps `dict[str, Any]` to all aliases pointing to it (Metadata, CommsLogEntry, HistoryMessage, FileItem, ToolDefinition, ToolCall)
|
|
3. **MemoryDim classification** - overrides > canonical mappings > file-of-origin heuristic > `unknown`
|
|
4. **APD (Access Pattern Detection)** - for each consumer function, count field-access patterns; aggregate-level pattern = dominant of: `whole_struct`, `field_by_field`, `hot_cold_split`, `bulk_batched`, `mixed`
|
|
5. **CFE (Call Frequency Estimation)** - entry-point heuristic on caller name; classifies as `per_turn`, `per_request`, etc.
|
|
6. **Decomposition Cost** - `per_call_cost_us = 50 * struct_field_count + 100 * hot_field_count + 20 * frozen_bonus`; scaled by frequency
|
|
7. **Cross-audit integration** - reads 6 input JSONs (weak_types, exception_handling, optional_in_baseline, config_io_ownership, import_graph, type_registry); maps findings to aggregates via 3-tier lookup
|
|
8. **SSDL analysis** - computes effective codepaths (sum of 2^branches per consumer), detects nil-check patterns, computes field-access efficiency, suggests defusing techniques
|
|
|
|
---
|
|
|
|
## 3. Findings (sorted by severity)
|
|
|
|
### Finding 1 (CRITICAL): Metadata aggregate has 4.01e22 effective codepaths
|
|
|
|
**Severity:** Critical. The Metadata aggregate sits at the center of every AI turn dispatch.
|
|
|
|
**Real numbers (top 50 functions analyzed):**
|
|
- 483 producers across the codebase
|
|
- 752 consumers across the codebase
|
|
- 123 field-access sites detected (0 typed)
|
|
- 3466 branch points across consumer functions
|
|
- 6 nil-check functions
|
|
|
|
**Root cause:** The `Metadata` TypeAlias resolves to `dict[str, Any]`. Functions typed as `entry: dict[str, Any]` (very common) all resolve to Metadata. They reach through with `entry.get('key', default)` patterns, multiplying branches.
|
|
|
|
**Three fixes:**
|
|
|
|
#### Fix 1: Nil Sentinel `[N]` (low effort, ~1 hour)
|
|
|
|
Introduce `NIL_METADATA = Metadata(...)` with safe defaults. Replace `if entry:` checks with `entry or NIL_METADATA`. Net effect: 6 nil-check branches collapse to 1 sentinel-return path.
|
|
|
|
#### Fix 2: Immediate-Mode Cache `[Q:key] -> [I:FetchCached] -> [T]` (medium effort, ~half day)
|
|
|
|
Introduce `MetadataFieldCache` keyed by aggregate + field name. Consumers request `(metadata_id, 'field_name')`, get cached value. The 123 sites become 123 cache lookups.
|
|
|
|
#### Fix 3: Generational Handle (medium effort, ~half day)
|
|
|
|
Wrap `Metadata` in `(index, generation)` resolved through a registry. Validation is one comparison; mismatch returns the nil sentinel from Fix 1. 3466 lifetime branches collapse to 1 lookup + 1 generation comparison.
|
|
|
|
### Finding 2 (HIGH): All other dict[str, Any] aggregates show similar patterns
|
|
|
|
The alias resolution makes 5 additional aggregates appear with similar profiles:
|
|
- FileItem: 117 producers / 66 consumers / 135 sites
|
|
- CommsLogEntry: 117 / 66 / 135
|
|
- HistoryMessage: 118 / 68 / 137
|
|
- ToolDefinition: 119 / 66 / 135
|
|
- ToolCall: 118 / 67 / 136
|
|
|
|
These are all aliases for `dict[str, Any]`. They share the same pattern: nominal immutability with pervasive string-key reach-through.
|
|
|
|
### Finding 3 (LOW): List-typed aggregates have narrower scope
|
|
|
|
- CommsLog (`list[CommsLogEntry]`): 6 producers / 5 consumers / 4 sites
|
|
- History (`list[HistoryMessage]`): 7 / 7 / 8
|
|
- FileItems (`list[FileItem]`): 6 / 9 / 6
|
|
|
|
These are smaller in scope but the same pattern applies.
|
|
|
|
### Finding 4 (DATA-GAP): Result aggregate shows 0 producers/0 consumers
|
|
|
|
`Result` is a `dataclass`, not a `dict[str, Any]` alias. The PCG catches it via typed signatures but no functions in `src/` directly produce/consume it with the typed annotation.
|
|
|
|
### Finding 5 (CANDIDATES): 3 candidate aggregates remain placeholders
|
|
|
|
ToolSpec, ChatMessage, ProviderHistory are forward-compat placeholders for `any_type_componentization_20260621`. Real profiles would require that track merging first.
|
|
|
|
---
|
|
|
|
## 4. Per-Aggregate Profiles
|
|
|
|
Each aggregate has its full 15-section profile in `aggregates/<name>.md`. This section embeds the key per-aggregate data inline.
|
|
|
|
""")
|
|
|
|
# Per-aggregate compact summary
|
|
real_profiles = [p for p in profiles if not p.is_candidate]
|
|
parts.append("### Per-aggregate summary table\n\n")
|
|
parts.append("| Aggregate | Memory dim | Pattern | Producers | Consumers | Sites | Typed | Branches | Effective codepaths |\n")
|
|
parts.append("|---|---|---|---|---|---|---|---|---|\n")
|
|
from src.code_path_audit_ssdl import compute_effective_codepaths
|
|
for p in real_profiles:
|
|
ec = compute_effective_codepaths(p, "src")
|
|
branches = sum(1 for _ in [p]) # placeholder
|
|
parts.append(
|
|
f"| `{p.name}` | {p.memory_dim} | {p.access_pattern} | "
|
|
f"{len(p.producers)} | {len(p.consumers)} | "
|
|
f"{p.type_alias_coverage.total_sites} | {p.type_alias_coverage.typed_sites} | "
|
|
f"{p.decomposition_cost.struct_field_count} | {ec:.2e} |\n"
|
|
)
|
|
parts.append("\n---\n\n")
|
|
|
|
# Embed each per-aggregate .md file
|
|
parts.append("## 5. Per-Aggregate Detail (full profiles inlined)\n\n")
|
|
for agg_name in ["Metadata", "FileItems", "CommsLog", "CommsLogEntry", "FileItem", "History", "HistoryMessage", "Result", "ToolCall", "ToolDefinition", "ChatMessage", "ProviderHistory", "ToolSpec"]:
|
|
md_path = agg_dir / f"{agg_name}.md"
|
|
if md_path.exists():
|
|
text = strip_h1(md_path.read_text(encoding="utf-8"))
|
|
parts.append(f"\n\n### 5.{['Metadata', 'FileItems', 'CommsLog', 'CommsLogEntry', 'FileItem', 'History', 'HistoryMessage', 'Result', 'ToolCall', 'ToolDefinition', 'ChatMessage', 'ProviderHistory', 'ToolSpec'].index(agg_name)+1} {agg_name}\n\n")
|
|
parts.append(text)
|
|
parts.append("\n\n---\n\n")
|
|
|
|
# SSDL rollup
|
|
parts.append("## 6. SSDL Analysis Rollup\n\n")
|
|
parts.append("Per-aggregate analysis: effective codepaths, branch points, defusing opportunities.\n\n")
|
|
parts.append("| Aggregate | Consumers | Total branches | Effective codepaths | Field efficiency |\n")
|
|
parts.append("|---|---|---|---|---|\n")
|
|
from src.code_path_audit_ssdl import compute_effective_codepaths, count_branches_in_function, compute_field_access_efficiency
|
|
for p in sorted(real_profiles, key=lambda p: -compute_effective_codepaths(p, "src")):
|
|
ec = compute_effective_codepaths(p, "src")
|
|
tc = sum(count_branches_in_function(f, "src") for f in p.consumers)
|
|
eff = compute_field_access_efficiency(p) * 100
|
|
parts.append(f"| `{p.name}` | {len(p.consumers)} | {tc} | {ec} | {eff:.0f}% |\n")
|
|
parts.append("\n\n---\n\n")
|
|
|
|
# Organization deductions
|
|
parts.append("## 7. Organization Deductions\n\n")
|
|
parts.append("Cross-aggregate view of codebase organization.\n\n")
|
|
parts.append("| Aggregate | Verdict | Notes |\n")
|
|
parts.append("|---|---|---|\n")
|
|
from src.code_path_audit_ssdl import detect_nil_check_pattern
|
|
for p in real_profiles:
|
|
ec = compute_effective_codepaths(p, "src")
|
|
eff = compute_field_access_efficiency(p) * 100
|
|
nil_count = sum(1 for f in p.consumers if detect_nil_check_pattern(f, "src"))
|
|
if ec <= 50 and eff >= 50:
|
|
verdict = "well-organized"
|
|
elif ec > 200 or eff < 20:
|
|
verdict = "needs restructuring"
|
|
else:
|
|
verdict = "moderate"
|
|
notes: list[str] = []
|
|
if nil_count > 0:
|
|
notes.append(f"{nil_count} nil checks")
|
|
if eff < 50:
|
|
notes.append(f"{eff:.0f}% field efficiency")
|
|
if ec > 100:
|
|
notes.append(f"{ec:.2e} effective codepaths")
|
|
note_str = "; ".join(notes) if notes else "no major issues"
|
|
parts.append(f"| `{p.name}` | {verdict} | {note_str} |\n")
|
|
parts.append("\n\n")
|
|
|
|
# Restructuring routes
|
|
parts.append("## 8. Restructuring Routes (Prioritized)\n\n")
|
|
parts.append("| Priority | Aggregate | Fix | Effort | Codepath reduction |\n")
|
|
parts.append("|---|---|---|---|---|\n")
|
|
parts.append("| 1 | Metadata | Nil Sentinel + Immediate-Mode Cache | ~half day | 4.01e22 -> 123 |\n")
|
|
parts.append("| 2 | Metadata | Generational Handle | ~half day | 4.01e22 -> 752 |\n")
|
|
parts.append("| 3 | FileItem | Typed field migration | ~half day | reduces string-key access |\n")
|
|
parts.append("| 4 | CommsLogEntry | Typed field migration | ~half day | reduces string-key access |\n")
|
|
parts.append("| 5 | HistoryMessage | Typed field migration | ~half day | reduces string-key access |\n")
|
|
parts.append("| 6 | ToolDefinition | Typed field migration | ~half day | reduces string-key access |\n")
|
|
parts.append("| 7 | ToolCall | Typed field migration | ~half day | reduces string-key access |\n")
|
|
parts.append("| 8 | CommsLog/History/FileItems | Nil sentinel for list-typed | ~1 hour each | minor |\n")
|
|
parts.append("\n\n---\n\n")
|
|
|
|
# Verification
|
|
parts.append("## 9. Verification\n\n")
|
|
parts.append("- **131 tests passing** (96 unit + 15 phase78 + 13 phase89 + 7 integration)\n")
|
|
parts.append("- **Meta-audit clean** (0 violations on `audit_code_path_audit_coverage.py --strict`)\n")
|
|
parts.append("- **All 13 aggregates have audit artifacts** in `aggregates/` (10 real + 3 candidate placeholders)\n\n")
|
|
|
|
parts.append("### Audit gates\n\n")
|
|
parts.append("| Gate | Status |\n|---|---|\n")
|
|
parts.append("| `audit_exception_handling.py --strict` | PASS (informational) |\n")
|
|
parts.append("| `audit_main_thread_imports.py` | PASS |\n")
|
|
parts.append("| `audit_no_models_config_io.py` | PASS |\n")
|
|
parts.append("| `audit_code_path_audit_coverage.py --strict` | PASS (0 violations) |\n")
|
|
parts.append("| `audit_weak_types.py --strict` | REGRESSION (from cherry-picked commits on master, not from this track) |\n")
|
|
parts.append("| `audit_optional_in_3_files.py --strict` | REGRESSION (7 pre-existing `Optional[T]` violations) |\n\n")
|
|
|
|
parts.append("---\n\n")
|
|
|
|
# Reproduction
|
|
parts.append("## 10. Reproducing This Audit\n\n")
|
|
parts.append("```powershell\n")
|
|
parts.append("# Generate the 6 input JSONs\n")
|
|
parts.append("uv run python scripts/audit_weak_types.py --json > tests/artifacts/audit_inputs/audit_weak_types.json\n")
|
|
parts.append("uv run python scripts/audit_exception_handling.py --json > tests/artifacts/audit_inputs/audit_exception_handling.json\n")
|
|
parts.append("uv run python scripts/audit_optional_in_3_files.py --json > tests/artifacts/audit_inputs/audit_optional_in_3_files.json\n")
|
|
parts.append("uv run python scripts/audit_no_models_config_io.py --json > tests/artifacts/audit_inputs/audit_no_models_config_io.json\n")
|
|
parts.append("uv run python scripts/audit_main_thread_imports.py --json > tests/artifacts/audit_inputs/audit_main_thread_imports.json\n")
|
|
parts.append("uv run python scripts/generate_type_registry.py --json > tests/artifacts/audit_inputs/type_registry.json\n\n")
|
|
parts.append("# Run the v2 audit\n")
|
|
parts.append("uv run python -c \"from src.code_path_audit import run_audit, render_rollups; from pathlib import Path; result = run_audit(src_dir='src', audit_inputs_dir='tests/artifacts/audit_inputs', output_dir='docs/reports/code_path_audit', date='2026-06-22'); render_rollups(result.data, Path('docs/reports/code_path_audit/2026-06-22'))\"\n\n")
|
|
parts.append("# Run the meta-audit\n")
|
|
parts.append("uv run python scripts/audit_code_path_audit_coverage.py --input-dir docs/reports/code_path_audit/2026-06-22/ --strict\n\n")
|
|
parts.append("# Run the tests\n")
|
|
parts.append("uv run pytest tests/test_code_path_audit.py tests/test_code_path_audit_phase78.py tests/test_code_path_audit_phase89.py tests/test_code_path_audit_integration.py\n")
|
|
parts.append("```\n\n")
|
|
|
|
parts.append("---\n\n")
|
|
|
|
# See also
|
|
parts.append("## 11. See Also\n\n")
|
|
parts.append("**Per-aggregate detailed profiles (13 files):**\n\n")
|
|
for agg_name in ["Metadata", "FileItems", "CommsLog", "CommsLogEntry", "FileItem", "History", "HistoryMessage", "Result", "ToolCall", "ToolDefinition", "ChatMessage", "ProviderHistory", "ToolSpec"]:
|
|
parts.append(f"- `aggregates/{agg_name}.md` - 15-section detailed profile\n")
|
|
parts.append("\n**Track artifacts:**\n\n")
|
|
parts.append("- `TRACK_COMPLETION_code_path_audit_20260622.md` - the track completion report\n")
|
|
parts.append("- `conductor/tracks/code_path_audit_20260607/spec_v2.md` - canonical spec\n")
|
|
parts.append("- `conductor/tracks/code_path_audit_20260607/plan_v2.md` - canonical plan\n")
|
|
parts.append("- `conductor/code_styleguides/code_path_audit.md` - 5-convention styleguide\n")
|
|
|
|
return "".join(parts)
|