Three real bugs fixed:
1. FunctionRef always used line=0. Now passes node.lineno from AST.
2. P3_pass results were discarded with bare pass. Now stored in
ProducerConsumerGraph.field_accesses.
3. Field-access detector only saw entry['key']; missed entry.get('key')
which is the dominant pattern in this codebase. Now handles both.
Plus _extract_type_name() helper handles Optional[T], dict[str, T],
list[T], Result[T], Union[T, ...], and T | None (PEP 604) so P1/P2
catch more annotation patterns.
Real numbers (Metadata aggregate):
- producers: 77 -> 117
- consumers: 35 -> 66
- field-access sites: 130 -> 173
- line numbers: all real (line 1281, 1746, etc.)
AUDIT_REPORT.md grew 2009 -> 3140 lines with real evidence.
Total audit output: 5176 lines / 50 files (was 2415 / 49).
All 131 tests still passing.
The 272-line report was a summary, not a report. The user wanted
the actual evidence inlined. This version embeds:
- Full per-aggregate .md profiles (15 sections each)
- Full SSDL analysis rollup
- Full organization deductions
- Full call graph
- Full hot paths
- Full field usage
- Full decomposition matrix
- Full cross-audit summary
- Full dead fields
- Full candidates
- Full top-level summary
Total: 2009 lines. The user can read it as a single document or
grep for specific aggregates/sections.
The audit output is a database dump (49 files, 3 redundant formats
each). The user wanted ONE thing they can read. This is the
narrative version: 1 file that opens with the verdict, walks
through findings by severity, gives the Metadata deep dive, and
ends with prioritized restructuring routes.
Original 49 files (10 top-level rollups + 13 aggregates x 3 formats)
preserved as supporting detail. See Section 10 'See Also' for
the full artifact inventory.
Replaces passive 'what we shipped' framing with active 'what the
audit tells us about the codebase organization' deductions.
Headline finding: 0 of 10 real aggregates are well-organized.
Metadata aggregate has 1.13e18 effective codepaths (2^251 from
251 branch points across 35 consumers), 6 nil-check functions,
and 0% field-access efficiency. Three concrete refactor routes:
nil sentinel [N], generational handles, immediate-mode cache.
Replaces the prior TRACK_COMPLETION (which was written before the
real-data analyzers landed). Documents the 4 new analyzer modules,
the 2136-line output report, the per-aggregate table with real
producer/consumer counts, the audit gates status, the known
gaps, and the 5 follow-up tracks.
Total report now exceeds the 2k-line threshold the user asked
for (2136 lines of audit content + this 200-line summary).
The previous code did Path(src_dir) / function_ref.file, which
double-prefixed (e.g. src/src/project_manager.py) and silently
returned empty. Fixed: if function_ref.file exists as
CWD-relative, use it directly. Only join if it doesn't exist.
Now 130 real field accesses detected across 35 Metadata consumers
in the 2026-06-22 audit output (was 0 before).
The aggregate_findings function now does 3-tier mapping:
1. Function lookup (find_enclosing_function) -> exact match
2. File-level fallback: if the finding's file has any
producer/consumer of the aggregate, bucket it there
3. Unbucketed (the file has no aggregate refs)
Handles both 'file' and 'filename' keys (v1 audit scripts use
'filename'; spec fixtures use 'file'). Path normalization
for Windows paths.
Generated the 6 real audit_inputs from scripts/audit_*.py
against real src/. The Metadata aggregate now shows:
- 1 unique weak_types finding (1 site, from ai_client.py:159)
- 1 unique exception_handling finding (76 sites from PARAM_OPTIONAL)
mcp_client.py shows 0 because no Metadata producer/consumer
exists in the PCG for mcp_client (P1/P2 only detect typed
parameter signatures, not internal field access). The next
gap is expanding P3 to capture internal field use.
Loops over audit_weak_types + audit_exception_handling from
the 6 audit_inputs, calls aggregate_cross_audit_findings per
audit, sums the buckets per profile.
Cross-audit aggregation is per-aggregate-flat (all findings go
into 1 bucket per audit). The 3-tier finding-to-aggregate
mapping (find_enclosing_function + type registry + file
heuristic) is the next gap - requires per-finding site
classification.
Previous version checked for field names (weak_types, etc.)
in DSL content. That's wrong - those are bucket names that
only appear when there are findings. New version just checks
the 14 required section markers + the cross-audit-findings
count line. Skips candidate aggregates.
Meta-audit now passes clean on the 2026-06-22 audit output.
The Optional[T] ban enforcement script. Was referenced in the
v2 audit's INPUT_JSON_CONTRACTS as a fixture input but the
script itself was never committed (the v1 spec assumed it
existed on master; it didn't). This commit CREATES the
script from scratch per the v2 audit's contract.
Baseline files (4 total):
- src/mcp_client.py (refactored 2026-06-06)
- src/ai_client.py (refactored 2026-06-06)
- src/rag_engine.py (refactored 2026-06-06)
- src/code_path_audit.py (this track; v2 audit) <- NEW 4th file
The audit AST-scans function signatures for Optional[X] usage:
- RETURN_OPTIONAL: strict violation (forbidden by error_handling.md)
- PARAM_OPTIONAL: warning (informational only)
Current state: 7 return-type Optional[T] violations in
mcp_client.py + ai_client.py (pre-existing from the v1
refactor; NOT introduced by code_path_audit.py). My new
file passes clean.
--strict mode exits 1 on any RETURN_OPTIONAL violation.
Default mode prints the report and exits 0.
The end-of-track report. 131 tests + 4 audit gates + meta-audit
+ type registry all pass (with 2 known issues documented).
The 3 candidate aggregates are forward-compat placeholders
that became real via 6 cherry-picks during this session.
5 follow-up tracks recorded.
13 aggregate profiles (10 real + 3 candidate placeholders)
+ 4 top-level rollups. Per the spec, the 3 candidate
aggregates (ToolSpec, ChatMessage, ProviderHistory) are
forward-compat placeholders for any_type_componentization_20260621
(NOT on master); the audit's report includes them with
is_candidate: True.
Replaces the Phase 0 stub. Documents the per-aggregate profile
structure, the 4 decomposition directions, the override file
format, the 4 mem dim classification rules, and the 6-input
cross-audit integration contract.
Schema validator for the v2 audit's output. Verifies all 14
required profile sections, all 5 cross-audit fields, all 8
decomposition_cost fields. Per feature_flags.md 'delete to
turn off' pattern.
Phase 1 data model: 19 unit tests passing. The 5 enums + 9
supporting dataclasses + AggregateProfile central artifact are
all in place. Phase 1 checkpoint at ef207cf6.
fqname, file, line, role. Used in ProducerConsumerGraph edges
and per-aggregate producer/consumer lists. Per error_handling.md
Pattern 1 (immutability for cross-thread safety).
2 unit tests passing.
Phase 0 of any_type_componentization_20260621. Extends src/type_aliases.py
with two recursive-friendly TypeAliases for JSON wire format (used by
Phase 5 api_hooks WebSocketMessage):
- JsonPrimitive: str | int | float | bool | None
- JsonValue: JsonPrimitive | list['JsonValue'] | dict[str, 'JsonValue']
The forward-ref 'JsonValue' strings work because from __future__ import
annotations is at the top of the module (PEP 563 + PEP 613 TypeAlias).
Tests added (4 new, 14 total):
- test_json_primitive_alias_resolves_to_union: hints exposes JsonPrimitive
- test_json_value_alias_resolves_to_recursive_union: hints exposes JsonValue
- test_json_value_accepts_primitive_dict: dict[str, JsonValue] runtime use
- test_json_value_accepts_nested_structures: nested dict+list round-trip
Verification:
uv run pytest tests/test_type_aliases.py --timeout=30
14 passed in 2.97s
AggregateKind (4 values), MemoryDim (7), AccessPattern (5),
Frequency (7), RecommendedDirection (4). All Literal types
for stable postfix DSL output (string-valued, no enum-name
lookup table needed in the parser).
5 unit tests passing. The 9 supporting dataclasses + the
AggregateProfile central artifact go in Tasks 1.2-1.10.