The 7 code_path_audit*.py files (2604 lines total) are pure static analysis tools. They do AST traversal of src/, no intrusive profiling, no runtime markers. They were inlaid with src/ but only import:
- src.result_types (the Result[T] convention type)
- each other (the 6 siblings)

After the move:
- src/ is now pure application code; line-count audit metrics are clean
- scripts/code_path_audit/ is a new namespace-isolated subdir per AGENTS.md 'scripts are namespace-isolated by directory' rule

TIER-3 READ AGENTS.md + conductor/workflow.md + conductor/edit_workflow.md + conductor/code_styleguides/code_path_audit.md + the 7 files before this commit.

Changes:
- 7 files moved: src/code_path_audit*.py -> scripts/code_path_audit/
- 7 files updated: internal imports from src.code_path_audit_X -> from code_path_audit_X (siblings in same subdir)
- 7 files updated: add sys.path.insert(0, str(Path(__file__).resolve().parents[2] / 'src')) to find src.result_types when run standalone
- 5 test files updated: from src.code_path_audit -> from code_path_audit + sys.path setup to find the new subdir
- 6 throwaway scripts in scripts/tier2/artifacts/ updated: import path + sys.path setup (parents[3] / 'src' + parents[3] / 'scripts' / 'code_path_audit')
- 2 styleguide/spec references updated: conductor/code_styleguides/code_path_audit.md + conductor/tracks/code_path_audit_20260607/spec_v2.md
- 1 meta-audit docstring updated: scripts/audit_code_path_audit_coverage.py
- 1 type registry entry deleted: docs/type_registry/src_code_path_audit.md (the type is no longer in src/)
- 1 type registry index updated: docs/type_registry/index.md (22 files, was 23)

Verification:
- 7/7 audit gates pass --strict (weak_types 102<=112, type_registry 22 files, main_thread_imports OK, no_models_config_io OK, code_path_audit_coverage 0 violations, exception_handling 0 violations, optional_in_3_files 0 violations)
- 6/6 test files pass: test_code_path_audit, test_code_path_audit_integration, test_code_path_audit_phase78, test_code_path_audit_phase89, test_code_path_audit_ssdl_behavioral, test_metadata_nil_sentinel
- src/ line count: 29997 lines (down from 32621 = -2624 lines)
- scripts/code_path_audit/ line count: 2620 lines
Code Path & Data Pipeline Audit Styleguide
Status: Active convention as of 2026-06-22. Established by the
code_path_audit_20260607v2 track.
This styleguide codifies the contract for scripts/code_path_audit/code_path_audit.py v2 and the 6 input audit scripts it consumes. Companion to data_oriented_design.md, error_handling.md, type_aliases.md, and agent_memory_dimensions.md.
The 5 Conventions
1. Per-aggregate profile structure
Every AggregateProfile (the central artifact) has 15 fields (14 required + 1 default): name, aggregate_kind, memory_dim, producers, consumers, access_pattern, access_pattern_evidence, frequency, frequency_evidence, result_coverage, type_alias_coverage, cross_audit_findings, decomposition_cost, optimization_candidates, is_candidate (plus mermaid and markdown with defaults). The is_candidate: bool flag distinguishes the 3 placeholder aggregates (ToolSpec, ChatMessage, ProviderHistory) from the 10 real aggregates.
The custom postfix .dsl output is the canonical artifact: each section is a self-contained tagged record (flat, streamable, tag-scannable). The 14 new v2 DSL words: kind, mem-dim, fn-ref, access-pattern, ap-evidence, frequency, freq-evidence, result-coverage, type-alias-coverage, cross-audit-finding, cross-audit-findings, decomp-cost, opt-candidate, is-candidate. Arity table in scripts/code_path_audit/code_path_audit.py:DSL_WORD_ARITY_V2.
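As a rough illustration, the field list in convention 1 could map onto a frozen dataclass like the one below. The field names are taken from the list above; the class name, every type annotation, the default values, and the sample instance are assumptions made for this sketch and are not the definitions in scripts/code_path_audit/code_path_audit.py.

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass(frozen=True)
class AggregateProfileSketch:
    # The 14 required fields, named as in the styleguide. All types are assumed.
    name: str
    aggregate_kind: str
    memory_dim: str
    producers: tuple[str, ...]
    consumers: tuple[str, ...]
    access_pattern: str
    access_pattern_evidence: tuple[str, ...]
    frequency: str
    frequency_evidence: tuple[str, ...]
    result_coverage: Any
    type_alias_coverage: Any
    cross_audit_findings: tuple[Any, ...]
    decomposition_cost: Any
    optimization_candidates: tuple[Any, ...]
    # Fields with defaults. The default values themselves are assumptions.
    is_candidate: bool = False
    mermaid: str = ""
    markdown: str = ""


# Sample values are made up for illustration only.
profile = AggregateProfileSketch(
    name="Metadata",
    aggregate_kind="dataclass",
    memory_dim="discussion",
    producers=(),
    consumers=(),
    access_pattern="whole_struct",
    access_pattern_evidence=(),
    frequency="unknown",
    frequency_evidence=(),
    result_coverage=None,
    type_alias_coverage=None,
    cross_audit_findings=(),
    decomposition_cost=None,
    optimization_candidates=(),
)

print(len(fields(AggregateProfileSketch)), profile.is_candidate)
```

Tuples rather than lists keep the frozen record immutable all the way down, which matches the () value the tolerance rule in convention 5 uses for cross_audit_findings.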
2. The 4 decomposition directions
For each aggregate, the audit computes a DecompositionCost (8 fields: current_cost_estimate, componentize_savings, unify_savings, recommended_direction, recommended_rationale, batch_size, struct_field_count, struct_frozen). The recommended_direction is one of:
- componentize - split into smaller dataclasses; access pattern is field_by_field with many dead fields, OR hot_cold_split with small hot fields.
- unify - combine into wider fat structs; access pattern is bulk_batched with a small struct, OR whole_struct with a small struct.
- hold - current shape is correct; default for frozen + whole_struct (the ideal shape).
- insufficient_data - access pattern is mixed or frequency is unknown; needs runtime profiling per pipeline.
The 4-direction logic is in scripts/code_path_audit/code_path_audit.py:recommended_direction(). The savings estimates are heuristic (calibrated by pipeline_runtime_profiling_20260607); use as ranking input, not as actual savings.
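The four rules can be sketched as a small decision function. This is not the body of recommended_direction(): the function name, its parameters, the three numeric thresholds, the "hot" frequency value, and the order in which the rules are checked are all assumptions. In particular, the rules as written overlap for whole_struct (a small struct suggests unify, a frozen one suggests hold); the sketch checks the frozen case first.

```python
from typing import Literal

Direction = Literal["componentize", "unify", "hold", "insufficient_data"]

# Hypothetical thresholds; the styleguide only says "small" and "many".
SMALL_STRUCT_MAX_FIELDS = 4
MANY_DEAD_FIELDS_MIN = 3
SMALL_HOT_FIELDS_MAX = 2


def recommend_direction_sketch(
    access_pattern: str,
    frequency: str,
    struct_field_count: int,
    struct_frozen: bool,
    dead_field_count: int = 0,
    hot_field_count: int = 0,
) -> Direction:
    # Not enough signal: defer to runtime profiling.
    if access_pattern == "mixed" or frequency == "unknown":
        return "insufficient_data"
    # The ideal shape: frozen and read as a whole.
    if struct_frozen and access_pattern == "whole_struct":
        return "hold"
    if access_pattern == "field_by_field" and dead_field_count >= MANY_DEAD_FIELDS_MIN:
        return "componentize"
    if access_pattern == "hot_cold_split" and 0 < hot_field_count <= SMALL_HOT_FIELDS_MAX:
        return "componentize"
    is_small = struct_field_count <= SMALL_STRUCT_MAX_FIELDS
    if access_pattern in ("bulk_batched", "whole_struct") and is_small:
        return "unify"
    # Nothing argues for a change.
    return "hold"


print(recommend_direction_sketch("field_by_field", "cold", 10, False, dead_field_count=5))
```

Checking insufficient_data first means a heuristic direction is never reported for an aggregate whose access pattern or frequency the audit could not establish.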
3. The override file format
scripts/code_path_audit_overrides.toml (TOML) lets the user override individual classifications made by the audit. Sections:
[memory_dim]
"Metadata" = "curation"
[frequency]
"src.cleanup.do_nothing" = "cold"
The file is optional. Missing file = empty overrides (the canonical mappings + heuristics apply).
4. The 4 mem dim classification rules
MemoryDim is a 7-value Literal: curation, discussion, rag, knowledge, config, control, unknown. The classification precedence (per scripts/code_path_audit/code_path_audit.py:classify_memory_dim()): overrides > canonical mappings > file-of-origin heuristic > unknown.
- curation: per-file structural (FileItem, FileItems, ContextPreset).
- discussion: per-turn conversational (Metadata, CommsLog, History, ChatMessage).
- rag: opt-in semantic (RAGEngine state, indexed chunks).
- knowledge: per-project durable (knowledge category files, digest).
- config: project / global config (manual_slop.toml, presets.toml, personas.toml).
- control: propagation primitives (Result[T], ErrorInfo, WebSocketMessage, ToolSpec, NormalizedResponse).
- unknown: the audit can't classify; flagged for human review.
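The precedence chain (overrides > canonical mappings > file-of-origin heuristic > unknown) can be sketched as follows. This is an illustration rather than the body of classify_memory_dim(): the function signature, the subset of canonical mappings, the filename-prefix hints, and the sample paths are all assumptions.

```python
from pathlib import PurePosixPath
from typing import Literal, Mapping

MemoryDim = Literal[
    "curation", "discussion", "rag", "knowledge", "config", "control", "unknown"
]

# A subset of the canonical mappings listed above.
CANONICAL_SKETCH: dict[str, MemoryDim] = {
    "FileItem": "curation",
    "ContextPreset": "curation",
    "Metadata": "discussion",
    "ChatMessage": "discussion",
    "ErrorInfo": "control",
    "ToolSpec": "control",
}

# Hypothetical file-of-origin hints, matched against the start of the file name.
FILE_PREFIX_HINTS_SKETCH: tuple[tuple[str, MemoryDim], ...] = (
    ("rag", "rag"),
    ("config", "config"),
)


def classify_memory_dim_sketch(
    name: str, source_file: str, overrides: Mapping[str, MemoryDim]
) -> MemoryDim:
    if name in overrides:                       # 1. user overrides win
        return overrides[name]
    if name in CANONICAL_SKETCH:                # 2. canonical mappings
        return CANONICAL_SKETCH[name]
    file_name = PurePosixPath(source_file).name
    for prefix, dim in FILE_PREFIX_HINTS_SKETCH:  # 3. file-of-origin heuristic
        if file_name.startswith(prefix):
            return dim
    return "unknown"                            # 4. flagged for human review


print(classify_memory_dim_sketch("Metadata", "src/models.py", {"Metadata": "curation"}))
```

With an empty overrides mapping the same call falls through to the canonical table and yields discussion, which is how the sample [memory_dim] override in convention 3 takes effect.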
5. The cross-audit integration contract
The v2 audit consumes JSON from 6 input sources (in tests/artifacts/audit_inputs/):
| Input | Producer | Shape |
|---|---|---|
| audit_weak_types.json | scripts/audit_weak_types.py --json | {"findings": [{"file", "line", "type_string", "category"}]} |
| audit_exception_handling.json | scripts/audit_exception_handling.py --json | {"findings": [{"file", "line", "category", "function", "class", "body_summary"}]} |
| audit_optional_in_3_files.json | scripts/audit_optional_in_3_files.py --json | {"findings": [{"file", "line", "return_type", "function"}]} |
| audit_no_models_config_io.json | scripts/audit_no_models_config_io.py --json | {"findings": [{"file", "line", "function", "config_path"}]} |
| audit_main_thread_imports.json | scripts/audit_main_thread_imports.py --json | {"findings": [{"file", "line", "imported_module", "thread"}]} |
| type_registry.json | scripts/generate_type_registry.py --json | {"types": {"<aggregate>": {"file", "fields": [{"name", "type", "optional"}]}}} |
Tolerance: if any input is missing or malformed, the audit continues with the corresponding cross_audit_findings field set to () and the markdown notes the missing input. The audit does NOT fail on missing inputs.
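The tolerance rule can be sketched as a loader that never raises. The function name, the wording of the note, and the sample finding values are assumptions; the sketch covers only the five inputs with a top-level "findings" list, not type_registry.json.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def load_findings_sketch(path: Path) -> tuple[tuple[dict, ...], Optional[str]]:
    """Return (findings, note). On any problem: ((), note for the markdown)."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
        findings = data["findings"]
        if not isinstance(findings, list):
            raise TypeError("findings is not a list")
    except (OSError, ValueError, KeyError, TypeError) as exc:
        # OSError: missing or unreadable file; ValueError: invalid JSON or encoding;
        # KeyError / TypeError: valid JSON with an unexpected shape.
        return (), f"missing or malformed input: {path.name} ({type(exc).__name__})"
    return tuple(findings), None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    sample = {"findings": [{"file": "src/a.py", "line": 1,
                            "type_string": "Any", "category": "weak"}]}
    (root / "audit_weak_types.json").write_text(json.dumps(sample), encoding="utf-8")
    (root / "audit_exception_handling.json").write_text("{not json", encoding="utf-8")

    ok, ok_note = load_findings_sketch(root / "audit_weak_types.json")
    bad, bad_note = load_findings_sketch(root / "audit_exception_handling.json")
    absent, absent_note = load_findings_sketch(root / "audit_optional_in_3_files.json")

print(len(ok), bad, absent)
print(bad_note)
```

Returning the note alongside the empty tuple lets the caller record the missing input in the markdown output without treating it as a failure.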
The finding-to-aggregate mapping is 3-tier: tier 1 (function lookup) > tier 2 (field lookup via type registry) > tier 3 (heuristic fallback by file-of-origin). Each finding gets a (aggregate, confidence, mapping_tier) triple.
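One way to picture the 3-tier lookup is a function that tries each tier in order and stops at the first hit. The three index mappings, the confidence labels ("high", "medium", "low"), the use of type_string as the tier 2 key, the "unknown" fallback aggregate, and the sample entries are all hypothetical; the source specifies only the tier order and the shape of the resulting triple.

```python
from typing import Any, Mapping, Tuple

Triple = Tuple[str, str, int]  # (aggregate, confidence, mapping_tier)


def map_finding_sketch(
    finding: Mapping[str, Any],
    by_function: Mapping[str, str],    # tier 1 index: function name -> aggregate
    by_field_type: Mapping[str, str],  # tier 2 index, assumed built from the type registry
    by_file: Mapping[str, str],        # tier 3 index: file of origin -> aggregate
) -> Triple:
    function = finding.get("function")
    if function in by_function:
        return (by_function[function], "high", 1)
    type_string = finding.get("type_string")
    if type_string in by_field_type:
        return (by_field_type[type_string], "medium", 2)
    # Heuristic fallback: every finding still gets a triple.
    return (by_file.get(finding.get("file"), "unknown"), "low", 3)


# Made-up indexes and findings for illustration.
by_function = {"append_turn": "History"}
by_field_type = {"list[FileItem]": "FileItems"}
by_file = {"src/comms.py": "CommsLog"}

print(map_finding_sketch({"file": "src/x.py", "function": "append_turn"},
                         by_function, by_field_type, by_file))
print(map_finding_sketch({"file": "src/comms.py", "line": 7},
                         by_function, by_field_type, by_file))
```

Carrying the tier number in the triple keeps a heuristic match distinguishable from an exact function match when findings are later aggregated per profile.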
See Also
- conductor/tracks/code_path_audit_20260607/spec_v2.md - the canonical spec
- conductor/tracks/code_path_audit_20260607/plan_v2.md - the canonical plan
- conductor/code_styleguides/data_oriented_design.md - the canonical DOD reference
- conductor/code_styleguides/error_handling.md - the Result[T] convention
- conductor/code_styleguides/type_aliases.md - the 10 TypeAliases + 1 NamedTuple
- conductor/code_styleguides/agent_memory_dimensions.md - the 4 mem dims