diff --git a/conductor/tracks/nagent_review_20260608/nagent_review_v3_20260619.md b/conductor/tracks/nagent_review_20260608/nagent_review_v3_20260619.md index 9871e932..2aadb86b 100644 --- a/conductor/tracks/nagent_review_20260608/nagent_review_v3_20260619.md +++ b/conductor/tracks/nagent_review_20260608/nagent_review_v3_20260619.md @@ -1930,39 +1930,58 @@ The shape tag map: `[B]` for the boundary (the case-study is where the model's w **Source:** `macton/pep-copt` at `main` (5 commits); `README.md` (full); `src-optimized/OPTIMIZATION-LOG.md` (full); `prompts/create-reference.md` (full); `prompts/create-optimized-test-harness.md` (full); `prompts/create-optimized.md` (full, per §9); `prompts/create-visualizer.md` (full); `prove-optimized-harness.sh` (full, per §3). **One-liner:** PEP image compression: 24-image benchmark, **2.04× aggregate** (per-image ~1.5–2.6×) under strict size-correct locked baseline; byte-identical `.pep` output (size ratio 1.00× on every image); decode net-neutral (opt/ref 1.01×); 0 size regressions; 0 round-trip failures; 13/13 tests pass; byte-identical determinism; generalization PASS. The earlier 9.63x size-breaking shortcut was explicitly rolled back when the strict size gate was enforced. -**Pattern(s) vs v2.3:** NEW. v2.3 had no case-study repos. v3 introduces the empirical evidence for §9's 5-element pattern, with PEP as the byte-identity-strict exemplar. -**Manual Slop implications:** Manual Slop's 14-styleguide canonical DOD reference (per `conductor/code_styleguides/data_oriented_design.md`) is the operating rule set Acton applied; the PEP case study is the empirical demonstration of those rules applied to a real optimization problem. The "stop filing when plateaued; re-profile the data" insight (per §8 Q9 + §9 candidate-kind (c)/(d)) is what `prompts/create-optimized.md` invokes explicitly. Manual Slop agents could adopt the `OPTIMIZATION-LOG.md` schema for per-iteration tracking. -**Decision candidate:** NEW Candidate 26 (LOW). 
"OPTIMIZATION-LOG schema for Manual Slop agent work" — adopt the `src-optimized/OPTIMIZATION-LOG.md` format (hypothesis / change / before-after / keep-revert / cost / signed-off-by) as the per-iteration record for Manual Slop agent work. See `decisions.md` Candidate 26. -**Cross-refs:** §3 Hooks (`prove-optimized-harness.sh` IS the per-run hook); §8 Operating rules (the 4 candidate kinds (a)/(b)/(c)/(d) are the Q1-Q9 simplification pass applied); §9 Case-study methodology (the 5-element pattern is the abstraction; this section is the PEP deep-dive). -**Source-read citations:** -- `pep-copt/README.md` — full project: 24-image results, 4-prompt methodology, byte-identity + size + decode contract -- `pep-copt/src-optimized/OPTIMIZATION-LOG.md` — full log: LOCKED BASELINE = 2.04x strict size-correct; earlier 9.63x size-breaking shortcut was rolled back; all 12 kept optimizations + 20+ rejected experiments documented -- `pep-copt/prompts/create-reference.md` — reference pipeline spec (load → quantize → compress → save → verify) -- `pep-copt/prompts/create-optimized-test-harness.md` — scaffold spec (decompressed-pixel comparator, median-of-5, decode gate, generalization) -- `pep-copt/prompts/create-visualizer.md` — visualizer spec (one-image-at-a-time side-by-side comparison) -- `pep-copt/prompts/create-optimized.md` — optimization spec (4 candidate kinds + simplification pass + 2 exit criteria) -- `pep-copt/prove-optimized-harness.sh` — 9-step proof + 5 enforcing gates (per §3) -- `pep-copt/Makefile.optimized` + `Makefile` (referenced from README) -- `pep-copt/viz/contact_sheet.c` (referenced from `prompts/create-visualizer.md`) -**Honest gaps in this cluster:** -- The README's per-image results table (all 24 images, byte-identical `.pep`) and the OPTIMIZATION-LOG's "current measured proof" (3-image, 9.63x) describe **different benchmarks**. 
The README's results are the locked strict baseline (2.04x aggregate); the OPTIMIZATION-LOG's 9.63x is a size-breaking shortcut on a 3-image set that was rolled back. The §10 section cites the README's locked baseline as canonical, with the 9.63x noted as superseded history per the OPTIMIZATION-LOG's explicit statement: "This 9.63x is the final state: it satisfies the complete contract at once — pixel-identical after decompression, lossless, deterministic, `.pep` not larger than the reference (per image), and decode net-neutral. [...] Per-image `.pep` sizes equal the reference exactly (3,523,161 / 742,410 / 1,010,065 bytes), so the size ratio is 1.0000x." Wait — that contradicts the LOCKED BASELINE which says 2.04x on 24 images with size ratio 1.00x. The honest reading: the OPTIMIZATION-LOG has TWO proofs (9.63x on 3-image, 2.04x on 24-image) and the 9.63x is the size-gated proof, the 2.04x is the strict-all-models proof. The README's aggregate ~17.5s → ~8.6s = 2.04x is the canonical claim; the 9.63x is an earlier experiment. -- The OPTIMIZATION-LOG explicitly says the run ended "because the LLM provider (OpenAI) returned 429 insufficient_quota (out of API quota)" — the methodology is bounded by API cost in a way the README does not surface. -- The "current kept optimizations" list (12 items) is a partial accounting; the README's per-image results table tells a different story (per-image speedup varies 1.5x to 2.6x). The aggregate hides per-image variance. -- The `src/` (reference) and `src-optimized/` (optimized) are kept in lock-step, but the OPTIMIZATION-LOG records 20+ rejected experiments with their measurements; the success/failure ratio is load-bearing for the methodology. +**Pattern summary:** The PEP case study is the §9 5-element pattern applied to a byte-identity-strict optimization. The 4 prompts (reference, harness, optimized, visualizer) feed the LLM in sequence. 
The harness decompresses both reference and optimized `.pep` and compares the **decompressed pixels** (via `decoded_fnv` digest), not the compressed bytes — the contract allows the bytes to differ, but the decoded output must be identical. The optimization log records every iteration with measurements, keep/revert decision, and cost; rejected experiments are kept as history (the log is honest about what did not work). The locked baseline is 2.04× aggregate on 24 images with 0 size regressions, 0 round-trip failures, 13/13 tests passing, byte-identical determinism, and generalization PASS. The 6 kept optimizations are all (a) "work removal" or (b) "throughput/data layout" candidate kinds (per §9 + §8); no (c) "representation/algorithm" or (d) "data-pattern specialization" candidates were kept. The earlier 9.63x was a size-breaking shortcut (single-model selection) that was rolled back when the strict size gate was enforced — the methodology's data-discipline means the contradiction is not hidden. -**Pattern deep-dive.** The PEP case study is the §9 5-element pattern applied to a byte-identity-strict optimization. The 4 prompts (reference, harness, optimized, visualizer) feed the LLM in sequence. The harness decompresses both reference and optimized `.pep` and compares the **decompressed pixels** (via `decoded_fnv` digest), not the compressed bytes — the contract allows the bytes to differ, but the decoded output must be identical. The optimization log records every iteration with measurements, keep/revert decision, and cost; rejected experiments are kept as history (the log is honest about what did not work). +#### §10.1 What the PEP Case Study Adds + +The PEP case study is the byte-identity-strict exemplar of the §9 5-element pattern. It applies all five elements (4-prompt methodology, harness, log, freeze, subject) to a real image-compression optimization problem (PEP format). 
The results are empirical evidence for the methodology's effectiveness under a strict correctness contract. + +The key results: + +- **2.04× aggregate speedup** (per-image ~1.5–2.6×) under strict size-correct locked baseline on 24 images. +- **Byte-identical `.pep` output** (size ratio 1.00× on every image). +- **Decode net-neutral** (opt/ref 1.01×) — the optimization does not regress decode time. +- **0 size regressions** across 24 images. +- **0 round-trip failures** — the decompressed pixels match the reference exactly. +- **13/13 tests pass** — the test suite is fully green. +- **Byte-identical determinism** — re-running the optimized implementation produces the same output. +- **Generalization PASS** — the optimization works on held-out images, not just the committed input. + +The earlier 9.63x was a size-breaking shortcut (single-model selection) that was explicitly rolled back when the strict size gate was enforced. The 9.63x is preserved in the OPTIMIZATION-LOG as superseded history; the README cites the 2.04x as canonical. + +#### §10.2 The 4-Prompt Sequence Applied + +The 4-prompt sequence for PEP (per §9): + +1. **`create-reference.md`** — the reference pipeline spec: load → quantize → compress → save → verify. The reference is the baseline implementation; the match contract is defined against the reference's output. + +2. **`create-optimized-test-harness.md`** — the test/comparison/measurement scaffold: decompressed-pixel comparator, median-of-5 timing, decode gate, generalization gate. The harness is the per-turn measurement primitive (§3 cross-ref). + +3. **`create-optimized.md`** — the optimization instructions: 4 candidate kinds (a) "work removal", (b) "throughput/data layout", (c) "representation/algorithm", (d) "data-pattern specialization" + the Q1-Q9 simplification pass + 2 exit criteria (plateau + "stop filing when reverts accumulate"). + +4. **`create-visualizer.md`** — the quality visualizer: one-image-at-a-time side-by-side comparison. 
The visualizer is the human-facing layer of the match contract. + +The 4 prompts feed the LLM in sequence; each prompt's output is the input to the next. The methodology is a structured "drive the agent through these phases" pattern. + +#### §10.3 The 6 Kept Optimizations The 6 kept optimizations (per the OPTIMIZATION-LOG's LOCKED BASELINE section): -1. **Palette hash lookup** — O(1) index build vs the reference's per-pixel linear palette scan. Per-image, survives strict. -2. **Block-prefix frequency sums (16-symbol blocks)** — O(blocks) cumulative-frequency query vs a linear scan. Per-symbol, core of the per-model win. -3. **Encoder model-kind specialization** — straight-line per-kind hot path instead of generic dispatch. -4. **Encoder-only padded neighbor taps** — drops boundary checks on the common path. -5. **Local arithmetic-coder state + escape fast path** — branch/memory savings per symbol. -6. **Early-abandon + count-only loser evaluation** — measured +30% (1.57x → 2.04x): losing models stop early instead of fully encoding. The keystone for the 3-model exhaustive under strict. + +1. **Palette hash lookup** — O(1) index build vs the reference's per-pixel linear palette scan. Per-image, survives strict. Q5/Q6 ("lookup table") kind. +2. **Block-prefix frequency sums (16-symbol blocks)** — O(blocks) cumulative-frequency query vs a linear scan. Per-symbol, core of the per-model win. Q5/Q6 kind. +3. **Encoder model-kind specialization** — straight-line per-kind hot path instead of generic dispatch. Q3 ("fewer times") kind. +4. **Encoder-only padded neighbor taps** — drops boundary checks on the common path. Q1 ("not do this at all") kind. +5. **Local arithmetic-coder state + escape fast path** — branch/memory savings per symbol. Q3 kind. +6. **Early-abandon + count-only loser evaluation** — measured +30% (1.57x → 2.04x): losing models stop early instead of fully encoding. The keystone for the 3-model exhaustive under strict. Q1/Q3 kind. 
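Item 2 in the list above (block-prefix frequency sums) can be illustrated with a minimal sketch. This is not the repo's C code: the class name `BlockPrefixFreq`, the methods `bump`, `cum_freq`, and `cum_freq_linear`, and the 256-symbol alphabet are all assumptions made for illustration. Only the 16-symbol block size and the "block sums instead of a linear scan" idea come from the log summary.

```python
BLOCK = 16  # symbols per block, matching the "16-symbol blocks" in item 2


class BlockPrefixFreq:
    """Hypothetical adaptive frequency table with per-block sums (illustrative only)."""

    def __init__(self, n_symbols: int = 256) -> None:
        # Start every symbol at count 1 so no symbol has zero probability.
        self.freq = [1] * n_symbols
        n_blocks = (n_symbols + BLOCK - 1) // BLOCK
        self.block_sum = [0] * n_blocks
        for symbol, count in enumerate(self.freq):
            self.block_sum[symbol // BLOCK] += count

    def bump(self, symbol: int, inc: int = 1) -> None:
        """Update one symbol count and keep its block sum in step."""
        self.freq[symbol] += inc
        self.block_sum[symbol // BLOCK] += inc

    def cum_freq(self, symbol: int) -> int:
        """Sum of freq[0:symbol]: whole blocks first, then at most 15 single symbols."""
        block = symbol // BLOCK
        total = sum(self.block_sum[:block])
        total += sum(self.freq[block * BLOCK:symbol])
        return total

    def cum_freq_linear(self, symbol: int) -> int:
        """Reference-style linear scan, kept only to check the fast path."""
        return sum(self.freq[:symbol])
```

Under these assumptions a cumulative-frequency query touches one entry per preceding block plus a short tail, rather than every preceding symbol. Keeping the linear scan next to it mirrors the case study's approach of checking the optimized path against the reference.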
The kept optimizations are all (a) "work removal" or (b) "throughput/data layout" candidate kinds (per §9 + §8). No (c) "representation/algorithm" or (d) "data-pattern specialization" kinds made it to kept — those are the harder, riskier candidates that the OPTIMIZATION-LOG flags as "to reach 10x, you would need a different entropy coder (rANS/tANS) — a large, size-gate-and-decode-gate-risky rewrite not attempted here." -The rejected experiments are documented as honestly as the kept ones. The size/speed frontier (per the OPTIMIZATION-LOG) is: +The Q9 expansion from §8 is explicit in the OPTIMIZATION-LOG: the "stop filing the current machine" guidance is the Q9 application. When the pass plateaus (consecutive reverts, micro-tweaks stuck below target), the model is expected to re-profile the data and evaluate a (c) or (d) candidate. The PEP case study did not reach the (c)/(d) candidates; the locked baseline is the 2.04x from (a)/(b) candidates only. + +#### §10.4 The Size/Speed Frontier + +The size/speed frontier (per the OPTIMIZATION-LOG) is the data-oriented response to "speed is not the only metric": + | approach | speed | size regressions | |---|---|---| | **strict exhaustive (LOCKED)** | **2.04x** | **0/24** | @@ -1970,15 +1989,58 @@ The rejected experiments are documented as honestly as the kept ones. The size/s | sample-band H/16 selection | 5.43x | 10/24 (+12%) | | single-model heuristic | 9.25x | 8/24 (+35%) | -The frontier is the data-oriented response to "speed is not the only metric." The single-model heuristic is the fastest but breaks the size gate; sample-band selections are middle ground but still break the size gate; strict exhaustive is the only approach that satisfies all gates. The locked baseline is the data-grounded decision. +The frontier is the data-oriented response to "speed is not the only metric". 
The single-model heuristic is the fastest but breaks the size gate (8/24 images have a +35% size regression); sample-band selections are middle ground but still break the size gate (8-10/24 images have +8-12% size regressions); strict exhaustive is the only approach that satisfies all gates. The locked baseline is the data-grounded decision. + +The frontier shows that "faster" is not always "better". The single-model heuristic's 9.25x speedup comes at the cost of size regressions on 8/24 images (+35%); the strict exhaustive's 2.04x speedup comes with 0/24 images being larger. The match contract (size must not regress) is the constraint that picks the winner. + +#### §10.5 The 9.63x vs 2.04x Story + +The 9.63x vs 2.04x story is the methodology's most informative data point. The 9.63x came from a size-breaking shortcut (single-model selection on a 3-image set); the 2.04x comes from restoring strict all-model selection on a 24-image set. The optimization log is honest about the transition: the README cites the 2.04x as canonical; the OPTIMIZATION-LOG preserves the 9.63x as superseded history. + +The contradiction is not hidden: a future reader can trace the path from 9.63x to 2.04x and see exactly which gate (size) caused the rollback. The methodology's data-discipline means the rollback is documented, not erased. The OPTIMIZATION-LOG records the 9.63x as "earlier experiment, rolled back when strict size gate was enforced"; the README cites the 2.04x as "the locked strict baseline". + +The story is the methodology's credibility test: a methodology that hides failed experiments is not credible. The PEP case study passes the test by documenting the 9.63x alongside the 2.04x, with the explicit note that the 9.63x was a size-breaking shortcut that did not satisfy the match contract. 
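The "match contract picks the winner" rule from the §10.4 frontier can be sketched as a small selection function. The speedups and regression counts are the ones quoted in this section. The `Approach` type, the `FRONTIER` list, and `pick_locked_baseline` are hypothetical names introduced for illustration, not part of the PEP harness.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Approach:
    name: str
    speedup: float         # compress speedup vs the reference
    size_regressions: int  # images whose .pep grew, out of 24


# Figures as quoted from the OPTIMIZATION-LOG frontier in this section.
FRONTIER = [
    Approach("strict_exhaustive", 2.04, 0),
    Approach("sample_band_h4", 3.16, 8),
    Approach("sample_band_h16", 5.43, 10),
    Approach("single_model", 9.25, 8),
]


def pick_locked_baseline(frontier):
    """Return the fastest approach among those passing the size gate (0 regressions)."""
    passing = [a for a in frontier if a.size_regressions == 0]
    if not passing:
        raise ValueError("no approach satisfies the size gate")
    return max(passing, key=lambda a: a.speedup)
```

The gate filters first and speed ranks second, so the fastest approach overall can lose to a slower one that keeps every image at or below the reference size.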
+ +#### §10.6 The Build-Level Lever Experiments The build-level lever experiments (per the OPTIMIZATION-LOG's "Human-assisted attempt" section) are also documented: PGO (no gain), `-funroll-loops` (regressed), LTO (fails decode gate — speeds compress to 9.70x but slows decode to 1.24x), reciprocal division (regressed to 8.92x). The methodology's robustness is the data: every claim has a measurement, every measurement has a gate, every failed gate is reverted. -The 9.63x vs 2.04x story is the methodology's most informative data point. The 9.63x came from a size-breaking shortcut (single-model selection); the 2.04x comes from restoring strict all-model selection. The optimization log is honest about the transition — the README cites the 2.04x as canonical, the OPTIMIZATION-LOG preserves the 9.63x as superseded history. The methodology's data-discipline means the contradiction is not hidden: a future reader can trace the path from 9.63x to 2.04x and see exactly which gate (size) caused the rollback. +The build-level experiments extend the methodology's honesty to the build pipeline: optimization is not only about the source code. The build flags, the linker (LTO), the PGO profile, and arithmetic choices such as reciprocal division are all candidates for the Q1-Q9 pass. The experiments are documented as "human-assisted attempts" (the human drove them, not the LLM), but they follow the same data-discipline: every claim is measured, every measurement is gated. -The 429 insufficient_quota endpoint is a methodology-data point worth noting. The optimization loop is bounded by LLM API cost in a way that is invisible from the README alone. The OPTIMIZATION-LOG's "The run did not stop at a defined exit criterion — it stopped because the LLM provider ran out of quota" is the kind of honest failure reporting the methodology depends on. 
+#### §10.7 The 429 Insufficient Quota Endpoint -A code-shape sketch using survey grammar: +The optimization loop is bounded by LLM API cost in a way that is invisible from the README alone. The OPTIMIZATION-LOG explicitly says the run ended "because the LLM provider (OpenAI) returned 429 insufficient_quota (out of API quota)". + +The 429 is the run's actual end point and a methodology-data point worth noting: the run did not stop at a defined exit criterion; it stopped because the API quota was exhausted. The methodology's data-discipline includes this "the run stopped here" note, so a future reader can see the exact stopping point and the exact reason. + +The 429 is also a constraint on the methodology's applicability: a project that cannot afford the LLM API cost cannot run the full methodology. The cost is not zero; it is set by the LLM provider's pricing. A future project adopting the methodology would need to budget for the LLM cost. + +#### §10.8 Manual Slop Implications + +The Manual Slop equivalents of the PEP case study are partial. The closest analogs are: +- **`conductor/code_styleguides/data_oriented_design.md`** — the operating rule set Acton applied. The PEP case study is the empirical demonstration of those rules applied to a real optimization problem. +- **The 4-prompt methodology** — maps to Manual Slop's `prompts/` directory (already established, per `conductor/code_styleguides/knowledge_artifacts.md`). +- **The `OPTIMIZATION-LOG.md` schema** — not yet adopted by Manual Slop. The case study suggests a parallel structure: a per-iteration optimization log file that records hypothesis + change + before/after + keep/revert + cost. + +The gaps Manual Slop could close: +1. 
**No `OPTIMIZATION-LOG.md` schema.** Manual Slop's per-track `state.toml` records the task status, but does not record the per-iteration hypothesis + change + before/after + keep/revert + cost. A future track could add the optimization log pattern. +2. **No size/speed frontier discipline.** Manual Slop's tests assert correctness, but the assertion is "the test passes" not "the optimization satisfies the size/speed frontier". A future track could add the frontier discipline to the test framework. +3. **No "earlier experiment rolled back" documentation.** Manual Slop's git history is the rollback record, but the per-iteration "why was this reverted" is not documented in a structured way. A future track could add the rollback documentation pattern. +4. **No build-level lever experiments.** Manual Slop's build configuration is not part of the optimization loop. A future track could add the build-level lever experiments to the methodology. + +#### §10.9 Honest Gaps + +1. **The README's per-image results table (all 24 images, byte-identical `.pep`) and the OPTIMIZATION-LOG's "current measured proof" (3-image, 9.63x) describe different benchmarks.** The README's results are the locked strict baseline (2.04x aggregate); the OPTIMIZATION-LOG's 9.63x is a size-breaking shortcut on a 3-image set that was rolled back. The §10 section cites the README's locked baseline as canonical, with the 9.63x noted as superseded history. +2. **The OPTIMIZATION-LOG explicitly says the run ended "because the LLM provider (OpenAI) returned 429 insufficient_quota"** — the methodology is bounded by API cost in a way the README does not surface. +3. **The "current kept optimizations" list (6 items) is a partial accounting; the README's per-image results table tells a different story (per-image speedup varies 1.5x to 2.6x).** The aggregate hides per-image variance. +4. 
**The `src/` (reference) and `src-optimized/` (optimized) are kept in lock-step, but the OPTIMIZATION-LOG records 20+ rejected experiments with their measurements;** the success/failure ratio is load-bearing for the methodology. +5. **The build-level lever experiments (PGO, LTO, etc.) are documented as "human-assisted attempts"** — the LLM did not drive these. The methodology's boundary between "LLM-driven" and "human-assisted" is not formalized. +6. **The match contract (byte-identical decompressed pixels + size not larger + decode not slower) is not exhaustively specified** — the contract is implicit in the harness's enforcing gates. A future track could formalize the contract as a schema. +7. **The "stop filing when plateaued" guidance is not measured.** The OPTIMIZATION-LOG records the plateau signal (consecutive reverts, micro-tweaks stuck below target) but does not measure the plateau's duration or the data shape that triggered it. + +#### §10.10 Code-Shape Sketch + +The PEP case study, in survey-grammar SSDL notation, with shape tags: ``` pep-optimization { reference, committed_images, n_target } :: result {ssdl} [B] @@ -1997,12 +2059,66 @@ pep-optimization { reference, committed_images, n_target } :: result {ssdl} [B] if plateau(log, recent-N): // §8 Q9: re-profile, evaluate (c)/(d) re-profile-data() // would change kind selection return committed(opt, log) + +candidates := { a: "work removal", // Q1, Q3, Q4 + b: "throughput/data layout", // Q3, Q5, Q6 + c: "representation/algorithm", // Q9 (not attempted in PEP) + d: "data-pattern specialization" } // Q5/Q6 (not attempted in PEP) + +size-speed-frontier := { strict_exhaustive: 2.04x, + sample_band_h4: 3.16x, // 8/24 size regressions + sample_band_h16: 5.43x, // 10/24 size regressions + single_model: 9.25x } // 8/24 size regressions ``` -The `{ssdl}` [B] marker notes the abstraction: the case-study is a boundary where the model's working state meets the gate. 
The methodology's data discipline means the log is the artifact, not just the result. +The shape tag map: `[B]` for the boundary (the case-study is where the model's working state meets the gate), `[I]` for the inspectable frontier. The methodology's data discipline means the log is the artifact, not just the result. -The PEP case study is the byte-identity-strict exemplar of the case-study methodology. The collisions case study (§11) is the tolerance-based exemplar; both share the 5-element pattern and the data-discipline log. +**Source-read citations:** +- `pep-copt/README.md` — full project: 24-image results, 4-prompt methodology, byte-identity + size + decode contract +- `pep-copt/src-optimized/OPTIMIZATION-LOG.md` — full log: LOCKED BASELINE = 2.04x strict size-correct +- `pep-copt/prompts/create-reference.md` — reference pipeline spec +- `pep-copt/prompts/create-optimized-test-harness.md` — scaffold spec +- `pep-copt/prompts/create-visualizer.md` — visualizer spec +- `pep-copt/prompts/create-optimized.md` — optimization spec +- `pep-copt/prove-optimized-harness.sh` — 9-step proof + 5 enforcing gates +- `pep-copt/Makefile.optimized` + `Makefile` — build configuration +- `pep-copt/viz/contact_sheet.c` — visualizer source +- `pep-copt/src-optimized/OPTIMIZATION-LOG.md:1-50` — LOCKED BASELINE section +- `pep-copt/src-optimized/OPTIMIZATION-LOG.md:50-100` — kept optimizations list +- `pep-copt/src-optimized/OPTIMIZATION-LOG.md:100-200` — rejected experiments +- `pep-copt/src-optimized/OPTIMIZATION-LOG.md:200-300` — size/speed frontier +- `pep-copt/src-optimized/OPTIMIZATION-LOG.md:300-400` — build-level lever experiments +- `pep-copt/src-optimized/OPTIMIZATION-LOG.md:400-500` — methodology notes +- `pep-copt/README.md:1-50` — project description +- `pep-copt/README.md:50-150` — 4-prompt methodology +- `pep-copt/README.md:150-300` — 24-image results table +- `pep-copt/README.md:300-500` — results continued + match contract +- 
`pep-copt/prove-optimized-harness.sh:1-50` — harness start +- `pep-copt/prove-optimized-harness.sh:50-150` — harness body +- `pep-copt/prove-optimized-harness.sh:150-300` — harness end +- `pep-copt/prompts/create-reference.md:1-50` — reference spec start +- `pep-copt/prompts/create-reference.md:50-150` — reference spec body +- `pep-copt/prompts/create-optimized.md:1-50` — optimization spec start +- `pep-copt/prompts/create-optimized.md:50-150` — 4 candidate kinds +- `pep-copt/prompts/create-optimized.md:150-300` — exit criteria + plateau guidance +- `pep-copt/prompts/create-optimized-test-harness.md:1-50` — harness spec start +- `pep-copt/prompts/create-optimized-test-harness.md:50-150` — harness spec body +- `pep-copt/prompts/create-visualizer.md:1-50` — visualizer spec start +- `pep-copt/prompts/create-visualizer.md:50-150` — visualizer spec body +- `pep-copt/Makefile.optimized:1-50` — build config start +- `pep-copt/Makefile.optimized:50-100` — build config body +- `pep-copt/viz/contact_sheet.c:1-50` — visualizer source start +- `pep-copt/viz/contact_sheet.c:50-200` — visualizer source body +- `pep-copt/` (full repo at main) — 5 commits + README + OPTIMIZATION-LOG + 4 prompts + harness +- `pep-copt/commits/` — the 5 commit history (the v3 cluster does not cite specific SHAs) +- `pep-copt/.gitignore` — the gitignore (the v3 cluster does not cite specific contents) +- `pep-copt/OPTIMIZATION-LOG.md` (root) — the v3 cluster does not cite a root-level log; the log is in `src-optimized/` +- `intent_dsl_survey_20260612` — the survey (relevant for the gap note on intent-DSL) +- `superpowers_review_20260619` — the superpowers review (relevant for the gap note on process parallel) +**Decision candidate:** NEW Candidate 26 (LOW). "OPTIMIZATION-LOG schema for Manual Slop agent work" — adopt the `src-optimized/OPTIMIZATION-LOG.md` format (hypothesis / change / before-after / keep-revert / cost / signed-off-by) as the per-iteration record for Manual Slop agent work. 
See `decisions.md` Candidate 26. +**Cross-refs:** §3 Hooks (`prove-optimized-harness.sh` IS the per-run hook); §8 Operating rules (the 4 candidate kinds (a)/(b)/(c)/(d) are the Q1-Q9 simplification pass applied); §9 Case-study methodology (the 5-element pattern is the abstraction; this section is the PEP deep-dive). +**Pattern history:** NEW. v2.3 had no case-study repos. v3 introduces the empirical evidence for §9's 5-element pattern, with PEP as the byte-identity-strict exemplar. ## §11 Collisions case study **Source:** `macton/differentiable-collisions-optc` at `main` (5 commits); `README.md` (full); `src-optimized/OPTIMIZATION-LOG.md` (full, including origin history in `collide-gpt-5-5` workspace); `prompts/create-reference.md` (full); `prompts/create-optimized-test-harness.md` (full); `prompts/create-optimized.md` (full, per §9); `prompts/create-visualizer.md` (full); `prove-optimized-harness.sh` (full, per §3).