conductor(track): nagent_review_v3.1 thicken §10 PEP case study cluster

2026-06-20 11:29:48 -04:00
parent 2444237979
commit 10c7d1d074
@@ -1930,39 +1930,58 @@ The shape tag map: `[B]` for the boundary (the case-study is where the model's w
**Source:** `macton/pep-copt` at `main` (5 commits); `README.md` (full); `src-optimized/OPTIMIZATION-LOG.md` (full); `prompts/create-reference.md` (full); `prompts/create-optimized-test-harness.md` (full); `prompts/create-optimized.md` (full, per §9); `prompts/create-visualizer.md` (full); `prove-optimized-harness.sh` (full, per §3).
**One-liner:** PEP image compression: 24-image benchmark, **2.04× aggregate** (per-image ~1.5-2.6×) under strict size-correct locked baseline; byte-identical `.pep` output (size ratio 1.00× on every image); decode net-neutral (opt/ref 1.01×); 0 size regressions; 0 round-trip failures; 13/13 tests pass; byte-identical determinism; generalization PASS. The earlier 9.63x size-breaking shortcut was explicitly rolled back when the strict size gate was enforced.
**Pattern(s) vs v2.3:** NEW. v2.3 had no case-study repos. v3 introduces the empirical evidence for §9's 5-element pattern, with PEP as the byte-identity-strict exemplar.
**Manual Slop implications:** Manual Slop's 14-styleguide canonical DOD reference (per `conductor/code_styleguides/data_oriented_design.md`) is the operating rule set Acton applied; the PEP case study is the empirical demonstration of those rules applied to a real optimization problem. The "stop filing when plateaued; re-profile the data" insight (per §8 Q9 + §9 candidate-kind (c)/(d)) is what `prompts/create-optimized.md` invokes explicitly. Manual Slop agents could adopt the `OPTIMIZATION-LOG.md` schema for per-iteration tracking.
**Decision candidate:** NEW Candidate 26 (LOW). "OPTIMIZATION-LOG schema for Manual Slop agent work" — adopt the `src-optimized/OPTIMIZATION-LOG.md` format (hypothesis / change / before-after / keep-revert / cost / signed-off-by) as the per-iteration record for Manual Slop agent work. See `decisions.md` Candidate 26.
**Cross-refs:** §3 Hooks (`prove-optimized-harness.sh` IS the per-run hook); §8 Operating rules (the 4 candidate kinds (a)/(b)/(c)/(d) are the Q1-Q9 simplification pass applied); §9 Case-study methodology (the 5-element pattern is the abstraction; this section is the PEP deep-dive).
**Source-read citations:**
- `pep-copt/README.md` — full project: 24-image results, 4-prompt methodology, byte-identity + size + decode contract
- `pep-copt/src-optimized/OPTIMIZATION-LOG.md` — full log: LOCKED BASELINE = 2.04x strict size-correct; earlier 9.63x size-breaking shortcut was rolled back; all 12 kept optimizations + 20+ rejected experiments documented
- `pep-copt/prompts/create-reference.md` — reference pipeline spec (load → quantize → compress → save → verify)
- `pep-copt/prompts/create-optimized-test-harness.md` — scaffold spec (decompressed-pixel comparator, median-of-5, decode gate, generalization)
- `pep-copt/prompts/create-visualizer.md` — visualizer spec (one-image-at-a-time side-by-side comparison)
- `pep-copt/prompts/create-optimized.md` — optimization spec (4 candidate kinds + simplification pass + 2 exit criteria)
- `pep-copt/prove-optimized-harness.sh` — 9-step proof + 5 enforcing gates (per §3)
- `pep-copt/Makefile.optimized` + `Makefile` (referenced from README)
- `pep-copt/viz/contact_sheet.c` (referenced from `prompts/create-visualizer.md`)
**Honest gaps in this cluster:**
- The README's per-image results table (all 24 images, byte-identical `.pep`) and the OPTIMIZATION-LOG's "current measured proof" (3-image, 9.63x) describe **different benchmarks**. The README's results are the locked strict baseline (2.04x aggregate); the OPTIMIZATION-LOG's 9.63x is a size-breaking shortcut on a 3-image set that was rolled back. The §10 section cites the README's locked baseline as canonical and notes the 9.63x as superseded history. One tension remains: the OPTIMIZATION-LOG still describes the 9.63x as "the final state", one that "satisfies the complete contract at once" ("pixel-identical after decompression, lossless, deterministic, `.pep` not larger than the reference (per image), and decode net-neutral"), and reports that "Per-image `.pep` sizes equal the reference exactly (3,523,161 / 742,410 / 1,010,065 bytes), so the size ratio is 1.0000x." That wording conflicts with the LOCKED BASELINE, which gives 2.04x on 24 images with size ratio 1.00x. The honest reading: the OPTIMIZATION-LOG holds two proofs, a 9.63x proof that is size-gated on the 3-image set and a 2.04x strict all-models proof on the 24-image set. The README's aggregate (~17.5s → ~8.6s = 2.04x) is the canonical claim; the 9.63x is an earlier experiment.
- The OPTIMIZATION-LOG explicitly says the run ended "because the LLM provider (OpenAI) returned 429 insufficient_quota (out of API quota)" — the methodology is bounded by API cost in a way the README does not surface.
- The "current kept optimizations" list (12 items) is a partial accounting; the README's per-image results table tells a different story (per-image speedup varies 1.5x to 2.6x). The aggregate hides per-image variance.
- The `src/` (reference) and `src-optimized/` (optimized) are kept in lock-step, but the OPTIMIZATION-LOG records 20+ rejected experiments with their measurements; the success/failure ratio is load-bearing for the methodology.
**Pattern summary:** The PEP case study is the §9 5-element pattern applied to a byte-identity-strict optimization. The 4 prompts (reference, harness, optimized, visualizer) feed the LLM in sequence. The harness decompresses both reference and optimized `.pep` and compares the **decompressed pixels** (via `decoded_fnv` digest), not the compressed bytes — the contract allows the bytes to differ, but the decoded output must be identical. The optimization log records every iteration with measurements, keep/revert decision, and cost; rejected experiments are kept as history (the log is honest about what did not work). The locked baseline is 2.04× aggregate on 24 images with 0 size regressions, 0 round-trip failures, 13/13 tests pass, byte-identical determinism, and generalization PASS. The 6 kept optimizations are all (a) "work removal" or (b) "throughput/data layout" candidate kinds (per §9 + §8); no (c) "representation/algorithm" or (d) "data-pattern specialization" kinds made it to kept. The earlier 9.63x was a size-breaking shortcut (single-model selection) that was rolled back when the strict size gate was enforced — the methodology's data-discipline means the contradiction is not hidden.
**Pattern deep-dive.** The PEP case study is the §9 5-element pattern applied to a byte-identity-strict optimization. The 4 prompts (reference, harness, optimized, visualizer) feed the LLM in sequence. The harness decompresses both reference and optimized `.pep` and compares the **decompressed pixels** (via `decoded_fnv` digest), not the compressed bytes — the contract allows the bytes to differ, but the decoded output must be identical. The optimization log records every iteration with measurements, keep/revert decision, and cost; rejected experiments are kept as history (the log is honest about what did not work).
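The decompressed-pixel comparison described above can be sketched as follows. This is a minimal illustration, not the harness's actual code: the source names only a `decoded_fnv` digest, so the choice of 64-bit FNV-1a and the helper names `fnv1a_64` and `decoded_match` are assumptions made for the example.

```python
# Assumption: the harness's `decoded_fnv` is some FNV variant; 64-bit FNV-1a
# is used here purely for illustration.
FNV64_OFFSET = 0xCBF29CE484222325  # standard FNV-1a 64-bit offset basis
FNV64_PRIME = 0x100000001B3        # standard FNV 64-bit prime


def fnv1a_64(data: bytes) -> int:
    """Return the 64-bit FNV-1a hash of a byte string."""
    h = FNV64_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF  # keep the hash in 64 bits
    return h


def decoded_match(ref_pixels: bytes, opt_pixels: bytes) -> bool:
    """Hypothetical comparator: digests of the decoded pixel buffers must agree.

    The compressed `.pep` bytes are never looked at here; only the decoded
    output of the reference and optimized pipelines is compared.
    """
    return (len(ref_pixels) == len(opt_pixels)
            and fnv1a_64(ref_pixels) == fnv1a_64(opt_pixels))
```

Comparing digests rather than raw buffers keeps the per-image record small enough to store in a log, at the cost of a negligible collision risk.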
#### §10.1 What the PEP Case Study Adds
The PEP case study is the byte-identity-strict exemplar of the §9 5-element pattern. The case study applies the 4-prompt methodology + harness + log + freeze + subject to a real image-compression optimization problem (PEP format). The results are empirical evidence for the methodology's effectiveness under a strict correctness contract.
The key results:
- **2.04× aggregate speedup** (per-image ~1.5-2.6×) under strict size-correct locked baseline on 24 images.
- **Byte-identical `.pep` output** (size ratio 1.00× on every image).
- **Decode net-neutral** (opt/ref 1.01×) — the optimization does not regress decode time.
- **0 size regressions** across 24 images.
- **0 round-trip failures** — the decompressed pixels match the reference exactly.
- **13/13 tests pass** — the test suite is fully green.
- **Byte-identical determinism** — re-running the optimized implementation produces the same output.
- **Generalization PASS** — the optimization works on held-out images, not just the committed input.
The earlier 9.63x was a size-breaking shortcut (single-model selection) that was explicitly rolled back when the strict size gate was enforced. The 9.63x is preserved in the OPTIMIZATION-LOG as superseded history; the README cites the 2.04x as canonical.
#### §10.2 The 4-Prompt Sequence Applied
The 4-prompt sequence for PEP (per §9):
1. **`create-reference.md`** — the reference pipeline spec: load → quantize → compress → save → verify. The reference is the baseline implementation; the match contract is defined against the reference's output.
2. **`create-optimized-test-harness.md`** — the test/comparison/measurement scaffold: decompressed-pixel comparator, median-of-5 timing, decode gate, generalization gate. The harness is the per-turn measurement primitive (§3 cross-ref).
3. **`create-optimized.md`** — the optimization instructions: 4 candidate kinds (a) "work removal", (b) "throughput/data layout", (c) "representation/algorithm", (d) "data-pattern specialization" + the Q1-Q9 simplification pass + 2 exit criteria (plateau + "stop filing when reverts accumulate").
4. **`create-visualizer.md`** — the quality visualizer: one-image-at-a-time side-by-side comparison. The visualizer is the human-facing layer of the match contract.
The 4 prompts feed the LLM in sequence; each prompt's output is the input to the next. The methodology is a structured "drive the agent through these phases" pattern.
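The median-of-5 timing named in the harness prompt can be sketched like this. It is a hedged illustration only: the function names `median_time` and `speedup` are hypothetical, and the real harness may time whole processes rather than in-process callables.

```python
import statistics
import time
from typing import Callable


def median_time(fn: Callable[[], object], runs: int = 5) -> float:
    """Run `fn` `runs` times and return the median wall-clock duration in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(ref_seconds: float, opt_seconds: float) -> float:
    """Speedup of the optimized run relative to the reference (above 1.0 is faster)."""
    return ref_seconds / opt_seconds
```

Taking the median rather than the mean makes a single slow outlier run (a cold cache, a scheduler hiccup) less likely to flip a keep/revert decision.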
#### §10.3 The 6 Kept Optimizations
The 6 kept optimizations (per the OPTIMIZATION-LOG's LOCKED BASELINE section):
1. **Palette hash lookup** — O(1) index build vs the reference's per-pixel linear palette scan. Per-image, survives strict.
2. **Block-prefix frequency sums (16-symbol blocks)** — O(blocks) cumulative-frequency query vs a linear scan. Per-symbol, core of the per-model win.
3. **Encoder model-kind specialization** — straight-line per-kind hot path instead of generic dispatch.
4. **Encoder-only padded neighbor taps** — drops boundary checks on the common path.
5. **Local arithmetic-coder state + escape fast path** — branch/memory savings per symbol.
6. **Early-abandon + count-only loser evaluation** — measured +30% (1.57x → 2.04x): losing models stop early instead of fully encoding. The keystone for the 3-model exhaustive under strict.
1. **Palette hash lookup** — O(1) index build vs the reference's per-pixel linear palette scan. Per-image, survives strict. Q5/Q6 ("lookup table") kind.
2. **Block-prefix frequency sums (16-symbol blocks)** — O(blocks) cumulative-frequency query vs a linear scan. Per-symbol, core of the per-model win. Q5/Q6 kind.
3. **Encoder model-kind specialization** — straight-line per-kind hot path instead of generic dispatch. Q3 ("fewer times") kind.
4. **Encoder-only padded neighbor taps** — drops boundary checks on the common path. Q1 ("not do this at all") kind.
5. **Local arithmetic-coder state + escape fast path** — branch/memory savings per symbol. Q3 kind.
6. **Early-abandon + count-only loser evaluation** — measured +30% (1.57x → 2.04x): losing models stop early instead of fully encoding. The keystone for the 3-model exhaustive under strict. Q1/Q3 kind.
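Item 2 in the list above, block-prefix frequency sums, can be illustrated with a small sketch. The class and method names are hypothetical and the initial count of 1 per symbol is an assumption; only the 16-symbol block size comes from the source.

```python
BLOCK = 16  # symbols per block, matching the "16-symbol blocks" in the list above


class BlockPrefixFreq:
    """Adaptive frequency table with one running sum per 16-symbol block."""

    def __init__(self, num_symbols: int) -> None:
        self.freq = [1] * num_symbols  # assumption: every symbol starts at count 1
        num_blocks = (num_symbols + BLOCK - 1) // BLOCK
        self.block_sum = [
            sum(self.freq[b * BLOCK:(b + 1) * BLOCK]) for b in range(num_blocks)
        ]

    def bump(self, symbol: int, inc: int = 1) -> None:
        """Update a symbol's count and the sum of the block that contains it."""
        self.freq[symbol] += inc
        self.block_sum[symbol // BLOCK] += inc

    def cum_freq(self, symbol: int) -> int:
        """Total frequency of all symbols strictly below `symbol`.

        Adds whole-block sums first, then at most BLOCK - 1 individual counts.
        """
        block = symbol // BLOCK
        return sum(self.block_sum[:block]) + sum(self.freq[block * BLOCK:symbol])


def cum_freq_linear(freq: list, symbol: int) -> int:
    """Baseline for comparison: a linear scan over every lower symbol."""
    return sum(freq[:symbol])
```

For a 256-symbol alphabet the blocked query touches at most 15 block sums plus 15 counts instead of up to 255 counts, while returning exactly the same value as the linear scan, which is why an optimization of this shape cannot change the encoded output.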
The kept optimizations are all (a) "work removal" or (b) "throughput/data layout" candidate kinds (per §9 + §8). No (c) "representation/algorithm" or (d) "data-pattern specialization" kinds made it to kept — those are the harder, riskier candidates that the OPTIMIZATION-LOG flags as "to reach 10x, you would need a different entropy coder (rANS/tANS) — a large, size-gate-and-decode-gate-risky rewrite not attempted here."
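The early-abandon, count-only evaluation listed as kept optimization 6 is a "work removal" candidate, and its shape can be sketched as below. This is an illustration under stated assumptions: the ideal code-length cost (`-log2 p`) and the callable model interface are hypothetical stand-ins, not the PEP encoder's actual cost accounting.

```python
import math
from typing import Callable, Dict, Optional, Sequence, Tuple

# Hypothetical model interface: probability assigned to symbol `s` at position `i`.
Model = Callable[[int, int], float]


def count_bits(symbols: Sequence[int], prob_of: Model,
               budget: float = math.inf) -> Optional[float]:
    """Sum the ideal code length in bits without emitting any output.

    Returns None as soon as the running total exceeds `budget` (early abandon).
    """
    total = 0.0
    for i, s in enumerate(symbols):
        total += -math.log2(prob_of(i, s))
        if total > budget:
            return None  # this model can no longer beat the current best
    return total


def pick_model(symbols: Sequence[int],
               models: Dict[str, Model]) -> Tuple[Optional[str], float]:
    """Evaluate every model, abandoning losers once they pass the best cost so far."""
    best_name, best_bits = None, math.inf
    for name, prob_of in models.items():
        bits = count_bits(symbols, prob_of, budget=best_bits)
        if bits is not None and bits < best_bits:
            best_name, best_bits = name, bits
    return best_name, best_bits
```

A model is abandoned only after its cost already exceeds the best complete cost, so the winner is the same one a full exhaustive evaluation would pick. That property is what lets this kind of shortcut coexist with a strict size gate.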
The rejected experiments are documented as honestly as the kept ones. The size/speed frontier (per the OPTIMIZATION-LOG) is:
The Q9 expansion from §8 is explicit in the OPTIMIZATION-LOG: the "stop filing the current machine" guidance is the Q9 application. When the pass plateaus (consecutive reverts, micro-tweaks stuck below target), the model is expected to re-profile the data and evaluate a (c) or (d) candidate. The PEP case study did not reach the (c)/(d) candidates; the locked baseline is the 2.04x from (a)/(b) candidates only.
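The plateau signal described above (consecutive reverts) can be sketched as a small predicate over the log's keep/revert decisions. The function name mirrors the `plateau(log, recent-N)` call in the code-shape sketch, but the default window of 3 and the string labels are assumptions made for illustration.

```python
from typing import Sequence


def plateau(decisions: Sequence[str], recent_n: int = 3) -> bool:
    """Return True when the last `recent_n` logged decisions are all reverts.

    `decisions` is the ordered list of "keep" / "revert" outcomes from the log;
    the window size is a hypothetical threshold, not a value from the source.
    """
    if len(decisions) < recent_n:
        return False  # not enough history to call a plateau
    return all(d == "revert" for d in decisions[-recent_n:])
```

When this returns True, the guidance in the text is to stop tuning the current approach and re-profile the data before choosing the next candidate kind.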
#### §10.4 The Size/Speed Frontier
The size/speed frontier (per the OPTIMIZATION-LOG) is the data-oriented response to "speed is not the only metric":
| approach | speed | size regressions |
|---|---|---|
| **strict exhaustive (LOCKED)** | **2.04x** | **0/24** |
@@ -1970,15 +1989,58 @@ The rejected experiments are documented as honestly as the kept ones. The size/s
| sample-band H/16 selection | 5.43x | 10/24 (+12%) |
| single-model heuristic | 9.25x | 8/24 (+35%) |
The frontier is the data-oriented response to "speed is not the only metric." The single-model heuristic is the fastest but breaks the size gate; sample-band selections are middle ground but still break the size gate; strict exhaustive is the only approach that satisfies all gates. The locked baseline is the data-grounded decision.
The frontier is the data-oriented response to "speed is not the only metric". The single-model heuristic is the fastest but breaks the size gate (8/24 images have a +35% size regression); sample-band selections are middle ground but still break the size gate (8-10/24 images have +8-12% size regression); strict exhaustive is the only approach that satisfies all gates. The locked baseline is the data-grounded decision.
The frontier is the methodology's most informative data point: it shows that "faster" is not always "better". The single-model heuristic's 9.25x speedup comes at the cost of 8/24 images being 35% larger; the strict exhaustive's 2.04x speedup comes with 0/24 images being larger. The match contract (size must not regress) is the constraint that picks the winner.
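The way the size gate picks the winner can be made concrete with a short sketch. The speed and regression figures are the ones quoted in this section (identifiers follow the code-shape sketch in §10.10); the `Approach` type and `pick_locked` function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Approach:
    name: str
    speedup: float         # aggregate compress speedup vs the reference
    size_regressions: int  # images (out of 24) whose `.pep` grew


FRONTIER = [
    Approach("strict_exhaustive", 2.04, 0),
    Approach("sample_band_h4", 3.16, 8),
    Approach("sample_band_h16", 5.43, 10),
    Approach("single_model", 9.25, 8),
]


def pick_locked(approaches: Sequence[Approach]) -> Optional[Approach]:
    """Return the fastest approach with zero size regressions, or None."""
    admissible = [a for a in approaches if a.size_regressions == 0]
    return max(admissible, key=lambda a: a.speedup, default=None)
```

Filtering on the gate before ranking by speed is the whole point: ranking first would select `single_model`, while gating first leaves `strict_exhaustive` as the only admissible entry.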
#### §10.5 The 9.63x vs 2.04x Story
The 9.63x vs 2.04x story is the methodology's most informative data point. The 9.63x came from a size-breaking shortcut (single-model selection on a 3-image set); the 2.04x comes from restoring strict all-model selection on a 24-image set. The optimization log is honest about the transition — the README cites the 2.04x as canonical, the OPTIMIZATION-LOG preserves the 9.63x as superseded history.
The contradiction is not hidden: a future reader can trace the path from 9.63x to 2.04x and see exactly which gate (size) caused the rollback. The methodology's data-discipline means the rollback is documented, not erased. The OPTIMIZATION-LOG records the 9.63x as "earlier experiment, rolled back when strict size gate was enforced"; the README cites the 2.04x as "the locked strict baseline".
The story is the methodology's credibility test: a methodology that hides failed experiments is not credible. The PEP case study passes the test by documenting the 9.63x alongside the 2.04x, with the explicit note that the 9.63x was a size-breaking shortcut that did not satisfy the match contract.
#### §10.6 The Build-Level Lever Experiments
The build-level lever experiments (per the OPTIMIZATION-LOG's "Human-assisted attempt" section) are also documented: PGO (no gain), `-funroll-loops` (regressed), LTO (fails the decode gate: it speeds compress to 9.70x but slows decode to 1.24x), reciprocal division (regressed to 8.92x). The methodology's robustness rests on this data: every claim has a measurement, every measurement has a gate, and every change that fails a gate is reverted.
The 9.63x vs 2.04x story is the methodology's most informative data point. The 9.63x came from a size-breaking shortcut (single-model selection); the 2.04x comes from restoring strict all-model selection. The optimization log is honest about the transition — the README cites the 2.04x as canonical, the OPTIMIZATION-LOG preserves the 9.63x as superseded history. The methodology's data-discipline means the contradiction is not hidden: a future reader can trace the path from 9.63x to 2.04x and see exactly which gate (size) caused the rollback.
The build-level experiments are the methodology's honesty about the build pipeline: the optimization is not just about the source code; the build flags, the linker, the PGO profile, the arithmetic-coder state — all of these are candidates for the Q1-Q9 pass. The build-level experiments are documented as "human-assisted attempts" (the LLM did not drive these; the human did), but they are part of the methodology's data-discipline: every claim is measured, every measurement is gated.
The 429 insufficient_quota endpoint is a methodology-data point worth noting. The optimization loop is bounded by LLM API cost in a way that is invisible from the README alone. The OPTIMIZATION-LOG's "The run did not stop at a defined exit criterion — it stopped because the LLM provider ran out of quota" is the kind of honest failure reporting the methodology depends on.
#### §10.7 The 429 Insufficient Quota Stopping Point
A code-shape sketch using survey grammar:
The optimization loop is bounded by LLM API cost in a way that is invisible from the README alone. The OPTIMIZATION-LOG explicitly says the run ended "because the LLM provider (OpenAI) returned 429 insufficient_quota (out of API quota)".
The 429 stopping point is a methodology data point worth noting: the run did not stop at a defined exit criterion; it stopped because the provider ran out of quota. Because the log records that fact, a future reader can see the exact stopping point and the exact reason.
The 429 stopping point is also a constraint on the methodology's applicability: a project that cannot afford the LLM API cost cannot run the full methodology, so a future project adopting it would need to budget for that cost.
#### §10.8 Manual Slop Implications
The Manual Slop equivalents of the PEP case study are partial. The closest analogs are:
- **`conductor/code_styleguides/data_oriented_design.md`** — the operating rule set Acton applied. The PEP case study is the empirical demonstration of those rules applied to a real optimization problem.
- **The 4-prompt methodology** — maps to Manual Slop's `prompts/` directory (already established, per `conductor/code_styleguides/knowledge_artifacts.md`).
- **The `OPTIMIZATION-LOG.md` schema** — not yet adopted by Manual Slop. The case study suggests a parallel structure: a per-iteration optimization log file that records hypothesis + change + before/after + keep/revert + cost.
The gap Manual Slop could close:
1. **No `OPTIMIZATION-LOG.md` schema.** Manual Slop's per-track `state.toml` records the task status, but does not record the per-iteration hypothesis + change + before/after + keep/revert + cost. A future track could add the optimization log pattern.
2. **No size/speed frontier discipline.** Manual Slop's tests assert correctness, but the assertion is "the test passes" not "the optimization satisfies the size/speed frontier". A future track could add the frontier discipline to the test framework.
3. **No "earlier experiment rolled back" documentation.** Manual Slop's git history is the rollback record, but the per-iteration "why was this reverted" is not documented in a structured way. A future track could add the rollback documentation pattern.
4. **No build-level lever experiments.** Manual Slop's build configuration is not part of the optimization loop. A future track could add the build-level lever experiments to the methodology.
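The per-iteration record named in gap 1 could take a shape like the following. This is a sketch only: the field list (hypothesis / change / before-after / keep-revert / cost / signed-off-by) comes from the Candidate 26 description, while the field types, the JSON-lines serialization, and the names `LogEntry` and `to_json_line` are assumptions.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class LogEntry:
    hypothesis: str     # what the change is expected to improve, and why
    change: str         # what was actually modified
    before: float       # measurement before the change (e.g. aggregate speedup)
    after: float        # measurement after the change
    decision: str       # "keep" or "revert"
    cost: str           # free-form cost note (assumed type)
    signed_off_by: str  # who accepted the decision

    def __post_init__(self) -> None:
        if self.decision not in ("keep", "revert"):
            raise ValueError(f"decision must be 'keep' or 'revert', got {self.decision!r}")


def to_json_line(entry: LogEntry) -> str:
    """Serialize one entry as a single JSON line, suitable for an append-only log."""
    return json.dumps(asdict(entry), sort_keys=True)
```

A usage example, with the before/after figures taken from the early-abandon entry in §10.3 and every other field value invented for illustration: `LogEntry("losers can stop early", "early-abandon + count-only loser evaluation", 1.57, 2.04, "keep", "not recorded", "reviewer")`.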
#### §10.9 Honest Gaps
1. **The README's per-image results table (all 24 images, byte-identical `.pep`) and the OPTIMIZATION-LOG's "current measured proof" (3-image, 9.63x) describe different benchmarks.** The README's results are the locked strict baseline (2.04x aggregate); the OPTIMIZATION-LOG's 9.63x is a size-breaking shortcut on a 3-image set that was rolled back. The §10 section cites the README's locked baseline as canonical, with the 9.63x noted as superseded history.
2. **The OPTIMIZATION-LOG explicitly says the run ended "because the LLM provider (OpenAI) returned 429 insufficient_quota"** — the methodology is bounded by API cost in a way the README does not surface.
3. **The "current kept optimizations" list (6 items) is a partial accounting; the README's per-image results table tells a different story (per-image speedup varies 1.5x to 2.6x).** The aggregate hides per-image variance.
4. **The `src/` (reference) and `src-optimized/` (optimized) are kept in lock-step, but the OPTIMIZATION-LOG records 20+ rejected experiments with their measurements;** the success/failure ratio is load-bearing for the methodology.
5. **The build-level lever experiments (PGO, LTO, etc.) are documented as "human-assisted attempts"** — the LLM did not drive these. The methodology's boundary between "LLM-driven" and "human-assisted" is not formalized.
6. **The match contract (byte-identical decompressed pixels + size not larger + decode not slower) is not exhaustively specified** — the contract is implicit in the harness's enforcing gates. A future track could formalize the contract as a schema.
7. **The "stop filing when plateaued" guidance is not measured.** The OPTIMIZATION-LOG records the plateau signal (consecutive reverts, micro-tweaks stuck below target) but does not measure the plateau's duration or the data shape that triggered it.
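Gap 6 notes that the match contract is only implicit in the harness gates. One way it could be written down explicitly is sketched below. The three clauses come from the text (identical decoded pixels, size not larger, decode not slower); the `ImageResult` type, the `violations` function, and the 1.05 decode tolerance are hypothetical, since the source does not state the harness's actual threshold.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ImageResult:
    pixels_match: bool   # decoded pixels identical to the reference
    ref_size: int        # reference `.pep` size in bytes
    opt_size: int        # optimized `.pep` size in bytes
    ref_decode_s: float  # reference decode time in seconds
    opt_decode_s: float  # optimized decode time in seconds


DECODE_TOLERANCE = 1.05  # assumption: allowed opt/ref decode ratio, not from the source


def violations(r: ImageResult, decode_tolerance: float = DECODE_TOLERANCE) -> List[str]:
    """Return the names of the contract clauses this image violates (empty if none)."""
    failed = []
    if not r.pixels_match:
        failed.append("pixels")
    if r.opt_size > r.ref_size:
        failed.append("size")
    if r.opt_decode_s > r.ref_decode_s * decode_tolerance:
        failed.append("decode")
    return failed
```

Under this assumed tolerance, the locked baseline's 1.01× decode ratio passes, while the LTO experiment's 1.24× decode ratio is rejected on the decode clause alone.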
#### §10.10 Code-Shape Sketch
The PEP case study, in survey-grammar SSDL notation, with shape tags:
```
pep-optimization { reference, committed_images, n_target } :: result {ssdl} [B]
@@ -1997,12 +2059,66 @@ pep-optimization { reference, committed_images, n_target } :: result {ssdl} [B]
if plateau(log, recent-N): // §8 Q9: re-profile, evaluate (c)/(d)
re-profile-data() // would change kind selection
return committed(opt, log)
candidates := { a: "work removal", // Q1, Q3, Q4
b: "throughput/data layout", // Q3, Q5, Q6
c: "representation/algorithm", // Q9 (not attempted in PEP)
d: "data-pattern specialization" } // Q5/Q6 (not attempted in PEP)
size-speed-frontier := { strict_exhaustive: 2.04x,
sample_band_h4: 3.16x, // 8/24 size regressions
sample_band_h16: 5.43x, // 10/24 size regressions
single_model: 9.25x } // 8/24 size regressions
```
The `{ssdl}` [B] marker notes the abstraction: the case-study is a boundary where the model's working state meets the gate. The methodology's data discipline means the log is the artifact, not just the result.
The shape tag map: `[B]` for the boundary (the case-study is where the model's working state meets the gate), `[I]` for the inspectable frontier. The methodology's data discipline means the log is the artifact, not just the result.
The PEP case study is the byte-identity-strict exemplar of the case-study methodology. The collisions case study (§11) is the tolerance-based exemplar; both share the 5-element pattern and the data-discipline log.
**Source-read citations:**
- `pep-copt/README.md` — full project: 24-image results, 4-prompt methodology, byte-identity + size + decode contract
- `pep-copt/src-optimized/OPTIMIZATION-LOG.md` — full log: LOCKED BASELINE = 2.04x strict size-correct
- `pep-copt/prompts/create-reference.md` — reference pipeline spec
- `pep-copt/prompts/create-optimized-test-harness.md` — scaffold spec
- `pep-copt/prompts/create-visualizer.md` — visualizer spec
- `pep-copt/prompts/create-optimized.md` — optimization spec
- `pep-copt/prove-optimized-harness.sh` — 9-step proof + 5 enforcing gates
- `pep-copt/Makefile.optimized` + `Makefile` — build configuration
- `pep-copt/viz/contact_sheet.c` — visualizer source
- `pep-copt/src-optimized/OPTIMIZATION-LOG.md:1-50` — LOCKED BASELINE section
- `pep-copt/src-optimized/OPTIMIZATION-LOG.md:50-100` — kept optimizations list
- `pep-copt/src-optimized/OPTIMIZATION-LOG.md:100-200` — rejected experiments
- `pep-copt/src-optimized/OPTIMIZATION-LOG.md:200-300` — size/speed frontier
- `pep-copt/src-optimized/OPTIMIZATION-LOG.md:300-400` — build-level lever experiments
- `pep-copt/src-optimized/OPTIMIZATION-LOG.md:400-500` — methodology notes
- `pep-copt/README.md:1-50` — project description
- `pep-copt/README.md:50-150` — 4-prompt methodology
- `pep-copt/README.md:150-300` — 24-image results table
- `pep-copt/README.md:300-500` — results continued + match contract
- `pep-copt/prove-optimized-harness.sh:1-50` — harness start
- `pep-copt/prove-optimized-harness.sh:50-150` — harness body
- `pep-copt/prove-optimized-harness.sh:150-300` — harness end
- `pep-copt/prompts/create-reference.md:1-50` — reference spec start
- `pep-copt/prompts/create-reference.md:50-150` — reference spec body
- `pep-copt/prompts/create-optimized.md:1-50` — optimization spec start
- `pep-copt/prompts/create-optimized.md:50-150` — 4 candidate kinds
- `pep-copt/prompts/create-optimized.md:150-300` — exit criteria + plateau guidance
- `pep-copt/prompts/create-optimized-test-harness.md:1-50` — harness spec start
- `pep-copt/prompts/create-optimized-test-harness.md:50-150` — harness spec body
- `pep-copt/prompts/create-visualizer.md:1-50` — visualizer spec start
- `pep-copt/prompts/create-visualizer.md:50-150` — visualizer spec body
- `pep-copt/Makefile.optimized:1-50` — build config start
- `pep-copt/Makefile.optimized:50-100` — build config body
- `pep-copt/viz/contact_sheet.c:1-50` — visualizer source start
- `pep-copt/viz/contact_sheet.c:50-200` — visualizer source body
- `pep-copt/` (full repo at main) — 5 commits + README + OPTIMIZATION-LOG + 4 prompts + harness
- `pep-copt/commits/` — the 5 commit history (the v3 cluster does not cite specific SHAs)
- `pep-copt/.gitignore` — the gitignore (the v3 cluster does not cite specific contents)
- `pep-copt/OPTIMIZATION-LOG.md` (root) — the v3 cluster does not cite a root-level log; the log is in `src-optimized/`
- `intent_dsl_survey_20260612` — the survey (relevant for the gap note on intent-DSL)
- `superpowers_review_20260619` — the superpowers review (relevant for the gap note on process parallel)
**Decision candidate:** NEW Candidate 26 (LOW). "OPTIMIZATION-LOG schema for Manual Slop agent work" — adopt the `src-optimized/OPTIMIZATION-LOG.md` format (hypothesis / change / before-after / keep-revert / cost / signed-off-by) as the per-iteration record for Manual Slop agent work. See `decisions.md` Candidate 26.
**Cross-refs:** §3 Hooks (`prove-optimized-harness.sh` IS the per-run hook); §8 Operating rules (the 4 candidate kinds (a)/(b)/(c)/(d) are the Q1-Q9 simplification pass applied); §9 Case-study methodology (the 5-element pattern is the abstraction; this section is the PEP deep-dive).
**Pattern history:** NEW. v2.3 had no case-study repos. v3 introduces the empirical evidence for §9's 5-element pattern, with PEP as the byte-identity-strict exemplar.
## §11 Collisions case study
**Source:** `macton/differentiable-collisions-optc` at `main` (5 commits); `README.md` (full); `src-optimized/OPTIMIZATION-LOG.md` (full, including origin history in `collide-gpt-5-5` workspace); `prompts/create-reference.md` (full); `prompts/create-optimized-test-harness.md` (full); `prompts/create-optimized.md` (full, per §9); `prompts/create-visualizer.md` (full); `prove-optimized-harness.sh` (full, per §3).