docs(ascii-dsl): add §8 Screenshot-to-ASCII Reverse Engineering (opt-in extension)
Documents the MiniMax_understand_image workflow for converting screenshots to ASCII Layout Maps. Covers: when to use it, the 6-step workflow, the proportional-measurement prompt pattern, faithful rendering rules (width ratios, empty space, floating window position, color annotations, tab bars, table rows), multi-screenshot composition, and limitations.
@@ -271,3 +271,72 @@ Once a design contract is locked and implemented, it must pass a three-tiered ve
1. **AST Integrity:** Every docstring modification must pass `py_check_syntax` to ensure it doesn't break Python parsing.
2. **Regression Check:** The test runner (`pytest tests/`) must be run to verify zero side effects. Docstring additions must never alter execution logic.
3. **Puppeteer Visual Audit:** In visual simulation tests, the captured Dear ImGui layout boundaries and widget visibility flags are compared against the rows, columns, and conditional states defined in the ASCII design contract.

---

## 8. Screenshot-to-ASCII Reverse Engineering (Opt-In Extension)

When a running GUI state needs to be captured as an ASCII Layout Map (for bug reports, regression documentation, or Tier 2 handoff), the `MiniMax_understand_image` MCP tool can reverse-engineer a screenshot into the DSL. This is an **opt-in** workflow; the standard DSL (§1-§7) remains the forward-design path (text-first, code-second). This section covers the reverse path (screenshot-first, text-second).

### 8.1 When to Use This Extension

- **Bug reports**: the user sees a broken layout and screenshots it; the agent converts it to ASCII for the report
- **Regression documentation**: before/after screenshots are converted to ASCII pairs to document what changed
- **Tier 2 handoff**: the user provides a screenshot of the current working state; Tier 1 converts it to ASCII so Tier 2 can see the target layout without running the GUI
- **Layout audit**: the user provides a screenshot of a misbehaving panel; the agent converts it to ASCII to reason about the structure

### 8.2 The Workflow

```
Step 1: User provides screenshot file path(s)
Step 2: Agent calls MiniMax_understand_image with a proportional-measurement prompt
Step 3: Agent converts the structured description into an ASCII Layout Map
Step 4: User reviews + corrects proportions ("the left panel is wider", "the Debug window is top-right not center")
Step 5: Agent revises until the ASCII faithfully represents the screenshot
Step 6: The final ASCII map is committed to docs or a track spec
```
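
Steps 2 through 5 form a review loop, which can be sketched in Python roughly as follows. This is an illustrative sketch only: `reverse_engineer`, `describe_image`, `draft_ascii`, `get_feedback`, and `max_rounds` are hypothetical names, and the real `MiniMax_understand_image` call signature is not specified in this document, so the tool call is passed in as a plain callable.

```python
from typing import Callable, Optional


def reverse_engineer(
    screenshot_path: str,
    prompt: str,
    describe_image: Callable[[str, str], str],            # hypothetical wrapper around the MCP tool
    draft_ascii: Callable[[str, Optional[str]], str],     # hypothetical: description + correction -> ASCII
    get_feedback: Callable[[str], Optional[str]],         # hypothetical: returns None when the user accepts
    max_rounds: int = 5,
) -> str:
    """Run Steps 2-5 for one screenshot and return the accepted ASCII map."""
    # Step 2: ask the vision tool for a structured, proportional description.
    description = describe_image(screenshot_path, prompt)
    # Step 3: produce the first draft with no corrections applied.
    ascii_map = draft_ascii(description, None)
    for _ in range(max_rounds):
        # Step 4: the user reviews the draft; None means it is accepted.
        correction = get_feedback(ascii_map)
        if correction is None:
            break
        # Step 5: revise the draft using the user's correction.
        ascii_map = draft_ascii(description, correction)
    return ascii_map
```

Passing the tool call and the review step in as callables keeps the loop testable with stubs, without running the GUI or the MCP server.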

### 8.3 The Proportional-Measurement Prompt

The first `MiniMax_understand_image` call must ask for **precise proportional measurements**, not just a list of elements. The prompt should request:

1. Panel width percentages (left panel X%, right panel Y%)
2. Vertical order and height proportions of each section within each panel
3. Exact position of floating/overlay windows (which panel, which corner, relative size)
4. Exact text labels, button labels, tab names, checkbox states
5. Color annotations for status text (red for errors, green for success, blue for info)
6. Empty space proportions (how much of each panel is blank)

Without proportional measurements, the resulting ASCII will be "scrunched": elements are compressed into too-small areas, losing the visual hierarchy that makes the layout map useful.
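
One way to keep the six requests from being dropped is to assemble the prompt from a fixed list. The sketch below assumes the prompt is plain text; the header sentence and the names `MEASUREMENT_REQUESTS` and `build_measurement_prompt` are illustrative, not part of the tool or the DSL.

```python
# The six measurement requests listed above, in order.
MEASUREMENT_REQUESTS = [
    "Panel width percentages (left panel X%, right panel Y%).",
    "Vertical order and height proportions of each section within each panel.",
    "Exact position of floating/overlay windows (which panel, which corner, relative size).",
    "Exact text labels, button labels, tab names, and checkbox states.",
    "Color annotations for status text (red for errors, green for success, blue for info).",
    "Empty space proportions (how much of each panel is blank).",
]


def build_measurement_prompt(requests=MEASUREMENT_REQUESTS) -> str:
    """Return a numbered prompt asking for proportions rather than an element list."""
    # Assumed wording for the opening instruction; adjust to taste.
    header = (
        "Describe this GUI screenshot with precise proportional measurements, "
        "not just a list of elements. Report:"
    )
    numbered = [f"{i}. {item}" for i, item in enumerate(requests, start=1)]
    return "\n".join([header, *numbered])
```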

### 8.4 Faithful Rendering Rules

When converting the structured description to ASCII:

- **Width ratios must be preserved.** If the left panel is 25% and the right is 75%, the ASCII must show the left panel as roughly 1/4 the total width and the right as 3/4. Do not make them 50/50.
- **Empty space must be represented.** If 80% of a panel is blank, the ASCII must show that blank space as empty lines within the panel border. Do not compress it away.
- **Floating windows must be positioned correctly.** If the Debug window is top-right of the Discussion Hub, it must appear in the top-right area of the right panel in the ASCII, not centered or at the bottom.
- **Color annotations use inline markers.** Red text: `1 failed` with a note `^^^ in red`. Green text: `OUT request` with a corresponding note (e.g., `^^^ in green`). Blue text: `tool_call` with a corresponding note (e.g., `^^^ in blue`).
- **Tab bars list all tabs.** Even inactive tabs must appear so the reader can see the full navigation surface.
- **Tables show all visible rows.** The telemetry table with 4 data rows must show all 4 rows, not just 1-2.
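
The first two rules are mostly arithmetic, and a small sketch may make them concrete. The code below assumes a simple `+`, `-`, `|` box style, which may differ from the border characters the DSL actually uses; `panel_widths` and `render_panels` are hypothetical helpers.

```python
def panel_widths(percentages, total_width):
    """Split total_width interior columns by percentage, keeping the sum exact."""
    widths = [max(1, round(total_width * p / 100)) for p in percentages]
    # Let the last panel absorb any rounding drift so the columns still add up.
    widths[-1] += total_width - sum(widths)
    return widths


def render_panels(panels, total_width=40, height=6):
    """Render side-by-side panels.

    panels: list of (title, percent, content_lines) tuples.
    Rows without content are emitted as blank rows, never dropped.
    """
    widths = panel_widths([pct for _, pct, _ in panels], total_width)
    border = "+" + "+".join("-" * w for w in widths) + "+"
    rows = [border]
    for row in range(height):
        cells = []
        for (title, _, lines), w in zip(panels, widths):
            body = [title, *lines]
            text = body[row] if row < len(body) else ""  # blank space is preserved
            cells.append(text[:w].ljust(w))
        rows.append("|" + "|".join(cells) + "|")
    rows.append(border)
    return "\n".join(rows)


if __name__ == "__main__":
    print(render_panels(
        [("Nav", 25, ["Files"]), ("Discussion Hub", 75, ["1 failed"])],
        total_width=40,
        height=5,
    ))
```

With a 25/75 split over 40 interior columns, the left panel gets 10 columns and the right gets 30, and the three rows without content remain visible as blank rows inside the border.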

### 8.5 Multi-Screenshot Composition

When the user provides multiple screenshots (e.g., different panel configurations, before/after states), each gets its own ASCII Layout Map. The maps are presented sequentially with a header line identifying the screenshot source:

```
**Screenshot 1** (timestamp) — Panel A + Panel B:
<ASCII map>

**Screenshot 2** (timestamp) — Panel A + Panel C + Debug overlay:
<ASCII map>
```

Do not attempt to merge multiple screenshots into a single composite ASCII map. Each screenshot is its own layout state.
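
The sequential presentation can be sketched as a small formatter. `compose_maps` and its tuple layout are illustrative assumptions; the header line follows the template shown above, and each map is kept separate rather than merged.

```python
def compose_maps(entries):
    """Join per-screenshot ASCII maps in order, one header line per screenshot.

    entries: list of (timestamp, panels_description, ascii_map) tuples.
    """
    blocks = []
    for index, (timestamp, panels, ascii_map) in enumerate(entries, start=1):
        # \u2014 is the dash used in the header template above.
        header = f"**Screenshot {index}** ({timestamp}) \u2014 {panels}:"
        blocks.append(f"{header}\n{ascii_map}")
    # A blank line separates layout states; maps are never combined.
    return "\n\n".join(blocks)
```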

### 8.6 Limitations

- The `MiniMax_understand_image` tool cannot read images from the clipboard directly; the user must provide a file path (e.g., a ShareX screenshot path).
- The proportional measurements are estimates, not pixel-perfect. The user must review and correct them.
- Complex layouts with many small elements may lose resolution in the ASCII. Use the Feature Zooming technique (§4.1) to decompose dense areas into zoomed micro-layouts.
- Color information is lost in ASCII. Use inline text annotations (`^^^ in red`) to preserve critical color signals.
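
As a rough illustration of the inline annotation, the helper below places a `^^^ in <color>` note under the start of the colored text. `annotate_color` is a hypothetical name, and aligning the note under the target text is an assumption about how the marker is laid out.

```python
def annotate_color(line: str, target: str, color: str) -> str:
    """Return the line followed by a caret note under the first occurrence of target."""
    start = line.find(target)
    if start < 0:
        raise ValueError(f"{target!r} not found in line")
    # Indent the marker so the carets sit under the start of the colored text.
    note = " " * start + f"^^^ in {color}"
    return f"{line}\n{note}"
```

For example, `annotate_color("Tests: 1 failed", "1 failed", "red")` yields the original line plus a second line with the marker indented seven spaces.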