blog: add SVG diagrams to pi-flow-native-brain (gate pipeline + deletion bar), real table; build.js raw-HTML passthrough

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-03 02:06:21 +01:00
parent 3868be2f3a
commit a43dbf71a5
3 changed files with 149 additions and 26 deletions
+74 -14
@@ -33,16 +33,50 @@ The core pieces:
- **Oracle-backed gates.** The `verify_build` tool (`.pi/extensions/tinqs-verify.ts`) is the canonical gate. It compiles the game and sim, runs tests, and returns a structured PASS/FAIL verdict with file:line errors. Agents route through it; the gate decides whether to proceed.
- **Agent-loop-decision Reflexion.** Instead of a fixed two-phase TDAID loop, agents self-reflect on build failures. The flow engine gives them the failure report; they decide whether to fix and retry or escalate.
- **Role-split agents.** Build-verifier (G1), test-runner (G2), and vision-QA (G3) are separate sub-agents, each with their own toolset and context, composed by the flow.
- **Role-split agents.** G1 build, G2 tests, G3 behaviour (drives the live game), G4 feel (measured game-feel) and G5 visual (animation) are separate sub-agents, each with its own toolset and context, composed by the flow.
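To make the "structured PASS/FAIL verdict with file:line errors" concrete, here is a minimal TypeScript sketch of what such a verdict could look like. The `GateError` and `GateVerdict` shapes and the `parseErrors` and `toVerdict` helpers are hypothetical names chosen for illustration; the actual return format of `verify_build` is not shown in this post, and the `file:line: message` log format is an assumption.

```typescript
// Hypothetical shapes: the post only states that the gate returns a
// PASS/FAIL verdict with file:line errors, not how it is structured.
interface GateError {
  file: string;
  line: number;
  message: string;
}

interface GateVerdict {
  gate: string;
  status: "PASS" | "FAIL";
  errors: GateError[];
}

// Extract compiler-style diagnostics such as "src/sim/world.ts:42: Type mismatch".
// Lines that do not match the assumed "file:line: message" format are ignored.
function parseErrors(output: string): GateError[] {
  const pattern = /^(.+?):(\d+):\s*(.+)$/;
  const errors: GateError[] = [];
  for (const raw of output.split("\n")) {
    const match = pattern.exec(raw.trim());
    if (match) {
      errors.push({ file: match[1], line: Number(match[2]), message: match[3] });
    }
  }
  return errors;
}

// A gate passes only when the process exits cleanly and no diagnostics were found.
function toVerdict(gate: string, exitCode: number, output: string): GateVerdict {
  const errors = parseErrors(output);
  const status = exitCode === 0 && errors.length === 0 ? "PASS" : "FAIL";
  return { gate, status, errors };
}
```

Returning data rather than free text is what lets an agent decide on the next step: it can read `status` to branch and hand `errors` back to the implementing agent as evidence.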
The result is a pipeline that flows naturally:
The result is a pipeline that flows naturally — a plan, an implementation, then a ladder of oracle-backed gates:
```
context → build → build-gate → (pass? → tests → tests-gate → vision)
                            ↘ (fail? → report)
```
<!--raw-->
<figure style="margin:28px 0;">
<svg viewBox="0 0 920 350" role="img" aria-label="The verify-heavy flow: context, plan, implement, five gates, a Reflexion loop, and one judge" style="width:100%;height:auto;display:block;background:#0a0e14;border:1px solid #2a3340;border-radius:12px;font-family:'IBM Plex Sans',system-ui,sans-serif;">
<defs>
<marker id="ah" markerWidth="10" markerHeight="10" refX="7" refY="3.2" orient="auto"><path d="M0,0 L7,3.2 L0,6.4 Z" fill="#5b6b7d"/></marker>
<marker id="ahA" markerWidth="10" markerHeight="10" refX="7" refY="3.2" orient="auto"><path d="M0,0 L7,3.2 L0,6.4 Z" fill="#f59e0b"/></marker>
</defs>
<rect x="40" y="40" width="140" height="46" rx="9" fill="#121821" stroke="#2a3340"/>
<text x="110" y="68" text-anchor="middle" fill="#cdd7e2" font-size="15">Context</text>
<rect x="210" y="40" width="140" height="46" rx="9" fill="#121821" stroke="#2a3340"/>
<text x="280" y="68" text-anchor="middle" fill="#cdd7e2" font-size="15">Plan</text>
<rect x="400" y="40" width="150" height="46" rx="9" fill="#15202e" stroke="#3a4656"/>
<text x="475" y="68" text-anchor="middle" fill="#e6edf3" font-size="15">Implement</text>
<line x1="180" y1="63" x2="206" y2="63" stroke="#5b6b7d" stroke-width="1.6" marker-end="url(#ah)"/>
<line x1="350" y1="63" x2="396" y2="63" stroke="#5b6b7d" stroke-width="1.6" marker-end="url(#ah)"/>
<rect x="40" y="150" width="840" height="82" rx="12" fill="#0c1119" stroke="#2a3340"/>
<text x="56" y="171" fill="#6b7a8d" font-size="11" letter-spacing="1.4">VERIFY-HEAVY GATES — most compute is spent checking, not writing</text>
<rect x="56" y="180" width="148" height="42" rx="8" fill="#10141c" stroke="#38bdf8" stroke-opacity="0.55"/>
<text x="130" y="206" text-anchor="middle" fill="#38bdf8" font-size="13.5">G1 · Build</text>
<rect x="222" y="180" width="148" height="42" rx="8" fill="#10141c" stroke="#34d399" stroke-opacity="0.55"/>
<text x="296" y="206" text-anchor="middle" fill="#9fe6c0" font-size="13.5">G2 · Tests</text>
<rect x="388" y="180" width="148" height="42" rx="8" fill="#10141c" stroke="#a855f7" stroke-opacity="0.55"/>
<text x="462" y="206" text-anchor="middle" fill="#c4a0f7" font-size="13.5">G3 · Behaviour</text>
<rect x="554" y="180" width="148" height="42" rx="8" fill="#10141c" stroke="#f59e0b" stroke-opacity="0.55"/>
<text x="628" y="206" text-anchor="middle" fill="#f5b44b" font-size="13.5">G4 · Feel</text>
<rect x="720" y="180" width="148" height="42" rx="8" fill="#10141c" stroke="#c9935a" stroke-opacity="0.55"/>
<text x="794" y="206" text-anchor="middle" fill="#d9ac7b" font-size="13.5">G5 · Visual</text>
<line x1="475" y1="86" x2="475" y2="148" stroke="#5b6b7d" stroke-width="1.6" marker-end="url(#ah)"/>
<line x1="460" y1="232" x2="460" y2="276" stroke="#5b6b7d" stroke-width="1.6" marker-end="url(#ah)"/>
<text x="472" y="258" fill="#6b7a8d" font-size="11">all green &#8658; done&#8195;&#183;&#8195;any fail &#8658; report</text>
<rect x="380" y="278" width="160" height="46" rx="9" fill="#1b1505" stroke="#c9935a"/>
<text x="460" y="306" text-anchor="middle" fill="#f3d6a0" font-size="15">Judge &#8212; honest verdict</text>
<path d="M820,150 C 908,96 716,50 556,61" fill="none" stroke="#f59e0b" stroke-width="1.8" stroke-dasharray="6 5" marker-end="url(#ahA)"/>
<text x="694" y="96" fill="#f59e0b" font-size="12.5">Reflexion &#183; fix &amp; retry &#8804; 3</text>
</svg>
<figcaption style="color:#9aa7b4;font-size:0.85rem;margin-top:8px;">A real in-game failure loops back to <em>implement</em> with the gate evidence (bounded to three tries); anything green — or skipped because no live instance is running — falls through to a single honest judge.</figcaption>
</figure>
<!--/raw-->
Critically, the flow is not fixed. Agents can add gates, reorder steps, or branch on conditions. The flow engine handles orchestration; the agents handle decisions.
It started as three gates — build, test, vision. Gates are cheap to add, so it grew: a feature now also passes a live-game **behaviour** probe and a measured **feel** check before the judge signs off. Critically, the flow is not fixed. Agents can add gates, reorder steps, or branch on conditions. The flow engine handles orchestration; the agents handle decisions.
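The gate ladder and the bounded Reflexion loop can be sketched in a few lines of TypeScript. This is an illustration under stated assumptions, not the pi-flows API: `Gate`, `GateResult`, `runFlow` and `MAX_ATTEMPTS` are hypothetical names, the gates run synchronously for brevity, and the bound is read as three implement attempts in total, which is one interpretation of the diagram's "fix & retry ≤ 3".

```typescript
type Status = "PASS" | "FAIL" | "SKIP";

interface GateResult {
  gate: string;
  status: Status;
  evidence: string;
}

// Hypothetical gate signature: each gate is a function returning its result.
type Gate = () => GateResult;

interface FlowOutcome {
  verdict: "done" | "report";
  attempts: number;
  results: GateResult[];
}

// Assumption: the bound counts implement attempts in total.
const MAX_ATTEMPTS = 3;

function runFlow(
  gates: Gate[],
  implement: (evidence: string | null) => void,
): FlowOutcome {
  let evidence: string | null = null;
  let results: GateResult[] = [];

  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    // The implementing agent sees the previous failure, if there was one.
    implement(evidence);

    results = [];
    let failure: GateResult | undefined;
    for (const gate of gates) {
      const result = gate();
      results.push(result);
      // Stop at the first failing gate; SKIP falls through like PASS.
      if (result.status === "FAIL") {
        failure = result;
        break;
      }
    }

    if (!failure) {
      return { verdict: "done", attempts: attempt, results };
    }
    evidence = `${failure.gate}: ${failure.evidence}`;
  }

  // Attempts exhausted: hand the last results to the judge as a report.
  return { verdict: "report", attempts: MAX_ATTEMPTS, results };
}
```

Because the gates are just an ordered array, adding a new gate or reordering the ladder is a one-line change in this sketch, and the retry policy stays in one place.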
## What We Deleted
@@ -57,6 +91,21 @@ The commit removes 1,050 lines across 15 files:
- `events.ts` (47 lines) — inter-process event bus
- Plus tests, examples, and documentation
<!--raw-->
<figure style="margin:24px 0;">
<svg viewBox="0 0 920 180" role="img" aria-label="Lines of code: 1,050 deleted versus about 300 kept" style="width:100%;height:auto;display:block;background:#0a0e14;border:1px solid #2a3340;border-radius:12px;font-family:'IBM Plex Sans',system-ui,sans-serif;">
<text x="40" y="34" fill="#9aa7b4" font-size="13">Net change: <tspan fill="#f59e0b" font-weight="600">&#8722;750 lines</tspan>, &#43; a composable pipeline</text>
<text x="40" y="76" fill="#f0816a" font-size="13">Deleted</text>
<rect x="150" y="58" width="730" height="30" rx="6" fill="#2a1416" stroke="#f0816a" stroke-opacity="0.6"/>
<text x="868" y="78" text-anchor="end" fill="#f3b4a8" font-size="12.5">supervisor/ &#8212; 1,050 lines &#183; 15 files</text>
<text x="40" y="136" fill="#34d399" font-size="13">Kept</text>
<rect x="150" y="118" width="209" height="30" rx="6" fill="#0f2a22" stroke="#34d399" stroke-opacity="0.6"/>
<text x="369" y="138" fill="#9fe6c0" font-size="12.5">verify_build &#8212; ~300 lines &#183; 1 oracle</text>
</svg>
<figcaption style="color:#9aa7b4;font-size:0.85rem;margin-top:8px;">The whole orchestration loop was deleted; only the build oracle survived — and it became the gate that powers the flow.</figcaption>
</figure>
<!--/raw-->
None of this was bad code. It was just the wrong layer. Flows gives us all of this — orchestration, state, gates, retry policy, event routing — as a framework primitive. We were maintaining a parallel implementation of something the framework already provided.
The durable asset we kept: `verify_build`, the build oracle. It's now reused as the gate tool that powers the flow pipeline.
@@ -83,13 +132,24 @@ Verified live: `game-check` now routes `context → build → build-gate(pass)
## The Stack Today
| Layer | What | How |
|-------|------|-----|
| **Flow engine** | pi-flows orchestrator | Composes agents, gates, and decision points |
| **Gates** | verify_build oracle | Compiles, tests, returns PASS/FAIL with errors |
| **Sub-agents** | G1 (build), G2 (test), G3 (vision) | Role-split, each with its own toolset |
| **Decision** | Agent-loop Reflexion | Self-reflect on failures, retry or escalate |
| **Visualization** | FlowDashboard | Real-time pipeline state |
<!--raw-->
<table style="width:100%;border-collapse:collapse;margin:18px 0;font-size:0.92rem;">
<thead>
<tr style="text-align:left;border-bottom:1px solid #2a3340;">
<th style="padding:10px 12px;color:#c9935a;font-weight:600;">Layer</th>
<th style="padding:10px 12px;color:#c9935a;font-weight:600;">What</th>
<th style="padding:10px 12px;color:#c9935a;font-weight:600;">How</th>
</tr>
</thead>
<tbody>
<tr style="border-bottom:1px solid #1c2230;"><td style="padding:9px 12px;color:#e6edf3;vertical-align:top;"><strong style="color:#f59e0b;">Flow engine</strong></td><td style="padding:9px 12px;color:#cdd7e2;vertical-align:top;">pi-flows orchestrator</td><td style="padding:9px 12px;color:#9aa7b4;vertical-align:top;">Composes agents, gates and decision points</td></tr>
<tr style="border-bottom:1px solid #1c2230;"><td style="padding:9px 12px;color:#e6edf3;vertical-align:top;"><strong style="color:#f59e0b;">Gates</strong></td><td style="padding:9px 12px;color:#cdd7e2;vertical-align:top;">verify_build oracle</td><td style="padding:9px 12px;color:#9aa7b4;vertical-align:top;">Compiles, tests, returns PASS/FAIL with file:line errors</td></tr>
<tr style="border-bottom:1px solid #1c2230;"><td style="padding:9px 12px;color:#e6edf3;vertical-align:top;"><strong style="color:#f59e0b;">Sub-agents</strong></td><td style="padding:9px 12px;color:#cdd7e2;vertical-align:top;">G1 build &#183; G2 tests &#183; G3 behaviour &#183; G4 feel &#183; G5 visual</td><td style="padding:9px 12px;color:#9aa7b4;vertical-align:top;">Role-split, each with its own toolset</td></tr>
<tr style="border-bottom:1px solid #1c2230;"><td style="padding:9px 12px;color:#e6edf3;vertical-align:top;"><strong style="color:#f59e0b;">Decision</strong></td><td style="padding:9px 12px;color:#cdd7e2;vertical-align:top;">Agent-loop Reflexion</td><td style="padding:9px 12px;color:#9aa7b4;vertical-align:top;">Self-reflect on failures, retry (&#8804;3) or escalate</td></tr>
<tr><td style="padding:9px 12px;color:#e6edf3;vertical-align:top;"><strong style="color:#f59e0b;">Visualization</strong></td><td style="padding:9px 12px;color:#cdd7e2;vertical-align:top;">FlowDashboard</td><td style="padding:9px 12px;color:#9aa7b4;vertical-align:top;">Real-time pipeline state at localhost:33634</td></tr>
</tbody>
</table>
<!--/raw-->
---