<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">

<title>How Pi Agents Build, Test, and Ship Game Code with Oracle-Backed Flows — Tinqs Blog</title>
<meta name="description" content="We use Pi flows with oracle-backed gates to make agents compile, test, drive the live game, measure feel, fix CI failures, and ship green PRs — all autonomously.">
<meta name="robots" content="index, follow">
<link rel="canonical" href="https://www.tinqs.com/blog/pi-flow-native-brain">

<meta property="og:type" content="article">
<meta property="og:url" content="https://www.tinqs.com/blog/pi-flow-native-brain">
<meta property="og:title" content="How Pi Agents Build, Test, and Ship Game Code with Oracle-Backed Flows">
<meta property="og:description" content="Pi flows + oracle-backed gates: agents that compile, test, drive the game, measure feel, fix CI, and ship green PRs.">
<meta property="og:image" content="https://www.tinqs.com/img/og-cover.jpg">

<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:title" content="How Pi Agents Build, Test, and Ship Game Code with Oracle-Backed Flows">
<meta name="twitter:description" content="Pi flows + oracle-backed gates: agents that compile, test, drive the game, measure feel, fix CI, and ship green PRs.">
<meta name="twitter:image" content="https://www.tinqs.com/img/og-cover.jpg">

<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Space+Grotesk:wght@400;500;600;700&family=JetBrains+Mono:wght@400;500&family=Inter:wght@400;500;600;700&display=swap" rel="stylesheet">

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "How Pi Agents Build, Test, and Ship Game Code with Oracle-Backed Flows",
  "datePublished": "2026-06-04",
  "author": {
    "@type": "Person",
    "name": "Ozan Bozkurt"
  },
  "publisher": {
    "@type": "Organization",
    "name": "Tinqs Limited",
    "url": "https://www.tinqs.com"
  },
"description": "We use Pi flows with oracle-backed gates to make agents compile, test, drive the live game, measure feel, fix CI failures, and ship green PRs — all autonomously."
}
</script>

<style>
/* ── Self-contained post styles (Studio provides site chrome) ── */

:root {
  --c-bg: #0B0C0E;
  --c-bg-raised: #15171A;
  --c-fg: #ECEEF1;
  --c-muted: #8A95A3;
  --c-lime: #B6FF3C;
  --c-violet: #7C5CFF;
  --c-border: rgba(255,255,255,.07);
  --c-border-strong: rgba(255,255,255,.12);
}

*, *::before, *::after { box-sizing: border-box; }

html { background: var(--c-bg); }

body {
  margin: 0;
  padding: 0;
  background: var(--c-bg);
  color: var(--c-fg);
  font-family: 'Inter', system-ui, -apple-system, sans-serif;
  font-size: 16px;
  line-height: 1.6;
  -webkit-font-smoothing: antialiased;
}

/* ── Post container ── */
.post {
  background: var(--c-bg);
  max-width: 720px;
  margin: 0 auto;
  padding: 48px 24px 60px;
}

/* ── Back link ── */
.post__back {
  color: var(--c-muted);
  text-decoration: none;
  font-size: 0.875rem;
  display: inline-block;
  margin-bottom: 24px;
  transition: color 0.15s;
}
.post__back:hover { color: var(--c-lime); }

/* ── Gradient title — lime → violet ── */
.post__title {
  font-family: 'Space Grotesk', system-ui, -apple-system, sans-serif;
  background: linear-gradient(90deg, var(--c-lime), var(--c-violet));
  -webkit-background-clip: text;
  background-clip: text;
  color: transparent;
  font-weight: 700;
  font-size: 2.2rem;
  line-height: 1.2;
  margin: 0 0 16px;
}

/* ── Date pill ── */
.post__date {
  display: inline-block;
  font-family: 'JetBrains Mono', ui-monospace, 'SF Mono', Consolas, monospace;
  font-size: 0.72rem;
  letter-spacing: 0.18em;
  text-transform: uppercase;
  color: var(--c-muted);
  border: 1px solid var(--c-border);
  border-radius: 999px;
  padding: 4px 14px;
  margin-bottom: 16px;
}

/* ── Lead ── */
.post__lead {
  color: var(--c-muted);
  font-size: 1.08rem;
  line-height: 1.7;
}

/* ── Body ── */
.post__body { font-size: 1rem; line-height: 1.7; }

.post__body p { margin: 14px 0; }

.post__body h2 {
  font-family: 'Space Grotesk', system-ui, -apple-system, sans-serif;
  font-weight: 600;
  font-size: 1.6rem;
  margin: 54px 0 6px;
  padding-left: 16px;
  border-left: 4px solid var(--c-lime);
  line-height: 1.3;
}

.post__body h3 {
  font-family: 'Space Grotesk', system-ui, -apple-system, sans-serif;
  font-weight: 500;
  color: var(--c-violet);
  font-size: 1.15rem;
  margin: 30px 0 4px;
}

.post__body h4, .post__body h5, .post__body h6 {
  margin: 20px 0 4px;
}

/* ── Inline code ── */
.post__body code {
  font-family: 'JetBrains Mono', ui-monospace, 'SF Mono', Consolas, monospace;
  font-size: 0.84em;
  background: var(--c-bg-raised);
  color: var(--c-lime);
  padding: 2px 6px;
  border-radius: 4px;
  border: 1px solid var(--c-border);
}

/* ── Code blocks ── */
.post__body pre {
  background: var(--c-bg);
  border: 1px solid var(--c-border);
  border-radius: 8px;
  padding: 16px 18px;
  overflow-x: auto;
  margin: 14px 0;
  font-family: 'JetBrains Mono', ui-monospace, 'SF Mono', Consolas, monospace;
  font-size: 0.83rem;
  line-height: 1.55;
  color: var(--c-fg);
}

.post__body pre code {
  background: transparent;
  padding: 0;
  border: none;
  font-size: inherit;
  color: inherit;
  border-radius: 0;
}

/* ── Blockquote ── */
.post__body blockquote {
  background: rgba(124, 92, 255, 0.06);
  border: 1px solid rgba(124, 92, 255, 0.15);
  border-left: 4px solid var(--c-violet);
  border-radius: 0 8px 8px 0;
  padding: 16px 18px;
  margin: 18px 0;
  color: var(--c-fg);
  font-size: 0.94rem;
}

/* ── Links ── */
.post__body a { color: var(--c-lime); text-decoration: underline; text-underline-offset: 3px; }
.post__body a:hover { color: var(--c-violet); }

/* ── Strong ── */
.post__body strong { color: var(--c-lime); font-weight: 600; }

/* ── HR ── */
.post__body hr {
  border: none;
  border-top: 1px solid var(--c-border);
  margin: 32px 0;
}

/* ── Figures ── */
.post__body figure { margin: 20px 0; }
.post__body figure img {
  max-width: 100%;
  border-radius: 12px;
  border: 1px solid var(--c-border);
}

.post__body figcaption {
  color: var(--c-muted);
  font-size: 0.85rem;
  margin-top: 6px;
}

/* ── Lists ── */
.post__body ul, .post__body ol { padding-left: 1.5em; margin: 10px 0; }
.post__body li { margin: 4px 0; }

/* ── Author ── */
.post__author {
  display: flex;
  align-items: center;
  gap: 14px;
  margin-top: 48px;
  padding-top: 24px;
  border-top: 1px solid var(--c-border);
}

.post__author-avatar {
  width: 48px;
  height: 48px;
  border-radius: 50%;
  background: var(--c-violet);
  color: #fff;
  display: flex;
  align-items: center;
  justify-content: center;
  font-weight: 700;
  font-size: 0.85rem;
  flex-shrink: 0;
}

.post__author-info {
  font-size: 0.85rem;
  color: var(--c-muted);
  line-height: 1.4;
}

.post__author-name {
  color: var(--c-fg);
  font-weight: 600;
}

/* ── Analogy callout box ── */
.post__body .callout {
  background: linear-gradient(135deg, rgba(124,92,255,0.06), rgba(182,255,60,0.04));
  border: 1px solid rgba(124,92,255,0.15);
  border-left: 4px solid var(--c-violet);
  border-radius: 0 8px 8px 0;
  padding: 18px 20px;
  margin: 22px 0;
}
.post__body .callout--amber {
  background: linear-gradient(135deg, rgba(182,255,60,0.06), rgba(124,92,255,0.04));
  border-color: rgba(182,255,60,0.15);
  border-left-color: var(--c-lime);
}
.post__body .callout--purple {
  background: linear-gradient(135deg, rgba(124,92,255,0.07), rgba(182,255,60,0.04));
  border-color: rgba(124,92,255,0.2);
  border-left-color: var(--c-violet);
}
.post__body .callout__kicker {
  font-family: 'JetBrains Mono', ui-monospace, 'SF Mono', Consolas, monospace;
  font-size: 0.7rem;
  letter-spacing: 0.18em;
  text-transform: uppercase;
  color: var(--c-violet);
  margin-bottom: 8px;
  display: block;
}
.post__body .callout--amber .callout__kicker { color: var(--c-lime); }
.post__body .callout--purple .callout__kicker { color: var(--c-violet); }
.post__body .callout p { margin: 6px 0 0; color: var(--c-fg); }
.post__body .callout p + p { margin-top: 10px; }

/* ── Gate badge pills ── */
.gate {
  display: inline-block;
  font-family: ui-monospace, 'SF Mono', 'Cascadia Code', Consolas, monospace;
  font-size: 0.75rem;
  font-weight: 600;
  padding: 3px 10px;
  border-radius: 5px;
  margin-right: 4px;
}
.gate--build { background: rgba(124,92,255,0.12); color: #7C5CFF; border: 1px solid rgba(124,92,255,0.3); }
.gate--test { background: rgba(182,255,60,0.10); color: #B6FF3C; border: 1px solid rgba(182,255,60,0.25); }
.gate--behave { background: rgba(124,92,255,0.12); color: #7C5CFF; border: 1px solid rgba(124,92,255,0.3); }
.gate--feel { background: rgba(182,255,60,0.10); color: #B6FF3C; border: 1px solid rgba(182,255,60,0.25); }
.gate--visual { background: rgba(124,92,255,0.10); color: #8A95A3; border: 1px solid rgba(124,92,255,0.2); }

/* ── Section divider accent ── */
.post__body hr { border-color: var(--c-border); margin: 36px 0; }
.post__body hr.accent {
  border: none;
  height: 2px;
  background: linear-gradient(90deg, transparent, var(--c-lime) 20%, var(--c-violet) 50%, var(--c-lime) 80%, transparent);
  margin: 40px 0;
}

/* ── Two-column kitchen comparison ── */
.kitchen-grid {
  display: grid;
  grid-template-columns: 1fr 1fr;
  gap: 16px;
  margin: 18px 0;
}
@media (max-width: 640px) { .kitchen-grid { grid-template-columns: 1fr; } }
.kitchen-col {
  background: var(--c-bg-raised);
  border: 1px solid var(--c-border);
  border-radius: 10px;
  padding: 16px 18px;
}
.kitchen-col__title {
  font-family: ui-monospace, 'SF Mono', 'Cascadia Code', Consolas, monospace;
  font-size: 0.68rem;
  letter-spacing: 0.16em;
  text-transform: uppercase;
  margin-bottom: 10px;
  display: block;
}
.kitchen-col__title--kitchen { color: var(--c-lime); }
.kitchen-col__title--reality { color: var(--c-violet); }
.kitchen-col p { font-size: 0.9rem; color: #8A95A3; margin: 4px 0; }
.kitchen-col p strong { color: var(--c-fg); }

/* ── Table styles ── */
.post__body table {
  width: 100%;
  border-collapse: collapse;
  margin: 18px 0;
  font-size: 0.92rem;
}
.post__body th {
  text-align: left;
  border-bottom: 1px solid var(--c-border);
  padding: 10px 12px;
  color: var(--c-lime);
  font-weight: 600;
}
.post__body td {
  padding: 9px 12px;
  border-bottom: 1px solid var(--c-border);
  vertical-align: top;
}
</style>
</head>
<body>

<article class="post">
<a href="/blog/" class="post__back">← All Posts</a>
<span class="post__date">4 June 2026</span>
<h1 class="post__title">How Pi Agents Build, Test, and Ship Game Code with Oracle-Backed Flows</h1>
<p class="post__lead">Think of a restaurant kitchen during dinner rush. The head chef doesn't cook every dish. She runs the pass — each plate gets inspected before it leaves. One cook handles sauces, another pastry, another the grill. The expediter calls orders, coordinates timing, makes sure table 4's mains don't arrive before table 2's starters. A dish comes back? It goes straight to the station that messed up, with a ticket explaining exactly what's wrong. That kitchen runs on flows. So does our game engine.</p>

<div class="post__body">

<div class="callout">
<span class="callout__kicker">The Kitchen ↔ Flows Analogy</span>
<p><strong>The kitchen</strong> = Pi (the agent harness). <strong>The recipe</strong> = a JavaScript flow (<code>.flow.mjs</code>). <strong>The line cooks</strong> = agents (each with a station and tools). <strong>The pass</strong> = the flow engine (routes finished work). <strong>The head chef's inspection</strong> = the five gates. <strong>The order ticket</strong> = a spawn task or <code>tinqs flow run</code>. <strong>"Send it back!"</strong> = the fix loop.</p>
</div>

<h2>What Happens When You Spawn a Flow</h2>
<p>You run <code>tinqs flow run game-feature --task 'add a double-jump with cooldown'</code> or click Run Flow on the dashboard. The ticket hits the kitchen. What follows is not one agent doing everything — it's a brigade running their stations.</p>
<figure style="margin:28px 0;">
<svg viewBox="0 0 920 350" role="img" aria-label="The verify-heavy flow: context, plan, implement, five gates, a Reflexion loop, and one judge" style="width:100%;height:auto;display:block;background:#0B0C0E;border:1px solid rgba(255,255,255,0.07);border-radius:12px;font-family:'JetBrains Mono',ui-monospace,monospace;">
<defs>
<marker id="ah" markerWidth="10" markerHeight="10" refX="7" refY="3.2" orient="auto"><path d="M0,0 L7,3.2 L0,6.4 Z" fill="#8A95A3"/></marker>
<marker id="ahA" markerWidth="10" markerHeight="10" refX="7" refY="3.2" orient="auto"><path d="M0,0 L7,3.2 L0,6.4 Z" fill="#B6FF3C"/></marker>
</defs>
<rect x="40" y="40" width="140" height="46" rx="9" fill="#15171A" stroke="rgba(255,255,255,0.07)"/>
<text x="110" y="68" text-anchor="middle" fill="#8A95A3" font-size="15">Context</text>
<rect x="210" y="40" width="140" height="46" rx="9" fill="#15171A" stroke="rgba(255,255,255,0.07)"/>
<text x="280" y="68" text-anchor="middle" fill="#8A95A3" font-size="15">Plan</text>
<rect x="400" y="40" width="150" height="46" rx="9" fill="#15171A" stroke="rgba(255,255,255,0.07)"/>
<text x="475" y="68" text-anchor="middle" fill="#ECEEF1" font-size="15">Implement</text>
<line x1="180" y1="63" x2="206" y2="63" stroke="#8A95A3" stroke-width="1.6" marker-end="url(#ah)"/>
<line x1="350" y1="63" x2="396" y2="63" stroke="#8A95A3" stroke-width="1.6" marker-end="url(#ah)"/>
<rect x="40" y="150" width="840" height="82" rx="12" fill="#15171A" stroke="rgba(255,255,255,0.07)"/>
<text x="56" y="171" fill="#8A95A3" font-size="11" letter-spacing="1.4">VERIFY-HEAVY GATES — most compute is spent checking, not writing</text>
<rect x="56" y="180" width="148" height="42" rx="8" fill="#15171A" stroke="#7C5CFF" stroke-opacity="0.55"/>
<text x="130" y="206" text-anchor="middle" fill="#7C5CFF" font-size="13.5">G1 · Build</text>
<rect x="222" y="180" width="148" height="42" rx="8" fill="#15171A" stroke="#B6FF3C" stroke-opacity="0.55"/>
<text x="296" y="206" text-anchor="middle" fill="#B6FF3C" font-size="13.5">G2 · Tests</text>
<rect x="388" y="180" width="148" height="42" rx="8" fill="#15171A" stroke="#7C5CFF" stroke-opacity="0.55"/>
<text x="462" y="206" text-anchor="middle" fill="#7C5CFF" font-size="13.5">G3 · Behaviour</text>
<rect x="554" y="180" width="148" height="42" rx="8" fill="#15171A" stroke="#B6FF3C" stroke-opacity="0.55"/>
<text x="628" y="206" text-anchor="middle" fill="#B6FF3C" font-size="13.5">G4 · Feel</text>
<rect x="720" y="180" width="148" height="42" rx="8" fill="#15171A" stroke="rgba(255,255,255,0.15)" stroke-opacity="0.55"/>
<text x="794" y="206" text-anchor="middle" fill="#8A95A3" font-size="13.5">G5 · Visual</text>
<line x1="475" y1="86" x2="475" y2="148" stroke="#8A95A3" stroke-width="1.6" marker-end="url(#ah)"/>
<line x1="460" y1="232" x2="460" y2="276" stroke="#8A95A3" stroke-width="1.6" marker-end="url(#ah)"/>
<text x="472" y="258" fill="#8A95A3" font-size="11">all green ⇒ done · any fail ⇒ report</text>
<rect x="380" y="278" width="160" height="46" rx="9" fill="#15171A" stroke="#7C5CFF"/>
<text x="460" y="306" text-anchor="middle" fill="#7C5CFF" font-size="15">Judge — honest verdict</text>
<path d="M820,150 C 908,96 716,50 556,61" fill="none" stroke="#B6FF3C" stroke-width="1.8" stroke-dasharray="6 5" marker-end="url(#ahA)"/>
<text x="694" y="96" fill="#B6FF3C" font-size="12.5">Reflexion · fix &amp; retry ≤ 3</text>
</svg>
<figcaption style="color:#8A95A3;font-size:0.85rem;margin-top:8px;">A real failure loops back to <em>implement</em> with gate evidence (bounded to three tries); anything green falls through to the judge.</figcaption>
</figure>
<h2>The Five Gates: What the Head Chef Checks</h2>
<p>In a kitchen, the head chef doesn't trust — she verifies. Every plate hits the pass and gets inspected. Our flows have the same instinct. Each gate is a sub-agent with one job, one tool, and absolute veto power.</p>

<div class="kitchen-grid">
<div class="kitchen-col">
<span class="kitchen-col__title kitchen-col__title--kitchen">In the Kitchen</span>
<p><strong>Check the base.</strong> Is the protein cooked through? If the chicken is raw, the whole plate stops here. Nothing else matters.</p>
</div>
<div class="kitchen-col">
<span class="kitchen-col__title kitchen-col__title--reality">In the Flow</span>
<p><span class="gate gate--build">G1 · Build</span> Runs <code>dotnet build</code>. PASS/FAIL with file:line errors. Won't compile? Nothing proceeds.</p>
</div>
<div class="kitchen-col">
<span class="kitchen-col__title kitchen-col__title--kitchen">In the Kitchen</span>
<p><strong>Taste the sauce.</strong> Seasoning right? Acid balanced? The dish might look perfect but taste flat.</p>
</div>
<div class="kitchen-col">
<span class="kitchen-col__title kitchen-col__title--reality">In the Flow</span>
<p><span class="gate gate--test">G2 · Tests</span> Runs <code>dotnet test</code> and parses which assertions broke. Code that compiles but gets the logic wrong is caught here.</p>
</div>
<div class="kitchen-col">
<span class="kitchen-col__title kitchen-col__title--kitchen">In the Kitchen</span>
<p><strong>Does it work?</strong> Pick it up. Does the sauce hold? Does the plating survive the walk to table 6?</p>
</div>
<div class="kitchen-col">
<span class="kitchen-col__title kitchen-col__title--reality">In the Flow</span>
<p><span class="gate gate--behave">G3 · Behaviour</span> Sends <code>{"jump":true}</code> to the LIVE game, then samples the player body 30 times at 50 ms intervals. Did the character actually jump? Did the double-jump fire? This is the ground-truth oracle, and it is what makes game dev fundamentally different from web dev.</p>
</div>
<div class="kitchen-col">
<span class="kitchen-col__title kitchen-col__title--kitchen">In the Kitchen</span>
<p><strong>How does it feel?</strong> The steak is cooked but chewy. The sauce is seasoned but gloopy. Edible ≠ good.</p>
</div>
<div class="kitchen-col">
<span class="kitchen-col__title kitchen-col__title--reality">In the Flow</span>
<p><span class="gate gate--feel">G4 · Feel</span> Measures apex height, airtime, liftoff latency, rise/fall asymmetry, landing settle. Numeric thresholds. A jump that works but takes 400ms to lift off fails. Behaviour says it happened. Feel says it felt good.</p>
</div>
<div class="kitchen-col">
<span class="kitchen-col__title kitchen-col__title--kitchen">In the Kitchen</span>
<p><strong>How does it look?</strong> Is the garnish wilting? Sauce smeared? Does it match the menu photo?</p>
</div>
<div class="kitchen-col">
<span class="kitchen-col__title kitchen-col__title--reality">In the Flow</span>
<p><span class="gate gate--visual">G5 · Visual</span> Captures 8 frames at 100ms intervals, grids them, feeds to <code>gemini-2.5-flash</code>. Checks: T-pose? Foot-slide? Frozen animation? Wrong clip? Missing transitions?</p>
</div>
</div>
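<p>As a rough illustration of how the behaviour and feel checks could turn position samples into numbers, here is a minimal sketch. The sample shape (<code>{ t, y }</code>), the function names <code>measureJump</code> and <code>feelGate</code>, and the limits are assumptions made for this example; they are not the production gate, and the example is written in JavaScript to match the flow code rather than the game's own source.</p>

```javascript
// Hypothetical sketch: derive jump "feel" metrics from sampled heights.
// Each sample is { t: ms since the jump input, y: height above the ground }.
function measureJump(samples, groundEpsilon = 0.01) {
  const airborne = samples.filter((s) => s.y > groundEpsilon);
  if (airborne.length === 0) return { jumped: false };

  const liftoff = airborne[0];
  const landing = airborne[airborne.length - 1];
  const apex = airborne.reduce((best, s) => (s.y > best.y ? s : best));

  return {
    jumped: true,
    liftoffLatencyMs: liftoff.t, // time of the first airborne sample
    apexHeight: apex.y,
    airtimeMs: landing.t - liftoff.t, // first to last airborne sample
  };
}

// Assumed limits for illustration; real thresholds would come from design targets.
const LIMITS = { maxLiftoffMs: 150, minApexHeight: 1.0 };

function feelGate(metrics, limits = LIMITS) {
  const evidence = [];
  if (!metrics.jumped) {
    evidence.push("player never left the ground");
  } else {
    if (metrics.liftoffLatencyMs > limits.maxLiftoffMs) {
      evidence.push(`liftoff took ${metrics.liftoffLatencyMs}ms (limit ${limits.maxLiftoffMs}ms)`);
    }
    if (metrics.apexHeight < limits.minApexHeight) {
      evidence.push(`apex ${metrics.apexHeight} is below ${limits.minApexHeight}`);
    }
  }
  return { verdict: evidence.length === 0 ? "PASS" : "FAIL", evidence };
}

// Seven samples at 50 ms spacing (the behaviour gate described above takes 30).
const samples = [0, 0.5, 1.2, 1.5, 1.2, 0.5, 0].map((y, i) => ({ t: i * 50, y }));
console.log(feelGate(measureJump(samples)));
```

<p>With these assumed limits the sample trace passes, while a trace whose first airborne sample arrives at 400 ms fails on liftoff latency and carries that number back as evidence.</p>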

<div class="callout callout--amber">
<span class="callout__kicker">The Loop</span>
<p>Any red gate → evidence sent back to the cook → fix → re-enter the inspection line. Three chances max, then the head chef escalates to a human. This is the same instinct that makes a good kitchen work: catch it early, send it back with a clear note, give them a chance to fix it, but don't let the same dish circle the pass forever.</p>
</div>

<h2>Composability: Adding a New Station</h2>
<p>A kitchen doesn't redesign the whole line when it adds a new dish. It adds a station. Flows work the same way. We started with three gates: build, test, and vision. Behaviour and feel came later, each as a single-file extension. Gates aren't hardcoded; they're sub-agents called from JavaScript flows. Want a linting gate? Add an <code>agent()</code> call with a linter. Security scan? Same pattern. Asset bundle size check? Write the tool, declare the agent, wire it in.</p>
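<p>To make the "add a station" idea concrete, here is a small sketch of gates kept as a plain list, so a new gate is one more entry. The gate objects, the <code>runGates</code> helper and the fields on <code>change</code> are hypothetical names for this example; the real flows call sub-agents rather than local functions.</p>

```javascript
// Hypothetical sketch: gates as a list, so adding a station is adding one entry.
// The change object and its fields are made up for this example.
const baseGates = [
  { name: "build", check: (change) => (change.compiles ? [] : ["build failed"]) },
  { name: "tests", check: (change) => change.failedTests.map((t) => `test failed: ${t}`) },
];

// Run every gate and normalise each result into PASS/FAIL with evidence.
function runGates(gates, change) {
  return gates.map((gate) => {
    const evidence = gate.check(change);
    return { gate: gate.name, verdict: evidence.length === 0 ? "PASS" : "FAIL", evidence };
  });
}

// Adding a lint gate leaves the existing gates untouched.
const lintGate = {
  name: "lint",
  check: (change) => change.lintWarnings.map((w) => `lint: ${w}`),
};
const gatesWithLint = [...baseGates, lintGate];

const change = { compiles: true, failedTests: [], lintWarnings: ["unused variable 'x'"] };
console.log(runGates(gatesWithLint, change));
```

<p>The same change passes the two original gates and fails only the new one, which is the property that lets a gate be added without touching the rest of the line.</p>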

<div class="callout callout--purple">
<span class="callout__kicker">Self-Improving Kitchen</span>
<p>Agents can extend the flow at runtime. If the behaviour gate keeps failing because the game window isn't focused, an agent notices the pattern and inserts a pre-condition gate that checks window focus. The flow engine handles routing; the agents handle decisions. This is what makes flows fundamentally different from a script — the pipeline isn't fixed at compile time. It's a graph that agents read, understand, and modify while they run.</p>
</div>

<hr class="accent">

<h2>The CI Loop: The Dish That Came Back After It Left</h2>
<p>Gates inspect plates at the pass. But what about after the plate leaves the kitchen? What about the customer who finds a hair in their soup after it's been served?</p>

<p>Most coding agents don't care. They write code, push, walk away. A human discovers the broken CI build an hour later. That's the equivalent of a cook plating a dish, sending it out, and never checking if the diner is still alive.</p>

<p>We closed this loop with three tools — the waiter who brings the plate back:</p>

<ul>
<li><strong>ci_wait</strong> — stands by the table, polls every 15 seconds until the diner finishes</li>
<li><strong>ci_status</strong> — checks: did they enjoy it or send it back?</li>
<li><strong>ci_logs</strong> — reads the complaint card: exactly what was wrong</li>
</ul>

<p>The agent pushes, calls <code>ci_wait</code>. If CI fails, it reads <code>ci_logs</code>, fixes the exact error, pushes again. DeepSeek V4 parses compiler errors the way a cook reads a ticket: "missing import" = forgot the salt, "type mismatch" = wrong pan size, "module not found" = ingredient not in stock. Pattern-matched and fixed in seconds.</p>
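<p>A polling loop in the spirit of <code>ci_wait</code> might look like the sketch below. The function name <code>ciWait</code>, the status strings and the <code>maxPolls</code> cap are assumptions for illustration; only the 15-second interval comes from the description above, and the status source is injected so nothing here depends on a particular CI API.</p>

```javascript
// Hypothetical sketch of a ci_wait-style poller. getStatus is injected, so the
// same loop works against any CI API; the status strings are assumptions.
async function ciWait(getStatus, options = {}) {
  const {
    intervalMs = 15000, // the 15-second poll described in the post
    maxPolls = 120, // assumed cap so the agent cannot wait forever
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;

  for (let poll = 1; poll <= maxPolls; poll++) {
    const status = await getStatus();
    if (status === "success" || status === "failure") {
      return { status, polls: poll };
    }
    if (poll < maxPolls) await sleep(intervalMs);
  }
  return { status: "timeout", polls: maxPolls };
}

// Usage with a fake CI that finishes on the third poll.
const fakeStatuses = ["pending", "running", "success"];
ciWait(async () => fakeStatuses.shift(), { intervalMs: 10 }).then((result) => {
  console.log(result);
});
```

<p>Returning a distinct <code>timeout</code> result keeps "CI never finished" separate from "CI failed", so the caller only reads logs when there is a real failure to fix.</p>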

<div class="callout callout--amber">
<span class="callout__kicker">Real Example</span>
<p>Adding a health check endpoint to a Go service. Agent wrote the handler and test, pushed. CI failed — the test imported a package that didn't exist on the runner. Agent read <code>ci_logs</code>, saw <code>go: module not found</code>, added the missing <code>go.mod</code> replace directive, pushed again. CI passed. PR opened. <strong>4 minutes. $0.06.</strong></p>
</div>

<p>Three safeguards prevent the kitchen grinding to a halt: <strong>retry limit</strong> (3, same dish doesn't circle forever), <strong>diff budget</strong> (retries only touch files already on the ticket), and <strong>hallucination detection</strong> (if the cook claims the customer loved it without actually asking the waiter, they get corrected).</p>
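<p>Two of these safeguards reduce to simple checks. The sketch below shows one way they could be expressed; the function names, file names and status strings are hypothetical and chosen only for this example.</p>

```javascript
// Hypothetical sketch of two retry safeguards.
// Diff budget: a retry may only touch files that were already on the ticket.
function checkDiffBudget(ticketFiles, retryFiles) {
  const allowed = new Set(ticketFiles);
  const outside = retryFiles.filter((file) => !allowed.has(file));
  return { ok: outside.length === 0, outside };
}

// Hallucination check: a claimed result only counts if it matches what CI reported.
// ciStatus is whatever the CI tool last returned, or undefined if it was never asked.
function checkClaim(claimedStatus, ciStatus) {
  if (ciStatus === undefined) {
    return { ok: false, reason: "CI status was never checked" };
  }
  if (claimedStatus !== ciStatus) {
    return { ok: false, reason: `claimed ${claimedStatus}, CI says ${ciStatus}` };
  }
  return { ok: true, reason: "claim matches CI" };
}

// Example file names are made up.
const ticket = ["PlayerController.cs", "PlayerControllerTests.cs"];
console.log(checkDiffBudget(ticket, ["PlayerControllerTests.cs", "Unrelated.cs"]));
console.log(checkClaim("success", undefined));
```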

<h2>The Numbers</h2>
<p>Over three weeks of running the orchestrator:</p>

<ul>
<li><strong>87 tasks</strong> completed end-to-end</li>
<li><strong>23 tasks</strong> needed at least one CI retry (26%)</li>
<li><strong>19 of those 23</strong> resolved on the first retry</li>
<li><strong>4 tasks</strong> hit the retry limit and escalated to a human</li>
<li><strong>0 tasks</strong> produced a merged PR that later broke something else</li>
</ul>

<p>The 26% retry rate (23 of 87 tasks) is roughly what you'd expect from a junior developer. The difference: the agent fixes it in about 30 seconds.</p>
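<p>For readers who want to check the arithmetic, the percentages follow directly from the counts listed above. The snippet only recomputes them; the variable names are ours.</p>

```javascript
// Recompute the percentages from the counts listed above.
const stats = { tasks: 87, retried: 23, fixedFirstRetry: 19, escalated: 4 };
const pct = (part, whole) => Math.round((part / whole) * 100);

console.log(`retry rate: ${pct(stats.retried, stats.tasks)}%`); // 26%
console.log(`fixed on first retry: ${pct(stats.fixedFirstRetry, stats.retried)}% of retried tasks`); // 83%
console.log(`escalated: ${pct(stats.escalated, stats.tasks)}% of all tasks`); // 5%
```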

<h2>The Architecture</h2>

<table style="width:100%;border-collapse:collapse;margin:18px 0;font-size:0.92rem;">
<thead>
<tr style="text-align:left;border-bottom:1px solid rgba(255,255,255,0.07);">
<th style="padding:10px 12px;color:#B6FF3C;font-weight:600;">Layer</th>
<th style="padding:10px 12px;color:#B6FF3C;font-weight:600;">What</th>
<th style="padding:10px 12px;color:#B6FF3C;font-weight:600;">How</th>
</tr>
</thead>
<tbody>
<tr style="border-bottom:1px solid rgba(255,255,255,0.07);"><td style="padding:9px 12px;color:#ECEEF1;vertical-align:top;"><strong style="color:#B6FF3C;">Flow engine</strong></td><td style="padding:9px 12px;color:#8A95A3;vertical-align:top;">pi-flows orchestrator</td><td style="padding:9px 12px;color:#8A95A3;vertical-align:top;">Composes agents, gates and decision points</td></tr>
<tr style="border-bottom:1px solid rgba(255,255,255,0.07);"><td style="padding:9px 12px;color:#ECEEF1;vertical-align:top;"><strong style="color:#B6FF3C;">Oracle gates</strong></td><td style="padding:9px 12px;color:#8A95A3;vertical-align:top;">verify_build, drive_game, game_frames</td><td style="padding:9px 12px;color:#8A95A3;vertical-align:top;">Return structured PASS/FAIL with evidence</td></tr>
<tr style="border-bottom:1px solid rgba(255,255,255,0.07);"><td style="padding:9px 12px;color:#ECEEF1;vertical-align:top;"><strong style="color:#B6FF3C;">Sub-agents</strong></td><td style="padding:9px 12px;color:#8A95A3;vertical-align:top;">G1 build · G2 tests · G3 behaviour · G4 feel · G5 visual</td><td style="padding:9px 12px;color:#8A95A3;vertical-align:top;">Role-split, each with its own toolset</td></tr>
<tr style="border-bottom:1px solid rgba(255,255,255,0.07);"><td style="padding:9px 12px;color:#ECEEF1;vertical-align:top;"><strong style="color:#B6FF3C;">CI loop</strong></td><td style="padding:9px 12px;color:#8A95A3;vertical-align:top;">tinqs-ci extension</td><td style="padding:9px 12px;color:#8A95A3;vertical-align:top;">ci_status, ci_logs, ci_wait — polls Gitea Actions, reads logs, retries</td></tr>
<tr style="border-bottom:1px solid rgba(255,255,255,0.07);"><td style="padding:9px 12px;color:#ECEEF1;vertical-align:top;"><strong style="color:#B6FF3C;">Decision</strong></td><td style="padding:9px 12px;color:#8A95A3;vertical-align:top;">Agent-loop Reflexion</td><td style="padding:9px 12px;color:#8A95A3;vertical-align:top;">Self-reflect on failures, retry (≤3) or escalate</td></tr>
<tr><td style="padding:9px 12px;color:#ECEEF1;vertical-align:top;"><strong style="color:#B6FF3C;">Visualization</strong></td><td style="padding:9px 12px;color:#8A95A3;vertical-align:top;">FlowDashboard</td><td style="padding:9px 12px;color:#8A95A3;vertical-align:top;">Real-time pipeline state</td></tr>
</tbody>
</table>

<hr>

<h2>Three Kitchens, One Morning</h2>
<p>This morning, I ran three flows. Each is a different kitchen, a different brigade, a different dish. Here's what actually happened — real flow logs, real verdicts, nothing staged.</p>

<div class="callout callout--amber">
<span class="callout__kicker">Flow 1 · 4 June, 18:32</span>
<p><strong>deep-implement</strong> — "Build the tinqs-gitea-read extension: list_org_repos, read_repo_file, list_repo_dir, search_repos." Nine steps, 14 minutes. Verdict: <span class="gate gate--test">PASS</span>. 31/31 vitest tests green, zero new TypeScript errors, session-level caching, path traversal protection. Every <code>execute()</code> body fully wired — no stubs, no placeholders. Like a saucier who doesn't just list ingredients but actually makes the sauce.</p>
</div>

<div class="callout callout--purple">
<span class="callout__kicker">Flow 2 · 4 June, 19:04</span>
<p><strong>game-feature</strong> — "Make the player jump." Build: <span class="gate gate--build">PASS</span>. Tests: <span class="gate gate--test">PASS</span>. Behaviour/Feel/Visual: <span style="color:#B6FF3C;">NOT RUN</span> — no live game instance was reachable. The flow didn't silently skip the visual gate. It <strong>hard-stopped</strong> and reported honestly: "FAIL — the feature has not been verified in-game." This is the kitchen saying: "The dish is cooked, but nobody tasted it. I'm not sending it out."</p>
</div>

<div class="callout callout--amber">
<span class="callout__kicker">Flow 3 · 4 June, 19:49</span>
<p><strong>cto-infra</strong> — "Synthesize cost, stability, and VCS research into an AWS architecture decision." Four research streams fed into one CTO agent. Output: 14 requirements mapped to specific decisions, cost-vs-stability tradeoffs resolved with dollar figures, EC2+EBS over Fargate+EFS, RDS Multi-AZ mandatory, S3+CloudFront for LFS. Like an executive chef reading four menu proposals, reconciling them into one service, and pricing every plate.</p>
</div>

<hr class="accent">

<h2>Dinner Rush Recovery: The Crash That Interrupted Service</h2>
<p>Earlier today, a machine crash cut off a flow mid-stream — the kitchen lost power during dinner rush. Nineteen tests were left red. Contracts written, implementation half-done. Half-cooked dishes on every station.</p>

<p>I spawned the same flow with a different task:</p>

<pre><code>tinqs flow run game-feature --task 'Finish the leftover jump &amp; locomotion animation work -- make the 19 FAILING tests GREEN.'</code></pre>
<p>What happened next: the team picked up exactly where the crash left off. Here's the recipe — the exact JavaScript that runs in production:</p>
<pre><code>// .pi/flows/flows/game-feature.flow.mjs
export const meta = {
  name: "game-feature",
  description: "Build a PLAYABLE game feature and prove it in the LIVE game.",
  task_required: true
};

export default async function run({ task, flow }) {
  // G0: Pre-flight. Validate that vision CAN run before any build work.
  await flow.agent("vision-preflight", {
    task: "Check GEMINI_API_KEY is set AND game_frames reaches a live instance."
  });

  // Context + plan
  const context = await flow.agent("project-context-reader");
  const plan = await flow.agent("feature-planner", { context });

  // TDD: write tests FIRST (different agent than implementer)
  const testSuite = await flow.agent("test-author", { plan });

  // Implement
  let source = await flow.agent("game-builder", { testSuite, plan });

  // G1–G5: oracle gates, run in parallel for speed
  const runGates = (src) =&gt; flow.parallel([
    flow.agent("build-verifier", { source: src }),
    flow.agent("test-runner", { source: src }),
    flow.agent("behavioral-prober", { source: src }),
    flow.agent("feel-judge", { source: src }),
    flow.agent("animation-vision-judge", { source: src })
  ]);
  let gates = await runGates(source);

  // Bounded fix loop: send the evidence back to the builder, then re-run every gate
  const MAX_RETRIES = 3;
  for (let attempt = 1; attempt &lt;= MAX_RETRIES; attempt++) {
    const decision = await flow.agent("flow-decision", { gates });
    if (decision.verdict === "pass") break;
    source = await flow.agent("game-builder", { source, failures: decision.evidence });
    gates = await runGates(source);
  }

  // Final judge: one honest verdict
  return flow.agent("game-judge");
}</code></pre>
<p>Eight logical steps, seven cooks, five inspection points, one head chef. Triggered by a single spawn.</p>
<p>Here's how the brigade actually worked. The <strong>vision-preflight</strong> agent — the chef who checks the gas is on before anyone starts cooking — verified <code>GEMINI_API_KEY</code> was set and <code>game_frames</code> could reach the live game. Both green in under a second. Without this, the whole kitchen would prep for an hour only to discover the oven doesn't work.</p>
<p>The <strong>project-context-reader</strong> — the commis who reads the entire recipe book — ingested <code>PlayerController.cs</code>, <code>PlayerAnimController.cs</code>, <code>PlayerAnimationLogic.cs</code>, the test files, the manifest. The <strong>feature-planner</strong> — the sous-chef who breaks down the order into station tasks — decomposed 19 failures into four fix groups: vegetation manifest (146 broken <code>prefabPath</code> items), animation controller (crouch parameter not plumbed), jump physics (coyote time, variable height, air control — all missing), and animation tree (entire state machine absent).</p>
<p>Then the <strong>game-builder</strong> — the line cook at the hot station — read each test failure like a dish ticket, traced it to the source, and started cooking. Coyote time: 100ms grace period after feet leave the ground. Variable jump height: velocity scaled by hold duration, tap gives 3.5, full hold gives 6.5. Air control: horizontal speed cut 40% while airborne. Jump phases: minimum 0.15s on jump_start before transitioning up. Landing timer: wait the full animation length, not length-minus-blend. Animation tree: <code>jump_start → jump → jump_land</code> states with 0.1s blends.</p>
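<p>The coyote-time and variable-height rules are easier to see as code. What follows is a JavaScript illustration, not the C# that shipped. Only the 100ms window and the 3.5 / 6.5 velocities come from the description above; the linear interpolation and the <code>MAX_HOLD_S</code> constant are assumptions made for the sketch.</p>

```javascript
// Values taken from the text: 100 ms coyote window, tap = 3.5, full hold = 6.5.
const COYOTE_TIME_S = 0.1;
const MIN_JUMP_VELOCITY = 3.5;
const MAX_JUMP_VELOCITY = 6.5;
// Assumption: how long the button must be held to count as a "full hold".
const MAX_HOLD_S = 0.25;

// A jump is allowed while grounded, or within the grace period after leaving the ground.
function canJump(onGround, secondsSinceGrounded) {
  return onGround || secondsSinceGrounded <= COYOTE_TIME_S;
}

// Assumption: velocity scales linearly with hold duration, clamped to [tap, full hold].
function jumpVelocity(holdSeconds) {
  const t = Math.min(Math.max(holdSeconds / MAX_HOLD_S, 0), 1);
  return MIN_JUMP_VELOCITY + t * (MAX_JUMP_VELOCITY - MIN_JUMP_VELOCITY);
}
```

<p>Under these assumptions a jump pressed 50ms after walking off a ledge still fires, and one pressed 200ms after does not.</p>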
<p>Then the inspection line: <strong>build-verifier</strong> compiled. <strong>Test-runner</strong> ran the suite. <strong>Behavioral-prober</strong> sent <code>{"jump":true}</code> to the live game and sampled the player body. <strong>Feel-judge</strong> measured apex, airtime, liftoff latency. <strong>Animation-vision-judge</strong> captured 8 frames, gridded them, had <code>gemini-2.5-flash</code> scan for T-poses and foot-slide.</p>
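<p>To make the feel gate concrete, here is one way apex, airtime, and liftoff latency could be derived from sampled player positions. The sample shape <code>{ t, y }</code>, the <code>epsilon</code> threshold, and the function name are all hypothetical; the real <code>drive_game</code> output may look different. It also assumes a single jump per sample window.</p>

```javascript
// samples: [{ t: seconds, y: height }] in time order, first sample taken on the ground.
// inputTime: the moment the jump input was sent.
function measureJump(samples, inputTime, epsilon = 0.01) {
  const groundY = samples[0].y;
  // Every sample clearly above the ground counts as airborne.
  const airborne = samples.filter((s) => s.y > groundY + epsilon);
  if (airborne.length === 0) return null; // the player never left the ground

  const liftoff = airborne[0];
  const last = airborne[airborne.length - 1];
  return {
    apex: Math.max(...airborne.map((s) => s.y)) - groundY,
    airtime: last.t - liftoff.t, // resolution is limited by the sampling rate
    liftoffLatency: liftoff.t - inputTime,
  };
}
```

<p>A gate built on this would compare the three numbers against thresholds and return a failing verdict with the measured values as evidence.</p>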
<p>Anything red → ticket back to the cook with the specific failure → fix → re-enter the line. Bounded to 3 returns. Anything green → falls through. All green → <strong>game-judge</strong> gives the final verdict.</p>
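<p>The red-ticket loop can be dry-run without a game or a model. The sketch below pairs the loop with a stub engine. It assumes <code>flow.agent</code> returns a promise and <code>flow.parallel</code> resolves an array of them, as the flow file suggests. The result shapes (<code>verdict</code>, <code>evidence</code>, <code>gate</code>) and <code>makeStubFlow</code> are invented for the test.</p>

```javascript
// Bounded fix loop: re-run all gates after every fix, stop after maxRetries fixes.
async function fixLoop(flow, source, maxRetries = 3) {
  const gateNames = ["build-verifier", "test-runner", "behavioral-prober",
                     "feel-judge", "animation-vision-judge"];
  const runGates = (src) =>
    flow.parallel(gateNames.map((name) => flow.agent(name, { source: src })));

  let gates = await runGates(source);
  let decision = await flow.agent("flow-decision", { gates });
  let fixes = 0;
  while (decision.verdict !== "pass" && fixes < maxRetries) {
    source = await flow.agent("game-builder", { source, failures: decision.evidence });
    fixes++;
    gates = await runGates(source);
    decision = await flow.agent("flow-decision", { gates });
  }
  return { verdict: decision.verdict, fixes };
}

// Stub engine (hypothetical): every gate fails until the builder has applied
// `fixesNeeded` fixes, then every gate passes.
function makeStubFlow(fixesNeeded) {
  let revision = 0;
  return {
    parallel: (promises) => Promise.all(promises),
    agent: async (name, input = {}) => {
      if (name === "game-builder") return { revision: ++revision };
      if (name === "flow-decision") {
        const failed = input.gates.filter((g) => g.verdict !== "pass");
        return failed.length === 0
          ? { verdict: "pass", evidence: [] }
          : { verdict: "fail", evidence: failed.map((g) => g.gate) };
      }
      return { gate: name, verdict: revision >= fixesNeeded ? "pass" : "fail" };
    },
  };
}
```

<p>With <code>makeStubFlow(2)</code> the loop exits green after two fixes; with <code>makeStubFlow(5)</code> it gives up after three and reports the failure instead of looping forever.</p>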
<div class="callout">
<span class="callout__kicker">Not a Demo</span>
<p>This flow is a file at <code>.pi/flows/flows/game-feature.flow.mjs</code>. I trigger it by running <code>tinqs flow run game-feature</code> or clicking Run Flow on the dashboard. It dispatches agents, runs gates, loops on failures, reports a verdict. The dashboard at <code>:33634</code> is the control plane — spawn, steer mid-run, inspect state. That's the whole product.</p>
</div>
<hr class="accent">
<h2>The Menu: Flows at Your Fingertips</h2>
<p>Every flow lives in <code>.pi/flows/flows/*.flow.mjs</code> and is spawnable by name. You run <code>tinqs flow run &lt;name&gt; [task]</code> or click Run Flow on the dashboard.</p>
<p>"Add wall-running" becomes the task argument. The flow reads it, wires it through the agents, routes it through the gates. The JavaScript is the recipe. The conversation provides the context.</p>
<p>The menu I call from daily:</p>
<ul>
<li><strong>game-feature</strong> — "add a double-jump" or "fix the 19 red tests" → brigade assembles, cooks, inspects, plates</li>
<li><strong>deep-implement</strong> — "build the gitea-read extension" → research → plan → implement → test → review → judge</li>
<li><strong>cto-infra</strong> — "reconcile cost, stability, and VCS research into architecture decisions" → 4 research streams → 1 synthesis agent → 14 requirements mapped to decisions</li>
<li><strong>flows:new</strong> — "I need a flow that..." → the Flow Architect reads the agent catalog, selects cooks, designs the recipe, writes the <code>.flow.mjs</code></li>
</ul>
<h2>The Pass: How Agents Hand Off Work</h2>
<p>In a real kitchen, cooks don't shout instructions across the room. They place finished plates on the pass. The expediter reads the ticket, checks the plate, routes it to the next station or to the dining room. Nobody yells. Nobody grabs someone else's pan.</p>
<p>Flows work the same way. Agents never talk to each other directly. When the game-builder finishes, it returns a result object: its plate on the pass. The flow engine, acting as the expediter, records it and routes it. The flow script gets that object as the return value of <code>await flow.agent("game-builder")</code> and hands it to the next agent as an input.</p>
<div class="kitchen-grid">
<div class="kitchen-col">
<span class="kitchen-col__title kitchen-col__title--kitchen">What People Expect</span>
<p>Agents chatting freely, PM-slack style: "Hey test-runner, I just pushed some code, can you check it? Also the jump feels off, maybe tune the velocity?"</p>
</div>
<div class="kitchen-col">
<span class="kitchen-col__title kitchen-col__title--reality">What Actually Happens</span>
<p>Agent A returns <code>{ verdict: "pass", findings: ["coyote_time=100ms"] }</code> → flow engine records it → the flow script receives it as the return value of <code>await flow.agent("A")</code> and passes it to Agent B as an input. No chatter. Structured handoff.</p>
</div>
</div>
<p>Why? Because unstructured chatter is how hallucination cascades start. Agent A confidently states something wrong. Agent B builds on it. Agent C compounds it. Three agents later, they're collectively wrong about a file that doesn't exist, and nobody can trace where the error came from. The pass — structured result-passing via typed return values from each <code>agent()</code> call — makes every handoff auditable, verifiable, and debuggable.</p>
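<p>A rough sketch of what "auditable" can mean in practice: check each result against a contract before it moves on, and append it to a log keyed by the agent that produced it. The contract here (<code>verdict</code> plus a list of string <code>findings</code>) mirrors the example object above; the helper names and the log format are assumptions, not the engine's actual internals.</p>

```javascript
// Hypothetical result contract: { verdict: "pass" | "fail", findings: string[] }.
function validateResult(agentName, result) {
  const okVerdict = result && (result.verdict === "pass" || result.verdict === "fail");
  const okFindings = result && Array.isArray(result.findings) &&
    result.findings.every((f) => typeof f === "string");
  if (!okVerdict || !okFindings) {
    throw new Error(`${agentName} returned a malformed result`);
  }
  return result;
}

// Append-only log, so every handoff can be traced back to the agent that produced it.
function recordHandoff(log, agentName, result) {
  log.push({
    agent: agentName,
    at: new Date().toISOString(),
    result: validateResult(agentName, result),
  });
  return result;
}
```

<p>A malformed plate is rejected at the pass, before the next station can build on it.</p>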
<p>Pi itself is built for solo interactive work: you ask, it does, you review. The orchestration layer I wrote on top inverts that. Pi becomes the kitchen. The flow engine becomes the expediter. Agents become line cooks who place plates on the pass, never shouting across the room.</p>
<h2>The Setup: Extensions, Agents, and 15–20 Flows</h2>
<p>"How did you set this up?" is the question I get most often. Here's the honest answer: there's no drag-and-drop flow builder. You write three kinds of files.</p>
<p><strong style="color:#B6FF3C;">Extensions</strong> are TypeScript tools that agents call. Each is about 300 lines, MIT licensed:</p>
<table style="width:100%;border-collapse:collapse;margin:18px 0;font-size:0.89rem;">
<thead>
<tr style="text-align:left;border-bottom:1px solid rgba(255,255,255,0.07);">
<th style="padding:8px 12px;color:#B6FF3C;">Extension</th>
<th style="padding:8px 12px;color:#B6FF3C;">What agents call it for</th>
</tr>
</thead>
<tbody>
<tr style="border-bottom:1px solid rgba(255,255,255,0.07);"><td style="padding:7px 12px;color:#ECEEF1;"><code>verify_build</code></td><td style="padding:7px 12px;color:#8A95A3;">Compile the game + sim, return file:line errors</td></tr>
<tr style="border-bottom:1px solid rgba(255,255,255,0.07);"><td style="padding:7px 12px;color:#ECEEF1;"><code>drive_game</code></td><td style="padding:7px 12px;color:#8A95A3;">Send input to the live game, sample player body</td></tr>
<tr style="border-bottom:1px solid rgba(255,255,255,0.07);"><td style="padding:7px 12px;color:#ECEEF1;"><code>game_frames</code></td><td style="padding:7px 12px;color:#8A95A3;">Capture screenshot sequences for vision judging</td></tr>
<tr style="border-bottom:1px solid rgba(255,255,255,0.07);"><td style="padding:7px 12px;color:#ECEEF1;"><code>ci_status</code></td><td style="padding:7px 12px;color:#8A95A3;">Check Gitea Actions pipeline state for a branch</td></tr>
<tr style="border-bottom:1px solid rgba(255,255,255,0.07);"><td style="padding:7px 12px;color:#ECEEF1;"><code>ci_logs</code></td><td style="padding:7px 12px;color:#8A95A3;">Fetch full build log from the most recent failed run</td></tr>
<tr style="border-bottom:1px solid rgba(255,255,255,0.07);"><td style="padding:7px 12px;color:#ECEEF1;"><code>ci_wait</code></td><td style="padding:7px 12px;color:#8A95A3;">Poll every 15 seconds until the pipeline finishes</td></tr>
<tr style="border-bottom:1px solid rgba(255,255,255,0.07);"><td style="padding:7px 12px;color:#ECEEF1;"><code>gen_image</code></td><td style="padding:7px 12px;color:#8A95A3;">Generate brand/marketing images via fal.ai flux-2-pro</td></tr>
<tr><td style="padding:7px 12px;color:#ECEEF1;"><code>agent_catalog</code></td><td style="padding:7px 12px;color:#8A95A3;">List available agents with their tools, inputs, outputs</td></tr>
</tbody>
</table>
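<p>As an illustration of how small these tools are, here is a sketch of the polling behind something like <code>ci_wait</code>. It is written in JavaScript rather than the TypeScript the extensions use, and takes the status lookup as a parameter instead of calling Gitea. The state names, the <code>maxPolls</code> cap, and the injectable <code>sleep</code> are assumptions; only the 15-second interval comes from the table.</p>

```javascript
// getStatus: async function resolving to a pipeline state string.
// Assumed terminal states: "success" and "failure"; anything else means "keep waiting".
async function waitForPipeline(getStatus, { intervalMs = 15000, maxPolls = 120, sleep } = {}) {
  const pause = sleep || ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  for (let poll = 1; poll <= maxPolls; poll++) {
    const status = await getStatus();
    if (status === "success" || status === "failure") {
      return { status, polls: poll };
    }
    await pause(intervalMs);
  }
  // Give up instead of blocking the flow forever.
  return { status: "timeout", polls: maxPolls };
}
```

<p>Injecting <code>sleep</code> keeps the loop testable: a test can pass a no-op and a scripted sequence of states instead of waiting on a real pipeline.</p>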
<p><strong style="color:#B6FF3C;">Agents</strong> are Markdown files with YAML frontmatter. Each declares its role, model tier, tools, inputs, and outputs:</p>
<pre><code>---
name: game-builder
description: Implements game features in C# (Godot)
model: @coding
tools: read, write, edit, bash, verify_build, drive_game
inputs: [context, plan, build_fail, behaviour_fail, feel_fail, visual_fail]
outputs: [summary, files]
---
You are a game developer. Task: ${{task}}
Context: ${{input.context}}</code></pre>
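<p>The <code>${{task}}</code> and <code>${{input.context}}</code> placeholders imply a substitution step before the prompt reaches the model. A minimal sketch of that step, assuming plain string replacement, might look like this. <code>renderPrompt</code> is a made-up name, and leaving unknown placeholders untouched is a choice of this sketch, not documented engine behaviour.</p>

```javascript
// Replace ${{task}} and ${{input.NAME}} placeholders in an agent prompt body.
// Placeholders with no matching value are left as-is so they stay visible in logs.
function renderPrompt(template, task, inputs = {}) {
  return template.replace(/\$\{\{\s*([\w.]+)\s*\}\}/g, (match, key) => {
    if (key === "task") return task;
    if (key.startsWith("input.")) {
      const name = key.slice("input.".length);
      return name in inputs ? String(inputs[name]) : match;
    }
    return match;
  });
}
```

<p>Called with the body of the agent file above, a task of "Add wall-running", and a <code>context</code> input, it yields the final prompt text the builder sees.</p>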
<p><strong style="color:#B6FF3C;">Flows</strong> are JavaScript modules (<code>.flow.mjs</code>) that coordinate agents with real control flow. I have about <strong>15–20 flows</strong> running across different domains:</p>
<ul>
<li><strong>Game dev:</strong> game-feature, review, bug-hunt, refactor</li>
<li><strong>Design:</strong> concept-art, sound-design (plans → ElevenLabs generation → judge evaluates with other models)</li>
<li><strong>Marketing:</strong> brand-image, trailer-clip (Sora 2 video generation → vision judge)</li>
<li><strong>Infra:</strong> ci-fix, deploy-check, tinqs-jobs (action runners on AWS Lambda, workspace management)</li>
<li><strong>Meta:</strong> A flow that periodically reads and improves the other flows — yes, flows that edit flows</li>
</ul>
<p>The setup is not a product you install. It's a stack: Pi as the agent harness, custom extensions as the tool layer, markdown agents as the role layer, JavaScript flows as the orchestration layer. The whole thing lives in <code>.pi/flows/</code>. Version-controlled. CI-tested. Spawned via <code>tinqs flow run</code> or the dashboard.</p>
<h2>The Recipe vs. The Technique</h2>
<p>"Do you define the process with these trees, or do the agents freestyle?" Both. The recipe says what to make and in what order. The technique is how each cook executes their station.</p>
<div class="kitchen-grid">
<div class="kitchen-col">
<span class="kitchen-col__title kitchen-col__title--kitchen">The Recipe (Rigid)</span>
<p>The flow's JavaScript is the recipe. It says: first the prep cook dices onions, then the saucier makes the base, then the grill cook sears the protein. After every station, the plate hits the pass for inspection. <strong>This order is not negotiable.</strong> A cook cannot skip the inspection because they feel confident. The inspection runs. Period.</p>
</div>
<div class="kitchen-col">
<span class="kitchen-col__title kitchen-col__title--reality">The Technique (Autonomous)</span>
<p>Inside their station, a cook has full agency. How they dice the onions — brunoise or rough chop — is their call. Which pan they use, how they adjust the heat, whether they taste midway. The game-builder decides which files to read, which approach to take. Nobody tells it "edit line 247." It figures that out with <code>grep</code>, <code>find</code>, and reading code.</p>
</div>
</div>
<p>This balance is everything. Too much recipe → agents can't handle surprises. Too much freestyle → agents hallucinate, skip checks, ship broken code. The recipe guarantees the right things happen in the right order — preflight before build, build before test, test before ship. The technique handles the messy, unpredictable reality of actual code.</p>
<div class="callout callout--purple">
<span class="callout__kicker">The Meta-Kitchen</span>
<p>And when a recipe is wrong? Another flow improves it. A meta-flow reads performance data, spots bottlenecks — "the feel gate keeps failing because the cook doesn't know the jump velocity threshold" — edits the <code>.flow.mjs</code> to pass that threshold into the builder's inputs, and commits the change. <strong>Flows that edit flows.</strong> The kitchen that renovates itself between services.</p>
</div>
<hr class="accent">
<h2>Picking the Right Knife: Model Strategy</h2>
<p>You don't use a paring knife to butcher a cow. You don't use a cleaver to supreme an orange. Different work needs different blades. Flows use <strong>role-based model tiers</strong> — each agent declares the blade it needs, and the engine hands it the right one at dispatch time.</p>
<table style="width:100%;border-collapse:collapse;margin:18px 0;font-size:0.89rem;">
<thead>
<tr style="text-align:left;border-bottom:1px solid rgba(255,255,255,0.07);">
<th style="padding:8px 12px;color:#B6FF3C;">Tier</th>
<th style="padding:8px 12px;color:#B6FF3C;">The Knife</th>
<th style="padding:8px 12px;color:#B6FF3C;">What It Cuts</th>
</tr>
</thead>
<tbody>
<tr style="border-bottom:1px solid rgba(255,255,255,0.07);"><td style="padding:7px 12px;color:#ECEEF1;"><code>@coding</code></td><td style="padding:7px 12px;color:#B6FF3C;">DeepSeek V4</td><td style="padding:7px 12px;color:#8A95A3;"><strong>Chef's knife</strong> — your workhorse. Reads 800-line files, writes 200-line diffs. Game-builder, fixer, test-author. Free.</td></tr>
<tr style="border-bottom:1px solid rgba(255,255,255,0.07);"><td style="padding:7px 12px;color:#ECEEF1;"><code>@planning</code></td><td style="padding:7px 12px;color:#B6FF3C;">DeepSeek V4</td><td style="padding:7px 12px;color:#8A95A3;"><strong>Boning knife</strong> — precision decomposition. Breaks tasks into steps, designs DAGs. Flow architect, feature planner.</td></tr>
<tr style="border-bottom:1px solid rgba(255,255,255,0.07);"><td style="padding:7px 12px;color:#ECEEF1;"><code>@fast</code></td><td style="padding:7px 12px;color:#B6FF3C;">DeepSeek V4 Flash</td><td style="padding:7px 12px;color:#8A95A3;"><strong>Paring knife</strong> — quick, decisive cuts. Gate pass/fail, fork choices, loop exits. No overthinking.</td></tr>
<tr style="border-bottom:1px solid rgba(255,255,255,0.07);"><td style="padding:7px 12px;color:#ECEEF1;"><code>@research</code></td><td style="padding:7px 12px;color:#B6FF3C;">DeepSeek V4</td><td style="padding:7px 12px;color:#8A95A3;"><strong>Fillet knife</strong> — flexible, follows contours. Reads codebase, traces patterns, finds what matters.</td></tr>
<tr style="border-bottom:1px solid rgba(255,255,255,0.07);"><td style="padding:7px 12px;color:#ECEEF1;"><code>@vision</code></td><td style="padding:7px 12px;color:#7C5CFF;">Gemini 2.5 Flash</td><td style="padding:7px 12px;color:#8A95A3;"><strong>The inspector's eyes</strong> — the only knife that sees. Multimodal frame judging: T-poses, foot-slide, frozen anims.</td></tr>
<tr><td style="padding:7px 12px;color:#ECEEF1;"><code>@compact</code></td><td style="padding:7px 12px;color:#B6FF3C;">DeepSeek V4 Flash</td><td style="padding:7px 12px;color:#8A95A3;"><strong>Kitchen shears</strong> — lightweight, versatile. Summaries, verdicts, post-processing. Fast and cheap.</td></tr>
</tbody>
</table>
<div class="callout callout--amber">
<span class="callout__kicker">Why DeepSeek?</span>
<p>Two reasons. <strong>It's free</strong> — no usage limits, which matters when your game-builder reads 800-line files and writes 200-line diffs ten times a session. <strong>It's genuinely good at C# and Godot</strong> — I've had it write a full lighting module for our Godot fork by reading Unity API docs and adapting patterns. No agent had pulled that off before. DeepSeek can't do multimodal, so vision goes to Gemini — but for everything else, it's the chef's knife you reach for 90% of the time.</p>
</div>
<p>The point of the knife rack: you configure this <strong>once</strong>. Every agent declares <code>model: @coding</code> and gets DeepSeek V4 automatically. Swap models globally without touching any flow or agent file. The right blade, every time, no thinking required.</p>
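<p>The "configure once" idea boils down to a lookup at dispatch time. Here is a sketch of that lookup, with the tier table copied from above. The strings are the display names from the table, not verified API model identifiers, and <code>resolveModel</code> is a hypothetical helper rather than the engine's real function.</p>

```javascript
// Tier table mirroring the knife rack above. Values are display names only.
const MODEL_TIERS = {
  "@coding": "DeepSeek V4",
  "@planning": "DeepSeek V4",
  "@fast": "DeepSeek V4 Flash",
  "@research": "DeepSeek V4",
  "@vision": "Gemini 2.5 Flash",
  "@compact": "DeepSeek V4 Flash",
};

// Resolve what an agent declared in its frontmatter to a concrete model.
function resolveModel(declared, tiers = MODEL_TIERS) {
  if (!declared.startsWith("@")) return declared; // already a concrete model name
  if (!Object.prototype.hasOwnProperty.call(tiers, declared)) {
    throw new Error(`Unknown model tier: ${declared}`);
  }
  return tiers[declared];
}
```

<p>Swapping the blade for every coding agent is then a one-line change to the table; no flow or agent file has to know which model sits behind <code>@coding</code>.</p>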
<hr class="accent">
<p>The oracle tools — <code>verify_build</code>, <code>drive_game</code>, <code>game_frames</code> — are the durable assets. About 300 lines of TypeScript each, MIT licensed, reusable in any Pi project. The flow engine composes them; the agents route through them.</p>
<p>A year ago we had a supervisor written in 1,050 lines of hardcoded TypeScript that did one thing: verify agent output compiled and passed tests. We deleted it. The same verification now runs as a composable flow with five gates, live-game testing, and CI integration. Sometimes the best architecture decision is knowing what to delete.</p>
<p><em>The flow-native brain runs on our <a href="https://tinqs.com/tinqs/pi">Pi fork</a> inside <a href="https://tinqs.com">Tinqs Studio</a>. The oracle extensions are MIT licensed and reusable in any Pi project.</em></p>
</div>
<div class="post__author">
<div class="post__author-avatar">OB</div>
<div class="post__author-info">
<span class="post__author-name">Ozan Bozkurt</span><br>
CTO & Developer, Tinqs
</div>
</div>
</article>
</body>
</html>