Compare commits
2 Commits
| Author | SHA1 | Date |
|---|---|---|
| | 074fa08bb7 | |
| | 69d74be6ba | |
@@ -163,7 +163,7 @@
|
|||||||
<a href="pi-flow-native-brain" class="blog-card">
|
<a href="pi-flow-native-brain" class="blog-card">
|
||||||
<span class="blog-card__date">4 June 2026</span>
|
<span class="blog-card__date">4 June 2026</span>
|
||||||
<h2 class="blog-card__title">How Pi Agents Build, Test, and Ship Game Code with Oracle-Backed Flows</h2>
|
<h2 class="blog-card__title">How Pi Agents Build, Test, and Ship Game Code with Oracle-Backed Flows</h2>
|
||||||
<p class="blog-card__excerpt">A flow spawns, agents fan out through five oracle gates, the game-builder fixes 19 red tests while vision judges check the live game — and it all runs as one autonomous flow.</p>
|
<p class="blog-card__excerpt">We type a slash command, agents fan out through five oracle gates, the game-builder fixes 19 red tests while vision judges check the live game — and it all runs as one autonomous flow.</p>
|
||||||
<span class="blog-card__read">Read →</span>
|
<span class="blog-card__read">Read →</span>
|
||||||
</a>
|
</a>
|
||||||
{{CARDS}}
|
{{CARDS}}
|
||||||
|
|||||||
|
|
+2
-2
@@ -271,7 +271,7 @@
|
|||||||
<p><strong>Identity.</strong> Who the agent is, what it values, how it should behave. Not "you are a helpful assistant" — that's generic and unmoored. A soul file that says "you're working on Ariki, a survival colony sim. The team is four people. Never push to main without review. Prefer existing conventions." Identity creates consistency across sessions.</p>
|
<p><strong>Identity.</strong> Who the agent is, what it values, how it should behave. Not "you are a helpful assistant" — that's generic and unmoored. A soul file that says "you're working on Ariki, a survival colony sim. The team is four people. Never push to main without review. Prefer existing conventions." Identity creates consistency across sessions.</p>
|
||||||
<p><strong>Memory.</strong> What happened last session. What decisions were made. What failed and why. Without memory, every conversation is a cold start — "let me explain the project..." Memory stored as markdown in git means it's version-controlled, diffable, and human-readable. When something goes wrong, you <code>git log</code> instead of debugging a vector database.</p>
|
<p><strong>Memory.</strong> What happened last session. What decisions were made. What failed and why. Without memory, every conversation is a cold start — "let me explain the project..." Memory stored as markdown in git means it's version-controlled, diffable, and human-readable. When something goes wrong, you <code>git log</code> instead of debugging a vector database.</p>
|
||||||
<p><strong>Tools.</strong> What the agent can actually do beyond generating text. A CLI that takes screenshots, checks service health, and loads project context. API wrappers for git, CI, image generation. Without tools, the agent is a very articulate oracle that can't touch anything.</p>
|
<p><strong>Tools.</strong> What the agent can actually do beyond generating text. A CLI that takes screenshots, checks service health, and loads project context. API wrappers for git, CI, image generation. Without tools, the agent is a very articulate oracle that can't touch anything.</p>
|
||||||
<p><strong>Context.</strong> Which project this is. Who's asking. What machine they're on. What services are reachable. A single CLI call — <code>tinqs identity</code> — returns all of this in 100ms. No re-reading the README. No "what repo are we in?"</p>
|
<p><strong>Context.</strong> Which project this is. Who's asking. What machine they're on. What services are reachable. A single CLI call — <code>tstudio identity</code> — returns all of this in 100ms. No re-reading the README. No "what repo are we in?"</p>
|
||||||
<p><strong>Guardrails.</strong> What the agent must never do. No merging to main without review. No pushing to public repos without approval. No running destructive commands. The harness enforces these at the platform layer, not in the prompt. Prompts can be ignored. Platform gates cannot.</p>
|
<p><strong>Guardrails.</strong> What the agent must never do. No merging to main without review. No pushing to public repos without approval. No running destructive commands. The harness enforces these at the platform layer, not in the prompt. Prompts can be ignored. Platform gates cannot.</p>
|
||||||
<h2>Why generic harnesses fail for game dev</h2>
|
<h2>Why generic harnesses fail for game dev</h2>
|
||||||
<p>LangChain, CrewAI, and AutoGen are built for web apps. They assume text-in, text-out. Game development is different in ways that break those assumptions:</p>
|
<p>LangChain, CrewAI, and AutoGen are built for web apps. They assume text-in, text-out. Game development is different in ways that break those assumptions:</p>
|
||||||
@@ -281,7 +281,7 @@
|
|||||||
<p><strong>The team is small and cross-functional.</strong> Four people. No dedicated DevOps, no dedicated artist, no dedicated PM. The harness fills all those gaps, not just one.</p>
|
<p><strong>The team is small and cross-functional.</strong> Four people. No dedicated DevOps, no dedicated artist, no dedicated PM. The harness fills all those gaps, not just one.</p>
|
||||||
<h2>The toolchain that makes it work</h2>
|
<h2>The toolchain that makes it work</h2>
|
||||||
<p>Our harness runs on <a href="https://tinqs.com" style="color: var(--c-accent-l);">Tinqs Studio</a>, built on a Gitea fork with game-specific features. The key pieces:</p>
|
<p>Our harness runs on <a href="https://tinqs.com" style="color: var(--c-accent-l);">Tinqs Studio</a>, built on a Gitea fork with game-specific features. The key pieces:</p>
|
||||||
<p><strong>The CLI</strong> — a single Go binary. One command (<code>tinqs identity</code>) gives the agent full project context in 100ms. Screenshots, cloud vision, health checks — all subcommands of the same binary.</p>
|
<p><strong>The CLI</strong> — a single Go binary. One command (<code>tstudio identity</code>) gives the agent full project context in 100ms. Screenshots, cloud vision, health checks — all subcommands of the same binary.</p>
|
||||||
<p><strong>The soul file</strong> — a markdown document in the repo root. The agent reads it on session start. It defines values, scope, and behavioural rules. The same soul file works in Cursor, Claude Code, or any tool that reads markdown.</p>
|
<p><strong>The soul file</strong> — a markdown document in the repo root. The agent reads it on session start. It defines values, scope, and behavioural rules. The same soul file works in Cursor, Claude Code, or any tool that reads markdown.</p>
|
||||||
<p><strong>Skills</strong> — markdown playbooks for specific workflows. Image generation, concept art pipeline, 3D model creation, video generation. Each skill is a procedure the agent follows. Write once, use forever.</p>
|
<p><strong>Skills</strong> — markdown playbooks for specific workflows. Image generation, concept art pipeline, 3D model creation, video generation. Each skill is a procedure the agent follows. Write once, use forever.</p>
|
||||||
<p><strong>3D preview</strong> — click a <code>.glb</code> file in a PR and rotate the model in your browser. 22 formats supported. This alone transformed our review process — nobody approves a binary diff blind anymore.</p>
|
<p><strong>3D preview</strong> — click a <code>.glb</code> file in a PR and rotate the model in your browser. 22 formats supported. This alone transformed our review process — nobody approves a binary diff blind anymore.</p>
|
||||||
|
|||||||
|
|
+74
-72
@@ -372,11 +372,11 @@
|
|||||||
|
|
||||||
<div class="callout">
|
<div class="callout">
|
||||||
<span class="callout__kicker">The Kitchen ↔ Flows Analogy</span>
|
<span class="callout__kicker">The Kitchen ↔ Flows Analogy</span>
|
||||||
<p><strong>The kitchen</strong> = Pi (the agent harness). <strong>The recipe</strong> = a JavaScript flow (<code>.flow.mjs</code>). <strong>The line cooks</strong> = agents (each with a station and tools). <strong>The pass</strong> = the flow engine (routes finished work). <strong>The head chef's inspection</strong> = the five gates. <strong>The order ticket</strong> = a spawn task or <code>tinqs flow run</code>. <strong>"Send it back!"</strong> = the fix loop.</p>
|
<p><strong>The kitchen</strong> = Pi (the agent harness). <strong>The recipe</strong> = a flow YAML (the DAG). <strong>The line cooks</strong> = agents (each with a station and tools). <strong>The pass</strong> = the flow engine (routes finished work). <strong>The head chef's inspection</strong> = the five gates. <strong>The order ticket</strong> = a slash command. <strong>"Send it back!"</strong> = the fix loop.</p>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
<h2>What Happens When You Spawn a Flow</h2>
|
<h2>What Happens When You Type a Slash Command</h2>
|
||||||
<p>You run <code>tinqs flow run game-feature --task 'add a double-jump with cooldown'</code> or click Run Flow on the dashboard. The ticket hits the kitchen. What follows is not one agent doing everything — it's a brigade running their stations.</p>
|
<p>You type <code>/game-feature add a double-jump with cooldown</code> and hit enter. The ticket hits the kitchen. What follows is not one agent doing everything — it's a brigade running their stations.</p>
|
||||||
|
|
||||||
<figure style="margin:28px 0;">
|
<figure style="margin:28px 0;">
|
||||||
<svg viewBox="0 0 920 350" role="img" aria-label="The verify-heavy flow: context, plan, implement, five gates, a Reflexion loop, and one judge" style="width:100%;height:auto;display:block;background:#0a0e14;border:1px solid #2a3340;border-radius:12px;font-family:'IBM Plex Sans',system-ui,sans-serif;">
|
<svg viewBox="0 0 920 350" role="img" aria-label="The verify-heavy flow: context, plan, implement, five gates, a Reflexion loop, and one judge" style="width:100%;height:auto;display:block;background:#0a0e14;border:1px solid #2a3340;border-radius:12px;font-family:'IBM Plex Sans',system-ui,sans-serif;">
|
||||||
@@ -457,7 +457,7 @@
|
|||||||
</div>
|
</div>
|
||||||
<div class="kitchen-col">
|
<div class="kitchen-col">
|
||||||
<span class="kitchen-col__title kitchen-col__title--reality">In the Flow</span>
|
<span class="kitchen-col__title kitchen-col__title--reality">In the Flow</span>
|
||||||
<p><span class="gate gate--visual">G5 · Visual</span> Captures 8 frames at 100ms intervals, grids them, feeds to <code>gemini-2.5-flash</code>. Checks: T-pose? Foot-slide? Frozen animation? Wrong clip? Missing transitions?</p>
|
<p><span class="gate gate--visual">G5 · Visual</span> Captures 8 frames at 100ms intervals, grids them, feeds to <code>minimax-latest</code>. Checks: T-pose? Foot-slide? Frozen animation? Wrong clip? Missing transitions?</p>
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
@@ -467,7 +467,7 @@
|
|||||||
</div>
|
</div>
|
||||||
|
|
||||||
<h2>Composability: Adding a New Station</h2>
|
<h2>Composability: Adding a New Station</h2>
|
||||||
<p>A kitchen doesn't redesign the whole line when they add a new dish. They add a station. Same in flows. Started with three gates — build, test, vision. Behaviour and feel came later, each a single-file extension. Gates aren't hardcoded. They're sub-agents called from JavaScript flows. Want a linting gate? Add an <code>agent()</code> call with a linter. Security scan? Same pattern. Asset bundle size check? Write the tool, declare the agent, wire it in.</p>
|
<p>A kitchen doesn't redesign the whole line when they add a new dish. They add a station. Same in flows. Started with three gates — build, test, vision. Behaviour and feel came later, each a single-file extension. Gates aren't hardcoded. They're sub-agents declared in YAML. Want a linting gate? Add a sub-agent with a linter. Security scan? Same pattern. Asset bundle size check? Write the tool, declare the agent, wire it in.</p>
|
||||||
|
|
||||||
<div class="callout callout--purple">
|
<div class="callout callout--purple">
|
||||||
<span class="callout__kicker">Self-Improving Kitchen</span>
|
<span class="callout__kicker">Self-Improving Kitchen</span>
|
||||||
@@ -538,17 +538,17 @@
|
|||||||
|
|
||||||
<div class="callout callout--amber">
|
<div class="callout callout--amber">
|
||||||
<span class="callout__kicker">Flow 1 · 4 June, 18:32</span>
|
<span class="callout__kicker">Flow 1 · 4 June, 18:32</span>
|
||||||
<p><strong>deep-implement</strong> — "Build the tinqs-gitea-read extension: list_org_repos, read_repo_file, list_repo_dir, search_repos." Nine steps, 14 minutes. Verdict: <span class="gate gate--test">PASS</span>. 31/31 vitest tests green, zero new TypeScript errors, session-level caching, path traversal protection. Every <code>execute()</code> body fully wired — no stubs, no placeholders. Like a saucier who doesn't just list ingredients but actually makes the sauce.</p>
|
<p><strong>/deep-implement</strong> — "Build the tinqs-gitea-read extension: list_org_repos, read_repo_file, list_repo_dir, search_repos." Nine steps, 14 minutes. Verdict: <span class="gate gate--test">PASS</span>. 31/31 vitest tests green, zero new TypeScript errors, session-level caching, path traversal protection. Every <code>execute()</code> body fully wired — no stubs, no placeholders. Like a saucier who doesn't just list ingredients but actually makes the sauce.</p>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
<div class="callout callout--purple">
|
<div class="callout callout--purple">
|
||||||
<span class="callout__kicker">Flow 2 · 4 June, 19:04</span>
|
<span class="callout__kicker">Flow 2 · 4 June, 19:04</span>
|
||||||
<p><strong>game-feature</strong> — "Make the player jump." Build: <span class="gate gate--build">PASS</span>. Tests: <span class="gate gate--test">PASS</span>. Behaviour/Feel/Visual: <span style="color:#f59e0b;">NOT RUN</span> — no live game instance was reachable. The flow didn't silently skip the visual gate. It <strong>hard-stopped</strong> and reported honestly: "FAIL — the feature has not been verified in-game." This is the kitchen saying: "The dish is cooked, but nobody tasted it. I'm not sending it out."</p>
|
<p><strong>/game-feature</strong> — "Make the player jump." Build: <span class="gate gate--build">PASS</span>. Tests: <span class="gate gate--test">PASS</span>. Behaviour/Feel/Visual: <span style="color:#f59e0b;">NOT RUN</span> — no live game instance was reachable. The flow didn't silently skip the visual gate. It <strong>hard-stopped</strong> and reported honestly: "FAIL — the feature has not been verified in-game." This is the kitchen saying: "The dish is cooked, but nobody tasted it. I'm not sending it out."</p>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
<div class="callout callout--amber">
|
<div class="callout callout--amber">
|
||||||
<span class="callout__kicker">Flow 3 · 4 June, 19:49</span>
|
<span class="callout__kicker">Flow 3 · 4 June, 19:49</span>
|
||||||
<p><strong>cto-infra</strong> — "Synthesize cost, stability, and VCS research into an AWS architecture decision." Four research streams fed into one CTO agent. Output: 14 requirements mapped to specific decisions, cost-vs-stability tradeoffs resolved with dollar figures, EC2+EBS over Fargate+EFS, RDS Multi-AZ mandatory, S3+CloudFront for LFS. Like an executive chef reading four menu proposals, reconciling them into one service, and pricing every plate.</p>
|
<p><strong>/cto-infra</strong> — "Synthesize cost, stability, and VCS research into an AWS architecture decision." Four research streams fed into one CTO agent. Output: 14 requirements mapped to specific decisions, cost-vs-stability tradeoffs resolved with dollar figures, EC2+EBS over Fargate+EFS, RDS Multi-AZ mandatory, S3+CloudFront for LFS. Like an executive chef reading four menu proposals, reconciling them into one service, and pricing every plate.</p>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
<hr class="accent">
|
<hr class="accent">
|
||||||
@@ -556,95 +556,97 @@
|
|||||||
<h2>Dinner Rush Recovery: The Crash That Interrupted Service</h2>
|
<h2>Dinner Rush Recovery: The Crash That Interrupted Service</h2>
|
||||||
<p>Earlier today, a machine crash cut off a flow mid-stream — the kitchen lost power during dinner rush. Nineteen tests were left red. Contracts written, implementation half-done. Half-cooked dishes on every station.</p>
|
<p>Earlier today, a machine crash cut off a flow mid-stream — the kitchen lost power during dinner rush. Nineteen tests were left red. Contracts written, implementation half-done. Half-cooked dishes on every station.</p>
|
||||||
|
|
||||||
<p>I spawned the same flow with a different task:</p>
|
<p>I typed one slash command — the expediter reassembled the brigade:</p>
|
||||||
|
|
||||||
<pre><code>tinqs flow run game-feature --task 'Finish the leftover jump & locomotion animation work -- make the 19 FAILING tests GREEN.'</code></pre>
|
<pre><code>/game-feature Finish the leftover jump & locomotion animation work — make the 19 FAILING tests GREEN.</code></pre>
|
||||||
|
|
||||||
<p>What happened next: the team picked up exactly where the crash left off. Here's the recipe — the exact JavaScript that runs in production:</p>
|
<p>What happened next: the team picked up exactly where the crash left off. Here's the recipe — the exact YAML that runs in production:</p>
|
||||||
|
|
||||||
<pre><code>// .pi/flows/flows/game-feature.flow.mjs
|
<pre><code>name: game-feature
|
||||||
export const meta = {
|
description: Build a PLAYABLE game feature and prove it in the LIVE game.
|
||||||
name: "game-feature",
|
|
||||||
description: "Build a PLAYABLE game feature and prove it in the LIVE game.",
|
|
||||||
task_required: true
|
task_required: true
|
||||||
};
|
|
||||||
|
|
||||||
export default async function run({ task, flow }) {
|
steps:
|
||||||
// G0: Pre-flight — validate vision CAN run before any build work
|
# G0: Pre-flight — validate vision CAN run before any build work
|
||||||
await flow.agent("vision-preflight", {
|
- id: preflight
|
||||||
task: "Check GEMINI_API_KEY is set AND game_frames reaches a live instance."
|
agent: vision-preflight
|
||||||
});
|
task: Check MINIMAX_API_KEY is set AND game_frames reaches a live instance.
|
||||||
|
If EITHER fails, STOP — vision is not optional.
|
||||||
|
|
||||||
// Context + plan
|
# Context + plan
|
||||||
const context = await flow.agent("project-context-reader");
|
- id: context
|
||||||
const plan = await flow.agent("feature-planner", { context });
|
agent: project-context-reader
|
||||||
|
blockedBy: [preflight]
|
||||||
|
|
||||||
// TDD: write tests FIRST (different agent than implementer)
|
- id: plan
|
||||||
const testSuite = await flow.agent("test-author", { plan });
|
agent: feature-planner
|
||||||
|
blockedBy: [context]
|
||||||
|
|
||||||
// Implement
|
# TDD: write tests FIRST (different agent than implementer)
|
||||||
const source = await flow.agent("game-builder", { testSuite, plan });
|
- id: test-author
|
||||||
|
agent: test-author
|
||||||
|
blockedBy: [plan]
|
||||||
|
|
||||||
// G1–G5: Oracle gates run via parallel for speed
|
- id: implement
|
||||||
const gates = await flow.parallel([
|
agent: game-builder
|
||||||
flow.agent("build-verifier", { source }),
|
blockedBy: [test-author]
|
||||||
flow.agent("test-runner", { source }),
|
|
||||||
flow.agent("behavioral-prober", { source }),
|
|
||||||
flow.agent("feel-judge", { source }),
|
|
||||||
flow.agent("animation-vision-judge", { source })
|
|
||||||
]);
|
|
||||||
|
|
||||||
// Self-recurring fix-loop: bounded loop back to implement with evidence
|
# G1–G5: Oracle gates (build, tests, behaviour, feel, visual)
|
||||||
const MAX_RETRIES = 3;
|
- id: build → agent: build-verifier
|
||||||
for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
|
- id: tests → agent: test-runner
|
||||||
const decision = await flow.agent("flow-decision", { gates });
|
- id: behavior → agent: behavioral-prober (drives LIVE game via drive_game)
|
||||||
if (decision.verdict === "pass") break;
|
- id: feel → agent: feel-judge (apex, airtime, latency, rise/fall)
|
||||||
if (attempt === MAX_RETRIES) {
|
- id: visual → agent: animation-vision-judge (multimodal minimax-latest)
|
||||||
const fixed = await flow.agent("game-builder", { source, failures: decision.evidence });
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Final judge: one honest verdict
|
# Self-recurring fix-loop: bounded loop back to implement with evidence
|
||||||
return flow.agent("game-judge");
|
- id: fix-loop
|
||||||
}</code></pre>
|
type: agent-loop-decision
|
||||||
|
agent: flow-decision
|
||||||
|
loop_target: implement
|
||||||
|
exit_target: report
|
||||||
|
max_iterations: 3
|
||||||
|
|
||||||
<p>Eight logical steps, seven cooks, five inspection points, one head chef. Triggered by a single spawn.</p>
|
# Final judge: one honest verdict
|
||||||
|
- id: report
|
||||||
|
agent: game-judge</code></pre>
|
||||||
|
|
||||||
<p>Here's how the brigade actually worked. The <strong>vision-preflight</strong> agent — the chef who checks the gas is on before anyone starts cooking — verified <code>GEMINI_API_KEY</code> was set and <code>game_frames</code> could reach the live game. Both green in under a second. Without this, the whole kitchen would prep for an hour only to discover the oven doesn't work.</p>
|
<p>Eighteen steps, seven cooks, five inspection points, one head chef. Triggered by a single order ticket.</p>
|
||||||
|
|
||||||
|
<p>Here's how the brigade actually worked. The <strong>vision-preflight</strong> agent — the chef who checks the gas is on before anyone starts cooking — verified <code>MINIMAX_API_KEY</code> was set and <code>game_frames</code> could reach the live game. Both green in under a second. Without this, the whole kitchen would prep for an hour only to discover the oven doesn't work.</p>
|
||||||
|
|
||||||
<p>The <strong>project-context-reader</strong> — the commis who reads the entire recipe book — ingested <code>PlayerController.cs</code>, <code>PlayerAnimController.cs</code>, <code>PlayerAnimationLogic.cs</code>, the test files, the manifest. The <strong>feature-planner</strong> — the sous-chef who breaks down the order into station tasks — decomposed 19 failures into four fix groups: vegetation manifest (146 broken <code>prefabPath</code> items), animation controller (crouch parameter not plumbed), jump physics (coyote time, variable height, air control — all missing), and animation tree (entire state machine absent).</p>
|
<p>The <strong>project-context-reader</strong> — the commis who reads the entire recipe book — ingested <code>PlayerController.cs</code>, <code>PlayerAnimController.cs</code>, <code>PlayerAnimationLogic.cs</code>, the test files, the manifest. The <strong>feature-planner</strong> — the sous-chef who breaks down the order into station tasks — decomposed 19 failures into four fix groups: vegetation manifest (146 broken <code>prefabPath</code> items), animation controller (crouch parameter not plumbed), jump physics (coyote time, variable height, air control — all missing), and animation tree (entire state machine absent).</p>
|
||||||
|
|
||||||
<p>Then the <strong>game-builder</strong> — the line cook at the hot station — read each test failure like a dish ticket, traced it to the source, and started cooking. Coyote time: 100ms grace period after feet leave the ground. Variable jump height: velocity scaled by hold duration, tap gives 3.5, full hold gives 6.5. Air control: horizontal speed cut 40% while airborne. Jump phases: minimum 0.15s on jump_start before transitioning up. Landing timer: wait the full animation length, not length-minus-blend. Animation tree: <code>jump_start → jump → jump_land</code> states with 0.1s blends.</p>
|
<p>Then the <strong>game-builder</strong> — the line cook at the hot station — read each test failure like a dish ticket, traced it to the source, and started cooking. Coyote time: 100ms grace period after feet leave the ground. Variable jump height: velocity scaled by hold duration, tap gives 3.5, full hold gives 6.5. Air control: horizontal speed cut 40% while airborne. Jump phases: minimum 0.15s on jump_start before transitioning up. Landing timer: wait the full animation length, not length-minus-blend. Animation tree: <code>jump_start → jump → jump_land</code> states with 0.1s blends.</p>
|
||||||
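<p>The jump numbers above can be sketched in a few lines. This is an illustration only, not the project's controller code (which the text places in <code>PlayerController.cs</code>): the 3.5 and 6.5 velocities and the 100ms grace period come from the text, while <code>MAX_HOLD_S</code>, the linear scaling, and both function names are assumptions made for the example.</p>

```javascript
// Values 3.5, 6.5 and 100 ms come from the text above.
const TAP_VELOCITY = 3.5;
const FULL_HOLD_VELOCITY = 6.5;
const COYOTE_MS = 100;
// Assumption: the hold time that counts as a "full hold" is not stated in the article.
const MAX_HOLD_S = 0.25;

// Variable jump height: scale launch velocity by how long the button was held.
// Linear interpolation is an assumption; the real curve may differ.
function jumpVelocity(holdSeconds) {
  const t = Math.min(Math.max(holdSeconds / MAX_HOLD_S, 0), 1);
  return TAP_VELOCITY + (FULL_HOLD_VELOCITY - TAP_VELOCITY) * t;
}

// Coyote time: a jump is still allowed shortly after the feet leave the ground.
function canJump(grounded, msSinceLeftGround) {
  return grounded || msSinceLeftGround <= COYOTE_MS;
}
```

<p>A feel gate could then compare measured liftoff velocity against <code>jumpVelocity()</code> for a given hold time, which is the kind of numeric check the text attributes to the feel-judge.</p>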
|
|
||||||
<p>Then the inspection line: <strong>build-verifier</strong> compiled. <strong>Test-runner</strong> ran the suite. <strong>Behavioral-prober</strong> sent <code>{"jump":true}</code> to the live game and sampled the player body. <strong>Feel-judge</strong> measured apex, airtime, liftoff latency. <strong>Animation-vision-judge</strong> captured 8 frames, gridded them, had <code>gemini-2.5-flash</code> scan for T-poses and foot-slide.</p>
|
<p>Then the inspection line: <strong>build-verifier</strong> compiled. <strong>Test-runner</strong> ran the suite. <strong>Behavioral-prober</strong> sent <code>{"jump":true}</code> to the live game and sampled the player body. <strong>Feel-judge</strong> measured apex, airtime, liftoff latency. <strong>Animation-vision-judge</strong> captured 8 frames, gridded them, had <code>minimax-latest</code> scan for T-poses and foot-slide.</p>
|
||||||
|
|
||||||
<p>Anything red → ticket back to the cook with the specific failure → fix → re-enter the line. Bounded to 3 returns. Anything green → falls through. All green → <strong>game-judge</strong> gives the final verdict.</p>
|
<p>Anything red → ticket back to the cook with the specific failure → fix → re-enter the line. Bounded to 3 returns. Anything green → falls through. All green → <strong>game-judge</strong> gives the final verdict.</p>
|
||||||
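<p>The bounded return loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the flow engine's implementation: <code>fixLoop</code>, <code>runGates</code> and <code>fixWith</code> are hypothetical names, and each gate is assumed to report an object with a <code>verdict</code> field.</p>

```javascript
// Hypothetical sketch: runGates and fixWith are stand-ins, not the real flow API.
// runGates() resolves to an array of gate results like { gate: "tests", verdict: "pass" }.
// fixWith(failures) sends the red gates back to the builder as evidence.
async function fixLoop({ runGates, fixWith, maxReturns = 3 }) {
  let gates = await runGates();
  for (let attempt = 1; attempt <= maxReturns; attempt++) {
    const failures = gates.filter((g) => g.verdict !== "pass");
    // All green: fall through to the final judge.
    if (failures.length === 0) return { verdict: "pass", returns: attempt - 1 };
    // Anything red: ticket back to the cook, then re-enter the inspection line.
    await fixWith(failures);
    gates = await runGates();
  }
  // The budget is spent; report whatever is still red instead of looping forever.
  const stillRed = gates.filter((g) => g.verdict !== "pass");
  return {
    verdict: stillRed.length === 0 ? "pass" : "fail",
    returns: maxReturns,
    failures: stillRed,
  };
}
```

<p>The point of the bound is that a flow which cannot converge ends with an explicit failing verdict and the remaining evidence, rather than retrying indefinitely.</p>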
|
|
||||||
<div class="callout">
|
<div class="callout">
|
||||||
<span class="callout__kicker">Not a Demo</span>
|
<span class="callout__kicker">Not a Demo</span>
|
||||||
<p>This flow is a file at <code>.pi/flows/flows/game-feature.flow.mjs</code>. I trigger it by running <code>tinqs flow run game-feature</code> or clicking Run Flow on the dashboard. It dispatches agents, runs gates, loops on failures, reports a verdict. The dashboard at <code>:33634</code> is the control plane — spawn, steer mid-run, inspect state. That's the whole product.</p>
|
<p>This flow is a file at <code>.pi/flows/flows/game-feature.yaml</code>. I trigger it by typing <code>/game-feature</code> in Pi. It dispatches agents, runs gates, loops on failures, reports a verdict. There is no dashboard with drag-and-drop. There is a YAML file and a slash command. That's the whole product.</p>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
<hr class="accent">
|
<hr class="accent">
|
||||||
|
|
||||||
<h2>The Menu: Flows at Your Fingertips</h2>
|
<h2>The Menu: Flows Are Slash Commands</h2>
|
||||||
<p>Every flow lives in <code>.pi/flows/flows/*.flow.mjs</code> and is spawnable by name. You run <code>tinqs flow run &lt;name&gt; [task]</code> or click Run Flow on the dashboard.</p>
|
<p>Every flow becomes a slash command — the menu you read to the expediter. <code>.pi/flows/flows/game-feature.yaml</code> → <code>/game-feature</code>. You don't invoke a pipeline from a terminal. You order a dish in conversation.</p>
|
||||||
|
|
||||||
<p>"Add wall-running" becomes the task argument. The flow reads it, wires it through the agents, routes it through the gates. The JavaScript is the recipe. The conversation provides the context.</p>
|
<p>"Add wall-running" is not a CLI flag. It's natural language. The flow reads it, wires it through the agents, routes it through the gates. The YAML is the recipe. The conversation is the context.</p>
|
||||||
|
|
||||||
<p>The menu I order from daily:</p>
|
<p>The menu I order from daily:</p>
|
||||||
|
|
||||||
<ul>
|
<ul>
|
||||||
<li><strong>game-feature</strong> — "add a double-jump" or "fix the 19 red tests" → brigade assembles, cooks, inspects, plates</li>
|
<li><strong>/game-feature</strong> — "add a double-jump" or "fix the 19 red tests" → brigade assembles, cooks, inspects, plates</li>
|
||||||
<li><strong>deep-implement</strong> — "build the gitea-read extension" → research → plan → implement → test → review → judge</li>
|
<li><strong>/deep-implement</strong> — "build the gitea-read extension" → research → plan → implement → test → review → judge</li>
|
||||||
<li><strong>cto-infra</strong> — "reconcile cost, stability, and VCS research into architecture decisions" → 4 research streams → 1 synthesis agent → 14 requirements mapped to decisions</li>
|
<li><strong>/cto-infra</strong> — "reconcile cost, stability, and VCS research into architecture decisions" → 4 research streams → 1 synthesis agent → 14 requirements mapped to decisions</li>
|
||||||
<li><strong>flows:new</strong> — "I need a flow that..." → the Flow Architect reads the agent catalog, selects cooks, designs the recipe, writes the <code>.flow.mjs</code></li>
|
<li><strong>/flows:new</strong> — "I need a flow that..." → the Flow Architect reads the agent catalog, selects cooks, designs the recipe, writes the YAML</li>
|
||||||
</ul>
|
</ul>
|
||||||
|
|
||||||
<h2>The Pass: How Agents Hand Off Work</h2>
|
<h2>The Pass: How Agents Hand Off Work</h2>
|
||||||
<p>In a real kitchen, cooks don't shout instructions across the room. They place finished plates on the pass. The expediter reads the ticket, checks the plate, routes it to the next station or to the dining room. Nobody yells. Nobody grabs someone else's pan.</p>
|
<p>In a real kitchen, cooks don't shout instructions across the room. They place finished plates on the pass. The expediter reads the ticket, checks the plate, routes it to the next station or to the dining room. Nobody yells. Nobody grabs someone else's pan.</p>
|
||||||
|
|
||||||
<p>Flows work the same way. Agents never talk to each other directly. When the game-builder finishes, it returns a result object — placing its work on the pass. The flow engine — the expediter — records it and routes it. The next agent receives the return value directly from <code>await flow.agent("game-builder")</code>.</p>
|
<p>Flows work the same way. Agents never talk to each other directly. When the game-builder finishes, it doesn't ping the test-runner. It calls <code>finish({ summary: "...", artifacts: "...", files: "..." })</code> — placing its work on the pass. The flow engine — the expediter — records it and routes it. The next agent receives exactly the inputs wired in the YAML: <code>${{result.game-builder.summary}}</code>, <code>${{result.game-builder.files}}</code>.</p>
|
||||||
|
|
||||||
<div class="kitchen-grid">
|
<div class="kitchen-grid">
|
||||||
<div class="kitchen-col">
|
<div class="kitchen-col">
|
||||||
@@ -653,11 +655,11 @@ export default async function run({ task, flow }) {
|
|||||||
</div>
|
</div>
|
||||||
<div class="kitchen-col">
|
<div class="kitchen-col">
|
||||||
<span class="kitchen-col__title kitchen-col__title--reality">What Actually Happens</span>
|
<span class="kitchen-col__title kitchen-col__title--reality">What Actually Happens</span>
|
||||||
<p>Agent A returns <code>{ verdict: "pass", findings: ["coyote_time=100ms"] }</code> → flow engine records it → Agent B receives the result as a direct return value of <code>await flow.agent("A")</code>. No chatter. Structured handoff.</p>
|
<p>Agent A → <code>finish({verdict: "pass", findings: ["coyote_time=100ms"]})</code> → engine records → Agent B receives <code>${{result.A.findings}}</code> via <code>inputs:</code> block. No chatter. Structured handoff.</p>
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
<p>Why? Because unstructured chatter is how hallucination cascades start. Agent A confidently states something wrong. Agent B builds on it. Agent C compounds it. Three agents later, they're collectively wrong about a file that doesn't exist, and nobody can trace where the error came from. The pass — structured result-passing via typed return values from each <code>agent()</code> call — makes every handoff auditable, verifiable, and debuggable.</p>
|
<p>Why? Because unstructured chatter is how hallucination cascades start. Agent A confidently states something wrong. Agent B builds on it. Agent C compounds it. Three agents later, they're collectively wrong about a file that doesn't exist, and nobody can trace where the error came from. The pass — structured result-passing with typed outputs — makes every handoff auditable, verifiable, and debuggable.</p>
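A minimal sketch of what "the pass" could look like in code, assuming nothing about the real flow engine: the `Pass` class, its `record` and `inputsFor` methods, and the wiring shape are hypothetical names invented for illustration. It only shows the pattern the paragraph describes, where every result is recorded in one place and the next agent receives just the fields that were explicitly wired to it.

```javascript
// Hypothetical mini "pass": not the real flow engine, only the handoff pattern.
class Pass {
  constructor() {
    this.log = []; // ordered audit trail of every handoff
    this.results = new Map(); // agent name -> recorded result
  }

  // An agent places its finished work on the pass.
  record(agent, result) {
    const entry = Object.freeze({ agent, result: Object.freeze({ ...result }) });
    this.log.push(entry);
    this.results.set(agent, entry.result);
  }

  // The next agent receives only the fields that were explicitly wired to it.
  inputsFor(wiring) {
    const inputs = {};
    for (const [name, [agent, field]] of Object.entries(wiring)) {
      const result = this.results.get(agent);
      if (!result || !(field in result)) {
        throw new Error(`missing ${agent}.${field}`);
      }
      inputs[name] = result[field];
    }
    return inputs;
  }
}

// Agent A finishes; Agent B is wired to A's findings and nothing else.
const pass = new Pass();
pass.record("A", { verdict: "pass", findings: ["coyote_time=100ms"] });
const inputsForB = pass.inputsFor({ findings: ["A", "findings"] });
```

Because a missing field throws at wiring time and every handoff lands in `log`, a wrong claim can be traced back to the agent that produced it instead of spreading silently.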
|
||||||
|
|
||||||
<p>Pi itself is built for solo interactive work: you ask, it does, you review. The orchestration layer I wrote on top inverts that. Pi becomes the kitchen. The flow engine becomes the expediter. Agents become line cooks who place plates on the pass, never shouting across the room.</p>
|
<p>Pi itself is built for solo interactive work: you ask, it does, you review. The orchestration layer I wrote on top inverts that. Pi becomes the kitchen. The flow engine becomes the expediter. Agents become line cooks who place plates on the pass, never shouting across the room.</p>
|
||||||
|
|
||||||
@@ -698,17 +700,17 @@ outputs: [summary, files]
|
|||||||
You are a game developer. Task: ${{task}}
|
You are a game developer. Task: ${{task}}
|
||||||
Context: ${{input.context}}</code></pre>
|
Context: ${{input.context}}</code></pre>
|
||||||
|
|
||||||
<p><strong style="color:#f59e0b;">Flows</strong> are JavaScript modules (<code>.flow.mjs</code>) that coordinate agents with real control flow. I have about <strong>15–20 flows</strong> running across different domains:</p>
|
<p><strong style="color:#f59e0b;">Flows</strong> are YAML DAGs that wire agents together. I have about <strong>15–20 flows</strong> running across different domains:</p>
|
||||||
|
|
||||||
<ul>
|
<ul>
|
||||||
<li><strong>Game dev:</strong> game-feature, review, bug-hunt, refactor</li>
|
<li><strong>Game dev:</strong> /game-feature, /review, /bug-hunt, /refactor</li>
|
||||||
<li><strong>Design:</strong> concept-art, sound-design (plans → ElevenLabs generation → judge evaluates with other models)</li>
|
<li><strong>Design:</strong> /concept-art, /sound-design (plans → ElevenLabs generation → judge evaluates with other models)</li>
|
||||||
<li><strong>Marketing:</strong> brand-image, trailer-clip (Sora 2 video generation → vision judge)</li>
|
<li><strong>Marketing:</strong> /brand-image, /trailer-clip (Sora 2 video generation → vision judge)</li>
|
||||||
<li><strong>Infra:</strong> ci-fix, deploy-check, tinqs-jobs (action runners on AWS Lambda, workspace management)</li>
|
<li><strong>Infra:</strong> /ci-fix, /deploy-check, /tstudio-jobs (action runners on AWS Lambda, workspace management)</li>
|
||||||
<li><strong>Meta:</strong> A flow that periodically reads and improves the other flows — yes, flows that edit flows</li>
|
<li><strong>Meta:</strong> A flow that periodically reads and improves the other flows — yes, flows that edit flows</li>
|
||||||
</ul>
|
</ul>
|
||||||
|
|
||||||
<p>The setup is not a product you install. It's a stack: Pi as the agent harness, custom extensions as the tool layer, markdown agents as the role layer, JavaScript flows as the orchestration layer. The whole thing lives in <code>.pi/flows/</code>. Version-controlled. CI-tested. Spawned via <code>tinqs flow run</code> or the dashboard.</p>
|
<p>The setup is not a product you install. It's a stack: Pi as the agent harness, custom extensions as the tool layer, markdown agents as the role layer, YAML flows as the orchestration layer. The whole thing lives in <code>.pi/flows/</code>. Version-controlled. CI-tested. Slash-command invoked.</p>
|
||||||
|
|
||||||
<h2>The Recipe vs. The Technique</h2>
|
<h2>The Recipe vs. The Technique</h2>
|
||||||
<p>"Do you define the process with these trees, or do the agents freestyle?" Both. The recipe says what to make and in what order. The technique is how each cook executes their station.</p>
|
<p>"Do you define the process with these trees, or do the agents freestyle?" Both. The recipe says what to make and in what order. The technique is how each cook executes their station.</p>
|
||||||
@@ -716,7 +718,7 @@ Context: ${{input.context}}</code></pre>
|
|||||||
<div class="kitchen-grid">
|
<div class="kitchen-grid">
|
||||||
<div class="kitchen-col">
|
<div class="kitchen-col">
|
||||||
<span class="kitchen-col__title kitchen-col__title--kitchen">The Recipe (Rigid)</span>
|
<span class="kitchen-col__title kitchen-col__title--kitchen">The Recipe (Rigid)</span>
|
||||||
<p>The flow's JavaScript is the recipe. It says: first the prep cook dices onions, then the saucier makes the base, then the grill cook sears the protein. After every station, the plate hits the pass for inspection. <strong>This order is not negotiable.</strong> A cook cannot skip the inspection because they feel confident. The inspection runs. Period.</p>
|
<p>The flow YAML is the recipe. It says: first the prep cook dices onions, then the saucier makes the base, then the grill cook sears the protein. After every station, the plate hits the pass for inspection. <strong>This order is not negotiable.</strong> A cook cannot skip the inspection because they feel confident. The inspection runs. Period.</p>
|
||||||
</div>
|
</div>
|
||||||
<div class="kitchen-col">
|
<div class="kitchen-col">
|
||||||
<span class="kitchen-col__title kitchen-col__title--reality">The Technique (Autonomous)</span>
|
<span class="kitchen-col__title kitchen-col__title--reality">The Technique (Autonomous)</span>
|
||||||
@@ -728,7 +730,7 @@ Context: ${{input.context}}</code></pre>
|
|||||||
|
|
||||||
<div class="callout callout--purple">
|
<div class="callout callout--purple">
|
||||||
<span class="callout__kicker">The Meta-Kitchen</span>
|
<span class="callout__kicker">The Meta-Kitchen</span>
|
||||||
<p>And when a recipe is wrong? Another flow improves it. A meta-flow reads performance data, spots bottlenecks — "the feel gate keeps failing because the cook doesn't know the jump velocity threshold" — edits the <code>.flow.mjs</code> to pass that threshold into the builder's inputs, and commits the change. <strong>Flows that edit flows.</strong> The kitchen that renovates itself between services.</p>
|
<p>And when a recipe is wrong? Another flow improves it. A meta-flow reads performance data, spots bottlenecks — "the feel gate keeps failing because the cook doesn't know the jump velocity threshold" — edits the YAML to wire that threshold into the builder's inputs, and commits the change. <strong>Flows that edit flows.</strong> The kitchen that renovates itself between services.</p>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
<hr class="accent">
|
<hr class="accent">
|
||||||
@@ -749,14 +751,14 @@ Context: ${{input.context}}</code></pre>
|
|||||||
<tr style="border-bottom:1px solid #1c2230;"><td style="padding:7px 12px;color:#e6edf3;"><code>@planning</code></td><td style="padding:7px 12px;color:#f59e0b;">DeepSeek V4</td><td style="padding:7px 12px;color:#cdd7e2;"><strong>Boning knife</strong> — precision decomposition. Breaks tasks into steps, designs DAGs. Flow architect, feature planner.</td></tr>
|
<tr style="border-bottom:1px solid #1c2230;"><td style="padding:7px 12px;color:#e6edf3;"><code>@planning</code></td><td style="padding:7px 12px;color:#f59e0b;">DeepSeek V4</td><td style="padding:7px 12px;color:#cdd7e2;"><strong>Boning knife</strong> — precision decomposition. Breaks tasks into steps, designs DAGs. Flow architect, feature planner.</td></tr>
|
||||||
<tr style="border-bottom:1px solid #1c2230;"><td style="padding:7px 12px;color:#e6edf3;"><code>@fast</code></td><td style="padding:7px 12px;color:#38bdf8;">DeepSeek V4 Flash</td><td style="padding:7px 12px;color:#cdd7e2;"><strong>Paring knife</strong> — quick, decisive cuts. Gate pass/fail, fork choices, loop exits. No overthinking.</td></tr>
|
<tr style="border-bottom:1px solid #1c2230;"><td style="padding:7px 12px;color:#e6edf3;"><code>@fast</code></td><td style="padding:7px 12px;color:#38bdf8;">DeepSeek V4 Flash</td><td style="padding:7px 12px;color:#cdd7e2;"><strong>Paring knife</strong> — quick, decisive cuts. Gate pass/fail, fork choices, loop exits. No overthinking.</td></tr>
|
||||||
<tr style="border-bottom:1px solid #1c2230;"><td style="padding:7px 12px;color:#e6edf3;"><code>@research</code></td><td style="padding:7px 12px;color:#f59e0b;">DeepSeek V4</td><td style="padding:7px 12px;color:#cdd7e2;"><strong>Fillet knife</strong> — flexible, follows contours. Reads codebase, traces patterns, finds what matters.</td></tr>
|
<tr style="border-bottom:1px solid #1c2230;"><td style="padding:7px 12px;color:#e6edf3;"><code>@research</code></td><td style="padding:7px 12px;color:#f59e0b;">DeepSeek V4</td><td style="padding:7px 12px;color:#cdd7e2;"><strong>Fillet knife</strong> — flexible, follows contours. Reads codebase, traces patterns, finds what matters.</td></tr>
|
||||||
<tr style="border-bottom:1px solid #1c2230;"><td style="padding:7px 12px;color:#e6edf3;"><code>@vision</code></td><td style="padding:7px 12px;color:#a855f7;">Gemini 2.5 Flash</td><td style="padding:7px 12px;color:#cdd7e2;"><strong>The inspector's eyes</strong> — the only knife that sees. Multimodal frame judging: T-poses, foot-slide, frozen anims.</td></tr>
|
<tr style="border-bottom:1px solid #1c2230;"><td style="padding:7px 12px;color:#e6edf3;"><code>@vision</code></td><td style="padding:7px 12px;color:#a855f7;">MiniMax latest</td><td style="padding:7px 12px;color:#cdd7e2;"><strong>The inspector's eyes</strong> — the only knife that sees. Multimodal frame judging: T-poses, foot-slide, frozen anims.</td></tr>
|
||||||
<tr><td style="padding:7px 12px;color:#e6edf3;"><code>@compact</code></td><td style="padding:7px 12px;color:#38bdf8;">DeepSeek V4 Flash</td><td style="padding:7px 12px;color:#cdd7e2;"><strong>Kitchen shears</strong> — lightweight, versatile. Summaries, verdicts, post-processing. Fast and cheap.</td></tr>
|
<tr><td style="padding:7px 12px;color:#e6edf3;"><code>@compact</code></td><td style="padding:7px 12px;color:#38bdf8;">DeepSeek V4 Flash</td><td style="padding:7px 12px;color:#cdd7e2;"><strong>Kitchen shears</strong> — lightweight, versatile. Summaries, verdicts, post-processing. Fast and cheap.</td></tr>
|
||||||
</tbody>
|
</tbody>
|
||||||
</table>
|
</table>
|
||||||
|
|
||||||
<div class="callout callout--amber">
|
<div class="callout callout--amber">
|
||||||
<span class="callout__kicker">Why DeepSeek?</span>
|
<span class="callout__kicker">Why DeepSeek?</span>
|
||||||
<p>Two reasons. <strong>It's free</strong> — no usage limits, which matters when your game-builder reads 800-line files and writes 200-line diffs ten times a session. <strong>It's genuinely good at C# and Godot</strong> — I've had it write a full lighting module for our Godot fork by reading Unity API docs and adapting patterns. No agent had pulled that off before. DeepSeek can't do multimodal, so vision goes to Gemini — but for everything else, it's the chef's knife you reach for 90% of the time.</p>
|
<p>Two reasons. <strong>It's free</strong>: no usage limits, which matters when your game-builder reads 800-line files and writes 200-line diffs ten times a session. <strong>It's genuinely good at C# and Godot</strong>: I've had it write a full lighting module for our Godot fork by reading Unity API docs and adapting patterns. No agent had pulled that off before. DeepSeek V4 now supports multimodal input, but for vision we use MiniMax latest, which is sharper at frame-by-frame animation judging and costs less per image. For everything else, DeepSeek is the chef's knife you reach for 90% of the time.</p>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
<p>The point of the knife rack: you configure this <strong>once</strong>. Every agent declares <code>model: @coding</code> and gets DeepSeek V4 automatically. Swap models globally without touching any flow or agent file. The right blade, every time, no thinking required.</p>
|
<p>The point of the knife rack: you configure this <strong>once</strong>. Every agent declares <code>model: @coding</code> and gets DeepSeek V4 automatically. Swap models globally without touching any flow or agent file. The right blade, every time, no thinking required.</p>
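The knife rack idea can be sketched as a single alias table plus a resolver. This is an illustration under assumptions, not the post's actual configuration format: `MODEL_ALIASES` and `resolveModel` are hypothetical names, and the table only lists the aliases whose models are stated in the text and the table above.

```javascript
// Hypothetical alias table; the real configuration format is not shown in the post.
const MODEL_ALIASES = {
  "@coding": "DeepSeek V4",
  "@planning": "DeepSeek V4",
  "@research": "DeepSeek V4",
  "@fast": "DeepSeek V4 Flash",
  "@compact": "DeepSeek V4 Flash",
};

// Resolve an agent's `model:` declaration to a concrete model name.
function resolveModel(declared, aliases = MODEL_ALIASES) {
  if (!declared.startsWith("@")) return declared; // already a concrete model name
  const model = aliases[declared];
  if (model === undefined) throw new Error(`unknown model alias: ${declared}`);
  return model;
}

// Agents keep declaring "@coding"; only the table changes when models are swapped.
const codingModel = resolveModel("@coding");
```

Swapping a model globally then means editing one entry in the table, while every agent file keeps its `model: @coding` line unchanged.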
|
||||||
|
|||||||
|
|
@@ -24,7 +24,7 @@ Every agent harness, regardless of domain, needs five things:
|
|||||||
|
|
||||||
**Tools.** What the agent can actually do beyond generating text. A CLI that takes screenshots, checks service health, and loads project context. API wrappers for git, CI, image generation. Without tools, the agent is a very articulate oracle that can't touch anything.
|
**Tools.** What the agent can actually do beyond generating text. A CLI that takes screenshots, checks service health, and loads project context. API wrappers for git, CI, image generation. Without tools, the agent is a very articulate oracle that can't touch anything.
|
||||||
|
|
||||||
**Context.** Which project this is. Who's asking. What machine they're on. What services are reachable. A single CLI call — `tinqs identity` — returns all of this in 100ms. No re-reading the README. No "what repo are we in?"
|
**Context.** Which project this is. Who's asking. What machine they're on. What services are reachable. A single CLI call — `tstudio identity` — returns all of this in 100ms. No re-reading the README. No "what repo are we in?"
|
||||||
|
|
||||||
**Guardrails.** What the agent must never do. No merging to main without review. No pushing to public repos without approval. No running destructive commands. The harness enforces these at the platform layer, not in the prompt. Prompts can be ignored. Platform gates cannot.
|
**Guardrails.** What the agent must never do. No merging to main without review. No pushing to public repos without approval. No running destructive commands. The harness enforces these at the platform layer, not in the prompt. Prompts can be ignored. Platform gates cannot.
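To make "enforced at the platform layer" concrete, here is a minimal sketch of a gate that runs outside the model. Everything in it is an assumption for illustration: the `DENY_RULES` list, the `checkAction` function, and the action fields (`type`, `target`, `reviewed`, `visibility`, `approved`, `destructive`) are hypothetical and do not describe the real harness. The three rules mirror the three examples in the paragraph above.

```javascript
// Hypothetical deny rules, one per example in the text.
const DENY_RULES = [
  {
    name: "merge-to-main-without-review",
    blocks: (a) => a.type === "merge" && a.target === "main" && !a.reviewed,
  },
  {
    name: "public-push-without-approval",
    blocks: (a) => a.type === "push" && a.visibility === "public" && !a.approved,
  },
  {
    name: "destructive-command",
    blocks: (a) => a.type === "shell" && a.destructive === true,
  },
];

// The gate runs outside the model: the agent proposes, the platform decides.
function checkAction(action, rules = DENY_RULES) {
  const hit = rules.find((rule) => rule.blocks(action));
  return hit ? { allowed: false, rule: hit.name } : { allowed: true, rule: null };
}

const blocked = checkAction({ type: "merge", target: "main", reviewed: false });
const allowed = checkAction({ type: "merge", target: "main", reviewed: true });
```

The design point is that the check sits in code the agent cannot rewrite or talk its way past, so a prompt that is ignored still cannot produce a forbidden action.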
|
||||||
|
|
||||||
@@ -44,7 +44,7 @@ LangChain, CrewAI, and AutoGen are built for web apps. They assume text-in, text
|
|||||||
|
|
||||||
Our harness runs on [Tinqs Studio](https://tinqs.com), built on a Gitea fork with game-specific features. The key pieces:
|
Our harness runs on [Tinqs Studio](https://tinqs.com), built on a Gitea fork with game-specific features. The key pieces:
|
||||||
|
|
||||||
**The CLI** — a single Go binary. One command (`tinqs identity`) gives the agent full project context in 100ms. Screenshots, cloud vision, health checks — all subcommands of the same binary.
|
**The CLI** — a single Go binary. One command (`tstudio identity`) gives the agent full project context in 100ms. Screenshots, cloud vision, health checks — all subcommands of the same binary.
|
||||||
|
|
||||||
**The soul file** — a markdown document in the repo root. The agent reads it on session start. It defines values, scope, and behavioural rules. The same soul file works in Cursor, Claude Code, or any tool that reads markdown.
|
**The soul file** — a markdown document in the repo root. The agent reads it on session start. It defines values, scope, and behavioural rules. The same soul file works in Cursor, Claude Code, or any tool that reads markdown.
|
||||||
|
|
||||||
|
|||||||
+5
-5
@@ -12,11 +12,11 @@ author_role: "CTO & Developer, Tinqs"
|
|||||||
---
|
---
|
||||||
Every AI agent session starts the same way: cold. The agent doesn't know what project this is, who's asking, what tools are available, or what happened yesterday. You spend the first five minutes re-explaining context.
|
Every AI agent session starts the same way: cold. The agent doesn't know what project this is, who's asking, what tools are available, or what happened yesterday. You spend the first five minutes re-explaining context.
|
||||||
|
|
||||||
Our CLI solves this in 100ms. One command — `tinqs identity` — and the agent knows everything. The binary is 15MB, has zero runtime dependencies, and runs on every machine in the studio.
|
Our CLI solves this in 100ms. One command — `tstudio identity` — and the agent knows everything. The binary is 15MB, has zero runtime dependencies, and runs on every machine in the studio.
|
||||||
|
|
||||||
## The identity command (100ms)
|
## The identity command (100ms)
|
||||||
|
|
||||||
When an agent starts, the first thing it calls is `tinqs identity`. The output:
|
When an agent starts, the first thing it calls is `tstudio identity`. The output:
|
||||||
|
|
||||||
- **Soul file** — the agent's persistent identity, values, operating principles
|
- **Soul file** — the agent's persistent identity, values, operating principles
|
||||||
- **Company context** — team members, roles, what the company does
|
- **Company context** — team members, roles, what the company does
|
||||||
@@ -26,7 +26,7 @@ When an agent starts, the first thing it calls is `tinqs identity`. The output:
|
|||||||
|
|
||||||
This data lives in markdown files in the docs repo. Any machine on the network can read it. The agent goes from blank to fully contextual in under a second.
|
This data lives in markdown files in the docs repo. Any machine on the network can read it. The agent goes from blank to fully contextual in under a second.
|
||||||
|
|
||||||
This started as a convenience tool for humans. It became the single most important function in our stack. Every agent session — Cursor, Claude Code, Pi — starts with `tinqs identity`. Without it, every conversation begins with "let me explain the project." With it, the agent already knows.
|
This started as a convenience tool for humans. It became the single most important function in our stack. Every agent session — Cursor, Claude Code, Pi — starts with `tstudio identity`. Without it, every conversation begins with "let me explain the project." With it, the agent already knows.
|
||||||
|
|
||||||
## Screenshots and cloud vision
|
## Screenshots and cloud vision
|
||||||
|
|
||||||
@@ -38,7 +38,7 @@ This is how you file bugs without typing. Look at the game, tell the agent what'
|
|||||||
|
|
||||||
## Health checks
|
## Health checks
|
||||||
|
|
||||||
`tinqs doctor` runs a comprehensive check:
|
`tstudio doctor` runs a comprehensive check:
|
||||||
|
|
||||||
- Is the git platform reachable and authenticated?
|
- Is the git platform reachable and authenticated?
|
||||||
- Is the game server running?
|
- Is the game server running?
|
||||||
@@ -55,7 +55,7 @@ Cross-compilation is trivial. We build Windows, Mac (arm64 + amd64), and Linux b
|
|||||||
|
|
||||||
## What we learned
|
## What we learned
|
||||||
|
|
||||||
**The CLI is the API for AI agents.** What started as a human convenience tool became the primary interface for agents. Every session starts with `tinqs identity`. The agent's "hands and eyes" — screenshots, vision, health checks — are subcommands of the same binary.
|
**The CLI is the API for AI agents.** What started as a human convenience tool became the primary interface for agents. Every session starts with `tstudio identity`. The agent's "hands and eyes" — screenshots, vision, health checks — are subcommands of the same binary.
|
||||||
|
|
||||||
**One binary beats ten scripts.** Scripts rot. They have different shells, different PATH assumptions, different error handling. A compiled binary either works or it doesn't. It ships with dependencies baked in. It doesn't care if your Python is 3.9 or 3.12.
|
**One binary beats ten scripts.** Scripts rot. They have different shells, different PATH assumptions, different error handling. A compiled binary either works or it doesn't. It ships with dependencies baked in. It doesn't care if your Python is 3.9 or 3.12.
|
||||||
|
|
||||||
|
|||||||
+5
-5
@@ -265,9 +265,9 @@
|
|||||||
<p class="post__lead">Every AI agent session starts the same way: cold. The agent doesn't know what project this is, who's asking, what tools are available, or what happened yesterday. You spend the first five minutes re-explaining context.</p>
|
<p class="post__lead">Every AI agent session starts the same way: cold. The agent doesn't know what project this is, who's asking, what tools are available, or what happened yesterday. You spend the first five minutes re-explaining context.</p>
|
||||||
|
|
||||||
<div class="post__body">
|
<div class="post__body">
|
||||||
<p>Our CLI solves this in 100ms. One command — <code>tinqs identity</code> — and the agent knows everything. The binary is 15MB, has zero runtime dependencies, and runs on every machine in the studio.</p>
|
<p>Our CLI solves this in 100ms. One command — <code>tstudio identity</code> — and the agent knows everything. The binary is 15MB, has zero runtime dependencies, and runs on every machine in the studio.</p>
|
||||||
<h2>The identity command (100ms)</h2>
|
<h2>The identity command (100ms)</h2>
|
||||||
<p>When an agent starts, the first thing it calls is <code>tinqs identity</code>. The output:</p>
|
<p>When an agent starts, the first thing it calls is <code>tstudio identity</code>. The output:</p>
|
||||||
<ul>
|
<ul>
|
||||||
<li><strong>Soul file</strong> — the agent's persistent identity, values, operating principles</li>
|
<li><strong>Soul file</strong> — the agent's persistent identity, values, operating principles</li>
|
||||||
<li><strong>Company context</strong> — team members, roles, what the company does</li>
|
<li><strong>Company context</strong> — team members, roles, what the company does</li>
|
||||||
@@ -276,13 +276,13 @@
|
|||||||
<li><strong>Service status</strong> — which URLs are live and reachable</li>
|
<li><strong>Service status</strong> — which URLs are live and reachable</li>
|
||||||
</ul>
|
</ul>
|
||||||
<p>This data lives in markdown files in the docs repo. Any machine on the network can read it. The agent goes from blank to fully contextual in under a second.</p>
|
<p>This data lives in markdown files in the docs repo. Any machine on the network can read it. The agent goes from blank to fully contextual in under a second.</p>
|
||||||
<p>This started as a convenience tool for humans. It became the single most important function in our stack. Every agent session — Cursor, Claude Code, Pi — starts with <code>tinqs identity</code>. Without it, every conversation begins with "let me explain the project." With it, the agent already knows.</p>
|
<p>This started as a convenience tool for humans. It became the single most important function in our stack. Every agent session — Cursor, Claude Code, Pi — starts with <code>tstudio identity</code>. Without it, every conversation begins with "let me explain the project." With it, the agent already knows.</p>
|
||||||
<h2>Screenshots and cloud vision</h2>
|
<h2>Screenshots and cloud vision</h2>
|
||||||
<p>The CLI can capture any window from outside the process. No in-game overlay, no rendering pipeline integration. OS-level capture — GDI+ on Windows, screencapture on Mac.</p>
|
<p>The CLI can capture any window from outside the process. No in-game overlay, no rendering pipeline integration. OS-level capture — GDI+ on Windows, screencapture on Mac.</p>
|
||||||
<p>A <code>photo</code> command sends the screenshot to a cloud vision model. The agent says "take a photo of the game" and gets back: "The player character is standing near a half-built hut. Three palm trees to the left. The terrain has a visible seam between two biomes."</p>
|
<p>A <code>photo</code> command sends the screenshot to a cloud vision model. The agent says "take a photo of the game" and gets back: "The player character is standing near a half-built hut. Three palm trees to the left. The terrain has a visible seam between two biomes."</p>
|
||||||
<p>This is how you file bugs without typing. Look at the game, tell the agent what's wrong. It takes a screenshot, describes what it sees, and creates an issue with both the description and the image attached. Keyboard-free bug reporting.</p>
|
<p>This is how you file bugs without typing. Look at the game, tell the agent what's wrong. It takes a screenshot, describes what it sees, and creates an issue with both the description and the image attached. Keyboard-free bug reporting.</p>
|
||||||
<h2>Health checks</h2>
|
<h2>Health checks</h2>
|
||||||
<p><code>tinqs doctor</code> runs a comprehensive check:</p>
|
<p><code>tstudio doctor</code> runs a comprehensive check:</p>
|
||||||
<ul>
|
<ul>
|
||||||
<li>Is the git platform reachable and authenticated?</li>
|
<li>Is the git platform reachable and authenticated?</li>
|
||||||
<li>Is the game server running?</li>
|
<li>Is the game server running?</li>
|
||||||
@@ -294,7 +294,7 @@
|
|||||||
<p>Go compiles to a single static binary. No Python virtualenvs, no Node.js version managers, no DLL hell on Windows. The same binary runs on a gaming PC, a designer's MacBook, and a CI runner in AWS.</p>
|
<p>Go compiles to a single static binary. No Python virtualenvs, no Node.js version managers, no DLL hell on Windows. The same binary runs on a gaming PC, a designer's MacBook, and a CI runner in AWS.</p>
|
||||||
<p>Cross-compilation is trivial. We build Windows, Mac (arm64 + amd64), and Linux binaries from a single CI workflow. Push a tag, CI builds all three, uploads to S3. The binary is 15MB, starts in under 100ms, has zero runtime dependencies.</p>
|
<p>Cross-compilation is trivial. We build Windows, Mac (arm64 + amd64), and Linux binaries from a single CI workflow. Push a tag, CI builds all three, uploads to S3. The binary is 15MB, starts in under 100ms, has zero runtime dependencies.</p>
|
||||||
<h2>What we learned</h2>
|
<h2>What we learned</h2>
|
||||||
<p><strong>The CLI is the API for AI agents.</strong> What started as a human convenience tool became the primary interface for agents. Every session starts with <code>tinqs identity</code>. The agent's "hands and eyes" — screenshots, vision, health checks — are subcommands of the same binary.</p>
|
<p><strong>The CLI is the API for AI agents.</strong> What started as a human convenience tool became the primary interface for agents. Every session starts with <code>tstudio identity</code>. The agent's "hands and eyes" — screenshots, vision, health checks — are subcommands of the same binary.</p>
|
||||||
<p><strong>One binary beats ten scripts.</strong> Scripts rot. They have different shells, different PATH assumptions, different error handling. A compiled binary either works or it doesn't. It ships with dependencies baked in. It doesn't care if your Python is 3.9 or 3.12.</p>
|
<p><strong>One binary beats ten scripts.</strong> Scripts rot. They have different shells, different PATH assumptions, different error handling. A compiled binary either works or it doesn't. It ships with dependencies baked in. It doesn't care if your Python is 3.9 or 3.12.</p>
|
||||||
<p><strong>Cloud vision is underrated for game dev.</strong> Sending a screenshot to a vision model sounds gimmicky. In practice, it's the fastest way to document visual bugs. "The tree is floating 2m above the terrain" is much faster to communicate when the AI is looking at the same screen.</p>
|
<p><strong>Cloud vision is underrated for game dev.</strong> Sending a screenshot to a vision model sounds gimmicky. In practice, it's the fastest way to document visual bugs. "The tree is floating 2m above the terrain" is much faster to communicate when the AI is looking at the same screen.</p>
|
||||||
<p><strong>Agent cold starts are the real problem.</strong> Without the identity system, every session starts with the agent asking "what project is this?" With it, the agent knows everything in 100ms. That's the difference between an AI assistant and an AI team member.</p>
|
<p><strong>Agent cold starts are the real problem.</strong> Without the identity system, every session starts with the agent asking "what project is this?" With it, the agent knows everything in 100ms. That's the difference between an AI assistant and an AI team member.</p>
|
||||||
|
|||||||
|
|