Compare commits
2 Commits
56795b680f...6cba781083
| Author | SHA1 | Date |
|---|---|---|
| | 6cba781083 | |
| | aaa788b29f | |
@@ -163,7 +163,7 @@
|
||||
<a href="pi-flow-native-brain" class="blog-card">
|
||||
<span class="blog-card__date">4 June 2026</span>
|
||||
<h2 class="blog-card__title">How Pi Agents Build, Test, and Ship Game Code with Oracle-Backed Flows</h2>
|
||||
<p class="blog-card__excerpt">We type a slash command, agents fan out through five oracle gates, the game-builder fixes 19 red tests while vision judges check the live game — and it all runs as one autonomous flow.</p>
|
||||
<p class="blog-card__excerpt">A flow spawns, agents fan out through five oracle gates, the game-builder fixes 19 red tests while vision judges check the live game — and it all runs as one autonomous flow.</p>
|
||||
<span class="blog-card__read">Read →</span>
|
||||
</a>
|
||||
{{CARDS}}
|
||||
|
||||
|
|
+2
-2
@@ -271,7 +271,7 @@
|
||||
<p><strong>Identity.</strong> Who the agent is, what it values, how it should behave. Not "you are a helpful assistant" — that's generic and unmoored. A soul file that says "you're working on Ariki, a survival colony sim. The team is four people. Never push to main without review. Prefer existing conventions." Identity creates consistency across sessions.</p>
|
||||
<p><strong>Memory.</strong> What happened last session. What decisions were made. What failed and why. Without memory, every conversation is a cold start — "let me explain the project..." Memory stored as markdown in git means it's version-controlled, diffable, and human-readable. When something goes wrong, you <code>git log</code> instead of debugging a vector database.</p>
|
||||
<p><strong>Tools.</strong> What the agent can actually do beyond generating text. A CLI that takes screenshots, checks service health, and loads project context. API wrappers for git, CI, image generation. Without tools, the agent is a very articulate oracle that can't touch anything.</p>
|
||||
<p><strong>Context.</strong> Which project this is. Who's asking. What machine they're on. What services are reachable. A single CLI call — <code>tstudio identity</code> — returns all of this in 100ms. No re-reading the README. No "what repo are we in?"</p>
|
||||
<p><strong>Context.</strong> Which project this is. Who's asking. What machine they're on. What services are reachable. A single CLI call — <code>tinqs identity</code> — returns all of this in 100ms. No re-reading the README. No "what repo are we in?"</p>
|
||||
<p><strong>Guardrails.</strong> What the agent must never do. No merging to main without review. No pushing to public repos without approval. No running destructive commands. The harness enforces these at the platform layer, not in the prompt. Prompts can be ignored. Platform gates cannot.</p>
|
||||
<h2>Why generic harnesses fail for game dev</h2>
|
||||
<p>LangChain, CrewAI, and AutoGen are built for web apps. They assume text-in, text-out. Game development is different in ways that break those assumptions:</p>
|
||||
@@ -281,7 +281,7 @@
|
||||
<p><strong>The team is small and cross-functional.</strong> Four people. No dedicated DevOps, no dedicated artist, no dedicated PM. The harness fills all those gaps, not just one.</p>
|
||||
<h2>The toolchain that makes it work</h2>
|
||||
<p>Our harness runs on <a href="https://tinqs.com" style="color: var(--c-accent-l);">Tinqs Studio</a>, built on a Gitea fork with game-specific features. The key pieces:</p>
|
||||
<p><strong>The CLI</strong> — a single Go binary. One command (<code>tstudio identity</code>) gives the agent full project context in 100ms. Screenshots, cloud vision, health checks — all subcommands of the same binary.</p>
|
||||
<p><strong>The CLI</strong> — a single Go binary. One command (<code>tinqs identity</code>) gives the agent full project context in 100ms. Screenshots, cloud vision, health checks — all subcommands of the same binary.</p>
|
||||
<p><strong>The soul file</strong> — a markdown document in the repo root. The agent reads it on session start. It defines values, scope, and behavioural rules. The same soul file works in Cursor, Claude Code, or any tool that reads markdown.</p>
|
||||
<p><strong>Skills</strong> — markdown playbooks for specific workflows. Image generation, concept art pipeline, 3D model creation, video generation. Each skill is a procedure the agent follows. Write once, use forever.</p>
|
||||
<p><strong>3D preview</strong> — click a <code>.glb</code> file in a PR and rotate the model in your browser. 22 formats supported. This alone transformed our review process — nobody approves a binary diff blind anymore.</p>
|
||||
|
||||
|
|
+8
-1
@@ -163,7 +163,14 @@
|
||||
<a href="pi-flow-native-brain" class="blog-card">
|
||||
<span class="blog-card__date">4 June 2026</span>
|
||||
<h2 class="blog-card__title">How Pi Agents Build, Test, and Ship Game Code with Oracle-Backed Flows</h2>
|
||||
<p class="blog-card__excerpt">We type a slash command, agents fan out through five oracle gates, the game-builder fixes 19 red tests while vision judges check the live game — and it all runs as one autonomous flow.</p>
|
||||
<p class="blog-card__excerpt">A flow spawns, agents fan out through five oracle gates, the game-builder fixes 19 red tests while vision judges check the live game — and it all runs as one autonomous flow.</p>
|
||||
<span class="blog-card__read">Read →</span>
|
||||
</a>
|
||||
|
||||
<a href="voice-missing-input-game-dev" class="blog-card">
|
||||
<span class="blog-card__date">10 June 2026</span>
|
||||
<h2 class="blog-card__title">Why Voice Is the Missing Input for Game Development</h2>
|
||||
<p class="blog-card__excerpt">Speaking a bug while looking at the screen beats typing it from memory ten minutes later. Here's how voice-to-agent pipelines work, why game dev is the ideal use case, and what changes when you stop typing bug reports.</p>
|
||||
<span class="blog-card__read">Read →</span>
|
||||
</a>
|
||||
|
||||
|
||||
|
|
+68
-70
@@ -372,11 +372,11 @@
|
||||
|
||||
<div class="callout">
|
||||
<span class="callout__kicker">The Kitchen ↔ Flows Analogy</span>
|
||||
<p><strong>The kitchen</strong> = Pi (the agent harness). <strong>The recipe</strong> = a flow YAML (the DAG). <strong>The line cooks</strong> = agents (each with a station and tools). <strong>The pass</strong> = the flow engine (routes finished work). <strong>The head chef's inspection</strong> = the five gates. <strong>The order ticket</strong> = a slash command. <strong>"Send it back!"</strong> = the fix loop.</p>
|
||||
<p><strong>The kitchen</strong> = Pi (the agent harness). <strong>The recipe</strong> = a JavaScript flow (<code>.flow.mjs</code>). <strong>The line cooks</strong> = agents (each with a station and tools). <strong>The pass</strong> = the flow engine (routes finished work). <strong>The head chef's inspection</strong> = the five gates. <strong>The order ticket</strong> = a spawn task or <code>tinqs flow run</code>. <strong>"Send it back!"</strong> = the fix loop.</p>
|
||||
</div>
|
||||
|
||||
<h2>What Happens When You Type a Slash Command</h2>
|
||||
<p>You type <code>/game-feature add a double-jump with cooldown</code> and hit enter. The ticket hits the kitchen. What follows is not one agent doing everything — it's a brigade running their stations.</p>
|
||||
<h2>What Happens When You Spawn a Flow</h2>
|
||||
<p>You run <code>tinqs flow run game-feature --task 'add a double-jump with cooldown'</code> or click Run Flow on the dashboard. The ticket hits the kitchen. What follows is not one agent doing everything — it's a brigade running their stations.</p>
|
||||
|
||||
<figure style="margin:28px 0;">
|
||||
<svg viewBox="0 0 920 350" role="img" aria-label="The verify-heavy flow: context, plan, implement, five gates, a Reflexion loop, and one judge" style="width:100%;height:auto;display:block;background:#0a0e14;border:1px solid #2a3340;border-radius:12px;font-family:'IBM Plex Sans',system-ui,sans-serif;">
|
||||
@@ -467,7 +467,7 @@
|
||||
</div>
|
||||
|
||||
<h2>Composability: Adding a New Station</h2>
|
||||
<p>A kitchen doesn't redesign the whole line when they add a new dish. They add a station. Same in flows. Started with three gates — build, test, vision. Behaviour and feel came later, each a single-file extension. Gates aren't hardcoded. They're sub-agents declared in YAML. Want a linting gate? Add a sub-agent with a linter. Security scan? Same pattern. Asset bundle size check? Write the tool, declare the agent, wire it in.</p>
|
||||
<p>A kitchen doesn't redesign the whole line when they add a new dish. They add a station. Same in flows. Started with three gates — build, test, vision. Behaviour and feel came later, each a single-file extension. Gates aren't hardcoded. They're sub-agents called from JavaScript flows. Want a linting gate? Add an <code>agent()</code> call with a linter. Security scan? Same pattern. Asset bundle size check? Write the tool, declare the agent, wire it in.</p>
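<p>A minimal sketch of what "add a station" could look like when gates are plain <code>agent()</code> calls. The <code>flow</code> object here is a stand-in stub, not the real engine, and <code>lint-gate</code> is a hypothetical agent name used only for illustration:</p>

```javascript
// Stand-in for the flow engine. The real `flow` API is assumed, not documented here.
const flow = {
  agent: async (name, inputs = {}) => ({ agent: name, verdict: "pass", inputs }),
  parallel: (calls) => Promise.all(calls),
};

// Gate agents named in the game-feature flow.
const GATES = [
  "build-verifier",
  "test-runner",
  "behavioral-prober",
  "feel-judge",
  "animation-vision-judge",
];

// Adding a station is one more name in the list; nothing else changes.
async function runGates(source, extraGates = []) {
  const names = [...GATES, ...extraGates];
  return flow.parallel(names.map((name) => flow.agent(name, { source })));
}

// "lint-gate" is hypothetical: a sub-agent that would wrap a linter.
runGates("src/player.js", ["lint-gate"]).then((results) => {
  console.log(results.map((r) => `${r.agent}: ${r.verdict}`).join("\n"));
});
```

<p>Because the gate list is data, a security scan or bundle-size check would follow the same shape: write the tool, declare the agent, append its name.</p>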
|
||||
|
||||
<div class="callout callout--purple">
|
||||
<span class="callout__kicker">Self-Improving Kitchen</span>
|
||||
@@ -538,17 +538,17 @@
|
||||
|
||||
<div class="callout callout--amber">
|
||||
<span class="callout__kicker">Flow 1 · 4 June, 18:32</span>
|
||||
<p><strong>/deep-implement</strong> — "Build the tinqs-gitea-read extension: list_org_repos, read_repo_file, list_repo_dir, search_repos." Nine steps, 14 minutes. Verdict: <span class="gate gate--test">PASS</span>. 31/31 vitest tests green, zero new TypeScript errors, session-level caching, path traversal protection. Every <code>execute()</code> body fully wired — no stubs, no placeholders. Like a saucier who doesn't just list ingredients but actually makes the sauce.</p>
|
||||
<p><strong>deep-implement</strong> — "Build the tinqs-gitea-read extension: list_org_repos, read_repo_file, list_repo_dir, search_repos." Nine steps, 14 minutes. Verdict: <span class="gate gate--test">PASS</span>. 31/31 vitest tests green, zero new TypeScript errors, session-level caching, path traversal protection. Every <code>execute()</code> body fully wired — no stubs, no placeholders. Like a saucier who doesn't just list ingredients but actually makes the sauce.</p>
|
||||
</div>
|
||||
|
||||
<div class="callout callout--purple">
|
||||
<span class="callout__kicker">Flow 2 · 4 June, 19:04</span>
|
||||
<p><strong>/game-feature</strong> — "Make the player jump." Build: <span class="gate gate--build">PASS</span>. Tests: <span class="gate gate--test">PASS</span>. Behaviour/Feel/Visual: <span style="color:#f59e0b;">NOT RUN</span> — no live game instance was reachable. The flow didn't silently skip the visual gate. It <strong>hard-stopped</strong> and reported honestly: "FAIL — the feature has not been verified in-game." This is the kitchen saying: "The dish is cooked, but nobody tasted it. I'm not sending it out."</p>
|
||||
<p><strong>game-feature</strong> — "Make the player jump." Build: <span class="gate gate--build">PASS</span>. Tests: <span class="gate gate--test">PASS</span>. Behaviour/Feel/Visual: <span style="color:#f59e0b;">NOT RUN</span> — no live game instance was reachable. The flow didn't silently skip the visual gate. It <strong>hard-stopped</strong> and reported honestly: "FAIL — the feature has not been verified in-game." This is the kitchen saying: "The dish is cooked, but nobody tasted it. I'm not sending it out."</p>
|
||||
</div>
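<p>The hard-stop behaviour can be sketched as a verdict function in which a gate that did not run counts against the feature exactly like a gate that failed. The result shape and status strings below are assumptions for illustration, not the engine's actual format:</p>

```javascript
// Gate ids as used in the game-feature flow; the status strings are assumed.
const REQUIRED_GATES = ["build", "tests", "behavior", "feel", "visual"];

function overallVerdict(results) {
  // A gate with no recorded result is treated as "not run", never as a pass.
  const notRun = REQUIRED_GATES.filter(
    (gate) => !(gate in results) || results[gate] === "not_run"
  );
  const failed = REQUIRED_GATES.filter((gate) => results[gate] === "fail");
  const verdict = notRun.length === 0 && failed.length === 0 ? "PASS" : "FAIL";
  return { verdict, failed, notRun };
}

// Flow 2 above: build and tests passed, the live-game gates never ran.
const flow2 = overallVerdict({ build: "pass", tests: "pass" });
console.log(flow2.verdict, flow2.notRun);
```

<p>The design choice is that silence is never read as success: only an explicit pass from every required gate yields <code>PASS</code>.</p>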
|
||||
|
||||
<div class="callout callout--amber">
|
||||
<span class="callout__kicker">Flow 3 · 4 June, 19:49</span>
|
||||
<p><strong>/cto-infra</strong> — "Synthesize cost, stability, and VCS research into an AWS architecture decision." Four research streams fed into one CTO agent. Output: 14 requirements mapped to specific decisions, cost-vs-stability tradeoffs resolved with dollar figures, EC2+EBS over Fargate+EFS, RDS Multi-AZ mandatory, S3+CloudFront for LFS. Like an executive chef reading four menu proposals, reconciling them into one service, and pricing every plate.</p>
|
||||
<p><strong>cto-infra</strong> — "Synthesize cost, stability, and VCS research into an AWS architecture decision." Four research streams fed into one CTO agent. Output: 14 requirements mapped to specific decisions, cost-vs-stability tradeoffs resolved with dollar figures, EC2+EBS over Fargate+EFS, RDS Multi-AZ mandatory, S3+CloudFront for LFS. Like an executive chef reading four menu proposals, reconciling them into one service, and pricing every plate.</p>
|
||||
</div>
|
||||
|
||||
<hr class="accent">
|
||||
@@ -556,61 +556,59 @@
|
||||
<h2>Dinner Rush Recovery: The Crash That Interrupted Service</h2>
|
||||
<p>Earlier today, a machine crash cut off a flow mid-stream — the kitchen lost power during dinner rush. Nineteen tests were left red. Contracts written, implementation half-done. Half-cooked dishes on every station.</p>
|
||||
|
||||
<p>I typed one slash command — the expediter reassembled the brigade:</p>
|
||||
<p>I spawned the same flow with a different task:</p>
|
||||
|
||||
<pre><code>/game-feature Finish the leftover jump & locomotion animation work — make the 19 FAILING tests GREEN.</code></pre>
|
||||
<pre><code>tinqs flow run game-feature --task 'Finish the leftover jump & locomotion animation work -- make the 19 FAILING tests GREEN.'</code></pre>
|
||||
|
||||
<p>What happened next: the team picked up exactly where the crash left off. Here's the recipe — the exact YAML that runs in production:</p>
|
||||
<p>What happened next: the team picked up exactly where the crash left off. Here's the recipe — the exact JavaScript that runs in production:</p>
|
||||
|
||||
<pre><code>name: game-feature
|
||||
description: Build a PLAYABLE game feature and prove it in the LIVE game.
|
||||
task_required: true
|
||||
<pre><code>// .pi/flows/flows/game-feature.flow.mjs
|
||||
export const meta = {
|
||||
name: "game-feature",
|
||||
description: "Build a PLAYABLE game feature and prove it in the LIVE game.",
|
||||
task_required: true
|
||||
};
|
||||
|
||||
steps:
|
||||
# G0: Pre-flight — validate vision CAN run before any build work
|
||||
- id: preflight
|
||||
agent: vision-preflight
|
||||
task: Check GEMINI_API_KEY is set AND game_frames reaches a live instance.
|
||||
If EITHER fails, STOP — vision is not optional.
|
||||
export default async function run({ task, flow }) {
|
||||
// G0: Pre-flight — validate vision CAN run before any build work
|
||||
await flow.agent("vision-preflight", {
|
||||
task: "Check GEMINI_API_KEY is set AND game_frames reaches a live instance."
|
||||
});
|
||||
|
||||
# Context + plan
|
||||
- id: context
|
||||
agent: project-context-reader
|
||||
blockedBy: [preflight]
|
||||
// Context + plan
|
||||
const context = await flow.agent("project-context-reader");
|
||||
const plan = await flow.agent("feature-planner", { context });
|
||||
|
||||
- id: plan
|
||||
agent: feature-planner
|
||||
blockedBy: [context]
|
||||
// TDD: write tests FIRST (different agent than implementer)
|
||||
const testSuite = await flow.agent("test-author", { plan });
|
||||
|
||||
# TDD: write tests FIRST (different agent than implementer)
|
||||
- id: test-author
|
||||
agent: test-author
|
||||
blockedBy: [plan]
|
||||
// Implement
|
||||
const source = await flow.agent("game-builder", { testSuite, plan });
|
||||
|
||||
- id: implement
|
||||
agent: game-builder
|
||||
blockedBy: [test-author]
|
||||
// G1–G5: Oracle gates run via parallel for speed
|
||||
const gates = await flow.parallel([
|
||||
flow.agent("build-verifier", { source }),
|
||||
flow.agent("test-runner", { source }),
|
||||
flow.agent("behavioral-prober", { source }),
|
||||
flow.agent("feel-judge", { source }),
|
||||
flow.agent("animation-vision-judge", { source })
|
||||
]);
|
||||
|
||||
# G1–G5: Oracle gates (build, tests, behaviour, feel, visual)
|
||||
- id: build → agent: build-verifier
|
||||
- id: tests → agent: test-runner
|
||||
- id: behavior → agent: behavioral-prober (drives LIVE game via drive_game)
|
||||
- id: feel → agent: feel-judge (apex, airtime, latency, rise/fall)
|
||||
- id: visual → agent: animation-vision-judge (multimodal gemini-2.5-flash)
|
||||
// Self-recurring fix-loop: bounded loop back to implement with evidence
|
||||
const MAX_RETRIES = 3;
|
||||
for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
|
||||
const decision = await flow.agent("flow-decision", { gates });
|
||||
if (decision.verdict === "pass") break;
|
||||
if (attempt === MAX_RETRIES) {
|
||||
const fixed = await flow.agent("game-builder", { source, failures: decision.evidence });
|
||||
}
|
||||
}
|
||||
|
||||
# Self-recurring fix-loop: bounded loop back to implement with evidence
|
||||
- id: fix-loop
|
||||
type: agent-loop-decision
|
||||
agent: flow-decision
|
||||
loop_target: implement
|
||||
exit_target: report
|
||||
max_iterations: 3
|
||||
// Final judge: one honest verdict
|
||||
return flow.agent("game-judge");
|
||||
}</code></pre>
|
||||
|
||||
# Final judge: one honest verdict
|
||||
- id: report
|
||||
agent: game-judge</code></pre>
|
||||
|
||||
<p>Eighteen steps, seven cooks, five inspection points, one head chef. Triggered by a single order ticket.</p>
|
||||
<p>Eight logical steps, seven cooks, five inspection points, one head chef. Triggered by a single spawn.</p>
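<p>A bounded fix loop generally has to re-run the gates after every fix and stop fixing once the retry budget is spent. The sketch below shows one way to structure that. The callback names (<code>runGates</code>, <code>decide</code>, <code>fix</code>) and their return shapes are assumptions, and the stubs only simulate agents:</p>

```javascript
// Assumed shapes: runGates(source) resolves to gate results,
// decide(gates) to { verdict, evidence }, fix(source, evidence) to new source.
async function fixLoop({ source, runGates, decide, fix, maxRetries = 3 }) {
  let current = source;
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    const gates = await runGates(current); // fresh evidence on every pass
    const decision = await decide(gates);
    if (decision.verdict === "pass") {
      return { verdict: "pass", attempts: attempt, source: current };
    }
    // Only fix while another verification pass remains to check the fix.
    if (attempt < maxRetries) {
      current = await fix(current, decision.evidence);
    }
  }
  return { verdict: "fail", attempts: maxRetries, source: current };
}

// Stubbed agents: each fix turns up to 10 red tests green.
const stubs = {
  runGates: async (src) => [{ gate: "tests", red: src.red }],
  decide: async (gates) => {
    const red = gates[0].red;
    return red === 0
      ? { verdict: "pass", evidence: [] }
      : { verdict: "fail", evidence: [`${red} tests red`] };
  },
  fix: async (src) => ({ red: Math.max(0, src.red - 10) }),
};

fixLoop({ source: { red: 19 }, ...stubs }).then((r) => {
  console.log(r.verdict, "after", r.attempts, "attempts");
});
```

<p>Returning the final source and attempt count alongside the verdict gives the final judge something concrete to report on, whichever way the loop exits.</p>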
|
||||
|
||||
<p>Here's how the brigade actually worked. The <strong>vision-preflight</strong> agent — the chef who checks the gas is on before anyone starts cooking — verified <code>GEMINI_API_KEY</code> was set and <code>game_frames</code> could reach the live game. Both green in under a second. Without this, the whole kitchen would prep for an hour only to discover the oven doesn't work.</p>
|
||||
|
||||
@@ -624,29 +622,29 @@ steps:
|
||||
|
||||
<div class="callout">
|
||||
<span class="callout__kicker">Not a Demo</span>
|
||||
<p>This flow is a file at <code>.pi/flows/flows/game-feature.yaml</code>. I trigger it by typing <code>/game-feature</code> in Pi. It dispatches agents, runs gates, loops on failures, reports a verdict. There is no dashboard with drag-and-drop. There is a YAML file and a slash command. That's the whole product.</p>
|
||||
<p>This flow is a file at <code>.pi/flows/flows/game-feature.flow.mjs</code>. I trigger it by running <code>tinqs flow run game-feature</code> or clicking Run Flow on the dashboard. It dispatches agents, runs gates, loops on failures, reports a verdict. The dashboard at <code>:33634</code> is the control plane — spawn, steer mid-run, inspect state. That's the whole product.</p>
|
||||
</div>
|
||||
|
||||
<hr class="accent">
|
||||
|
||||
<h2>The Menu: Flows Are Slash Commands</h2>
|
||||
<p>Every flow becomes a slash command — the menu you read to the expediter. <code>.pi/flows/flows/game-feature.yaml</code> → <code>/game-feature</code>. You don't invoke a pipeline from a terminal. You order a dish in conversation.</p>
|
||||
<h2>The Menu: Flows at Your Fingertips</h2>
|
||||
<p>Every flow lives in <code>.pi/flows/flows/*.flow.mjs</code> and is spawnable by name. You run <code>tinqs flow run &lt;name&gt; [task]</code> or click Run Flow on the dashboard.</p>
|
||||
|
||||
<p>"Add wall-running" is not a CLI flag. It's natural language. The flow reads it, wires it through the agents, routes it through the gates. The YAML is the recipe. The conversation is the context.</p>
|
||||
<p>"Add wall-running" becomes the task argument. The flow reads it, wires it through the agents, routes it through the gates. The JavaScript is the recipe. The conversation provides the context.</p>
|
||||
|
||||
<p>The menu I call from daily:</p>
|
||||
|
||||
<ul>
|
||||
<li><strong>/game-feature</strong> — "add a double-jump" or "fix the 19 red tests" → brigade assembles, cooks, inspects, plates</li>
|
||||
<li><strong>/deep-implement</strong> — "build the gitea-read extension" → research → plan → implement → test → review → judge</li>
|
||||
<li><strong>/cto-infra</strong> — "reconcile cost, stability, and VCS research into architecture decisions" → 4 research streams → 1 synthesis agent → 14 requirements mapped to decisions</li>
|
||||
<li><strong>/flows:new</strong> — "I need a flow that..." → the Flow Architect reads the agent catalog, selects cooks, designs the recipe, writes the YAML</li>
|
||||
<li><strong>game-feature</strong> — "add a double-jump" or "fix the 19 red tests" → brigade assembles, cooks, inspects, plates</li>
|
||||
<li><strong>deep-implement</strong> — "build the gitea-read extension" → research → plan → implement → test → review → judge</li>
|
||||
<li><strong>cto-infra</strong> — "reconcile cost, stability, and VCS research into architecture decisions" → 4 research streams → 1 synthesis agent → 14 requirements mapped to decisions</li>
|
||||
<li><strong>flows:new</strong> — "I need a flow that..." → the Flow Architect reads the agent catalog, selects cooks, designs the recipe, writes the <code>.flow.mjs</code></li>
|
||||
</ul>
|
||||
|
||||
<h2>The Pass: How Agents Hand Off Work</h2>
|
||||
<p>In a real kitchen, cooks don't shout instructions across the room. They place finished plates on the pass. The expediter reads the ticket, checks the plate, routes it to the next station or to the dining room. Nobody yells. Nobody grabs someone else's pan.</p>
|
||||
|
||||
<p>Flows work the same way. Agents never talk to each other directly. When the game-builder finishes, it doesn't ping the test-runner. It calls <code>finish({ summary: "...", artifacts: "...", files: "..." })</code> — placing its work on the pass. The flow engine — the expediter — records it and routes it. The next agent receives exactly the inputs wired in the YAML: <code>${{result.game-builder.summary}}</code>, <code>${{result.game-builder.files}}</code>.</p>
|
||||
<p>Flows work the same way. Agents never talk to each other directly. When the game-builder finishes, it returns a result object, placing its work on the pass. The flow engine (the expediter) records it and routes it: the flow gets the result back from <code>await flow.agent("game-builder")</code> and passes it to the next agent as an input.</p>
|
||||
|
||||
<div class="kitchen-grid">
|
||||
<div class="kitchen-col">
|
||||
@@ -655,11 +653,11 @@ steps:
|
||||
</div>
|
||||
<div class="kitchen-col">
|
||||
<span class="kitchen-col__title kitchen-col__title--reality">What Actually Happens</span>
|
||||
<p>Agent A → <code>finish({verdict: "pass", findings: ["coyote_time=100ms"]})</code> → engine records → Agent B receives <code>${{result.A.findings}}</code> via <code>inputs:</code> block. No chatter. Structured handoff.</p>
|
||||
<p>Agent A returns <code>{ verdict: "pass", findings: ["coyote_time=100ms"] }</code> → flow engine records it → the flow gets it back from <code>await flow.agent("A")</code> and passes it to Agent B as an input. No chatter. Structured handoff.</p>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<p>Why? Because unstructured chatter is how hallucination cascades start. Agent A confidently states something wrong. Agent B builds on it. Agent C compounds it. Three agents later, they're collectively wrong about a file that doesn't exist, and nobody can trace where the error came from. The pass — structured result-passing with typed outputs — makes every handoff auditable, verifiable, and debuggable.</p>
|
||||
<p>Why? Because unstructured chatter is how hallucination cascades start. Agent A confidently states something wrong. Agent B builds on it. Agent C compounds it. Three agents later, they're collectively wrong about a file that doesn't exist, and nobody can trace where the error came from. The pass — structured result-passing via typed return values from each <code>agent()</code> call — makes every handoff auditable, verifiable, and debuggable.</p>
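<p>One way to make a handoff checkable is to validate each result against a small contract before it reaches the next agent. This is a sketch only: the contract (a <code>verdict</code> of pass or fail plus a list of string findings) is inferred from the example result above, and <code>validateHandoff</code> is a hypothetical helper, not part of Pi:</p>

```javascript
// Hypothetical contract for a gate result: { verdict, findings }.
const VERDICTS = ["pass", "fail"];

function validateHandoff(result) {
  if (result === null || typeof result !== "object") {
    return ["result must be an object, not free text"];
  }
  const errors = [];
  if (!VERDICTS.includes(result.verdict)) {
    errors.push('verdict must be "pass" or "fail"');
  }
  const findings = result.findings;
  if (!Array.isArray(findings) || !findings.every((f) => typeof f === "string")) {
    errors.push("findings must be an array of strings");
  }
  return errors;
}

// A structured result passes; prose from a chatty agent does not.
console.log(validateHandoff({ verdict: "pass", findings: ["coyote_time=100ms"] }));
console.log(validateHandoff("Looks good to me, the jump feels great!"));
```

<p>Rejecting a malformed result at the pass means a wrong claim is caught at the handoff where it was made, rather than three agents later.</p>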
|
||||
|
||||
<p>Pi itself is built for solo interactive work: you ask, it does, you review. The orchestration layer I wrote on top inverts that. Pi becomes the kitchen. The flow engine becomes the expediter. Agents become line cooks who place plates on the pass, never shouting across the room.</p>
|
||||
|
||||
@@ -700,17 +698,17 @@ outputs: [summary, files]
|
||||
You are a game developer. Task: ${{task}}
|
||||
Context: ${{input.context}}</code></pre>
|
||||
|
||||
<p><strong style="color:#f59e0b;">Flows</strong> are YAML DAGs that wire agents together. I have about <strong>15–20 flows</strong> running across different domains:</p>
|
||||
<p><strong style="color:#f59e0b;">Flows</strong> are JavaScript modules (<code>.flow.mjs</code>) that coordinate agents with real control flow. I have about <strong>15–20 flows</strong> running across different domains:</p>
|
||||
|
||||
<ul>
|
||||
<li><strong>Game dev:</strong> /game-feature, /review, /bug-hunt, /refactor</li>
|
||||
<li><strong>Design:</strong> /concept-art, /sound-design (plans → ElevenLabs generation → judge evaluates with other models)</li>
|
||||
<li><strong>Marketing:</strong> /brand-image, /trailer-clip (Sora 2 video generation → vision judge)</li>
|
||||
<li><strong>Infra:</strong> /ci-fix, /deploy-check, /tstudio-jobs (action runners on AWS Lambda, workspace management)</li>
|
||||
<li><strong>Game dev:</strong> game-feature, review, bug-hunt, refactor</li>
|
||||
<li><strong>Design:</strong> concept-art, sound-design (plans → ElevenLabs generation → judge evaluates with other models)</li>
|
||||
<li><strong>Marketing:</strong> brand-image, trailer-clip (Sora 2 video generation → vision judge)</li>
|
||||
<li><strong>Infra:</strong> ci-fix, deploy-check, tinqs-jobs (action runners on AWS Lambda, workspace management)</li>
|
||||
<li><strong>Meta:</strong> A flow that periodically reads and improves the other flows — yes, flows that edit flows</li>
|
||||
</ul>
|
||||
|
||||
<p>The setup is not a product you install. It's a stack: Pi as the agent harness, custom extensions as the tool layer, markdown agents as the role layer, YAML flows as the orchestration layer. The whole thing lives in <code>.pi/flows/</code>. Version-controlled. CI-tested. Slash-command invoked.</p>
|
||||
<p>The setup is not a product you install. It's a stack: Pi as the agent harness, custom extensions as the tool layer, markdown agents as the role layer, JavaScript flows as the orchestration layer. The whole thing lives in <code>.pi/flows/</code>. Version-controlled. CI-tested. Spawned via <code>tinqs flow run</code> or the dashboard.</p>
|
||||
|
||||
<h2>The Recipe vs. The Technique</h2>
|
||||
<p>"Do you define the process with these trees, or do the agents freestyle?" Both. The recipe says what to make and in what order. The technique is how each cook executes their station.</p>
|
||||
@@ -718,7 +716,7 @@ Context: ${{input.context}}</code></pre>
|
||||
<div class="kitchen-grid">
|
||||
<div class="kitchen-col">
|
||||
<span class="kitchen-col__title kitchen-col__title--kitchen">The Recipe (Rigid)</span>
|
||||
<p>The flow YAML is the recipe. It says: first the prep cook dices onions, then the saucier makes the base, then the grill cook sears the protein. After every station, the plate hits the pass for inspection. <strong>This order is not negotiable.</strong> A cook cannot skip the inspection because they feel confident. The inspection runs. Period.</p>
|
||||
<p>The flow's JavaScript is the recipe. It says: first the prep cook dices onions, then the saucier makes the base, then the grill cook sears the protein. After every station, the plate hits the pass for inspection. <strong>This order is not negotiable.</strong> A cook cannot skip the inspection because they feel confident. The inspection runs. Period.</p>
|
||||
</div>
|
||||
<div class="kitchen-col">
|
||||
<span class="kitchen-col__title kitchen-col__title--reality">The Technique (Autonomous)</span>
|
||||
@@ -730,7 +728,7 @@ Context: ${{input.context}}</code></pre>
|
||||
|
||||
<div class="callout callout--purple">
|
||||
<span class="callout__kicker">The Meta-Kitchen</span>
|
||||
<p>And when a recipe is wrong? Another flow improves it. A meta-flow reads performance data, spots bottlenecks — "the feel gate keeps failing because the cook doesn't know the jump velocity threshold" — edits the YAML to wire that threshold into the builder's inputs, and commits the change. <strong>Flows that edit flows.</strong> The kitchen that renovates itself between services.</p>
|
||||
<p>And when a recipe is wrong? Another flow improves it. A meta-flow reads performance data, spots bottlenecks — "the feel gate keeps failing because the cook doesn't know the jump velocity threshold" — edits the <code>.flow.mjs</code> to pass that threshold into the builder's inputs, and commits the change. <strong>Flows that edit flows.</strong> The kitchen that renovates itself between services.</p>
|
||||
</div>
|
||||
|
||||
<hr class="accent">
|
||||
|
||||
|
|
@@ -24,7 +24,7 @@ Every agent harness, regardless of domain, needs five things:
|
||||
|
||||
**Tools.** What the agent can actually do beyond generating text. A CLI that takes screenshots, checks service health, and loads project context. API wrappers for git, CI, image generation. Without tools, the agent is a very articulate oracle that can't touch anything.
|
||||
|
||||
**Context.** Which project this is. Who's asking. What machine they're on. What services are reachable. A single CLI call — `tstudio identity` — returns all of this in 100ms. No re-reading the README. No "what repo are we in?"
|
||||
**Context.** Which project this is. Who's asking. What machine they're on. What services are reachable. A single CLI call — `tinqs identity` — returns all of this in 100ms. No re-reading the README. No "what repo are we in?"
|
||||
|
||||
**Guardrails.** What the agent must never do. No merging to main without review. No pushing to public repos without approval. No running destructive commands. The harness enforces these at the platform layer, not in the prompt. Prompts can be ignored. Platform gates cannot.
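As a rough illustration of "enforced at the platform layer", the sketch below checks a requested action against a rule list before anything executes. The action shape, rule ids, and the destructive-command pattern are all hypothetical and are not Tinqs Studio's actual implementation:

```javascript
// Hypothetical action shape: { type, branch, reviewed, repoVisibility, approved, command }.
const RULES = [
  {
    id: "no-unreviewed-merge",
    denies: (a) => a.type === "merge" && a.branch === "main" && !a.reviewed,
  },
  {
    id: "no-unapproved-public-push",
    denies: (a) => a.type === "push" && a.repoVisibility === "public" && !a.approved,
  },
  {
    id: "no-destructive-commands",
    // Illustrative pattern only; a real deny list would be broader.
    denies: (a) =>
      a.type === "exec" && /\brm\s+-rf\b|\bgit\s+push\s+--force\b/.test(a.command ?? ""),
  },
];

// The check runs outside the model, so a prompt cannot talk its way past it.
function checkAction(action) {
  const violated = RULES.filter((rule) => rule.denies(action)).map((rule) => rule.id);
  return { allowed: violated.length === 0, violated };
}

console.log(checkAction({ type: "merge", branch: "main", reviewed: false }));
console.log(checkAction({ type: "exec", command: "ls -la" }));
```

The point of the shape is that the agent only ever proposes an action; whether it runs is decided by code the agent cannot edit.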
|
||||
|
||||
@@ -44,7 +44,7 @@ LangChain, CrewAI, and AutoGen are built for web apps. They assume text-in, text
|
||||
|
||||
Our harness runs on [Tinqs Studio](https://tinqs.com), built on a Gitea fork with game-specific features. The key pieces:
|
||||
|
||||
**The CLI** — a single Go binary. One command (`tstudio identity`) gives the agent full project context in 100ms. Screenshots, cloud vision, health checks — all subcommands of the same binary.
|
||||
**The CLI** — a single Go binary. One command (`tinqs identity`) gives the agent full project context in 100ms. Screenshots, cloud vision, health checks — all subcommands of the same binary.
|
||||
|
||||
**The soul file** — a markdown document in the repo root. The agent reads it on session start. It defines values, scope, and behavioural rules. The same soul file works in Cursor, Claude Code, or any tool that reads markdown.
|
||||
|
||||
|
||||
+5
-5
@@ -12,11 +12,11 @@ author_role: "CTO & Developer, Tinqs"
|
||||
---
|
||||
Every AI agent session starts the same way: cold. The agent doesn't know what project this is, who's asking, what tools are available, or what happened yesterday. You spend the first five minutes re-explaining context.
|
||||
|
||||
Our CLI solves this in 100ms. One command — `tstudio identity` — and the agent knows everything. The binary is 15MB, has zero runtime dependencies, and runs on every machine in the studio.
|
||||
Our CLI solves this in 100ms. One command — `tinqs identity` — and the agent knows everything. The binary is 15MB, has zero runtime dependencies, and runs on every machine in the studio.
|
||||
|
||||
## The identity command (100ms)
|
||||
|
||||
When an agent starts, the first thing it calls is `tstudio identity`. The output:
|
||||
When an agent starts, the first thing it calls is `tinqs identity`. The output:
|
||||
|
||||
- **Soul file** — the agent's persistent identity, values, operating principles
|
||||
- **Company context** — team members, roles, what the company does
|
||||
@@ -26,7 +26,7 @@ When an agent starts, the first thing it calls is `tstudio identity`. The output
|
||||
|
||||
This data lives in markdown files in the docs repo. Any machine on the network can read it. The agent goes from blank to fully contextual in under a second.
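
To make the idea concrete, here is a minimal sketch of assembling an identity payload from markdown files. It is written in Python for brevity and is not the Go implementation described in this post; the `build_identity` function, the section names, and the one-file-per-section layout are all assumptions made for illustration.

```python
from pathlib import Path


def build_identity(docs_dir: Path, sections: list[str]) -> str:
    """Concatenate one markdown file per section into a single context blob.

    Assumes a hypothetical layout of <docs_dir>/<section>.md.
    """
    parts = []
    for name in sections:
        path = docs_dir / f"{name}.md"
        if path.exists():  # skip sections that have no file yet
            text = path.read_text(encoding="utf-8").strip()
            parts.append(f"# {name}\n\n{text}")
    return "\n\n".join(parts)
```

Calling `build_identity(Path("docs"), ["soul", "company"])` would return the two files joined under their own headings, ready to drop into an agent's context.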
|
||||
|
||||
This started as a convenience tool for humans. It became the single most important function in our stack. Every agent session — Cursor, Claude Code, Pi — starts with `tstudio identity`. Without it, every conversation begins with "let me explain the project." With it, the agent already knows.
|
||||
This started as a convenience tool for humans. It became the single most important function in our stack. Every agent session — Cursor, Claude Code, Pi — starts with `tinqs identity`. Without it, every conversation begins with "let me explain the project." With it, the agent already knows.
|
||||
|
||||
## Screenshots and cloud vision
|
||||
|
||||
@@ -38,7 +38,7 @@ This is how you file bugs without typing. Look at the game, tell the agent what'
|
||||
|
||||
## Health checks
|
||||
|
||||
`tstudio doctor` runs a comprehensive check:
|
||||
`tinqs doctor` runs a comprehensive check:
|
||||
|
||||
- Is the git platform reachable and authenticated?
|
||||
- Is the game server running?
|
||||
@@ -55,7 +55,7 @@ Cross-compilation is trivial. We build Windows, Mac (arm64 + amd64), and Linux b
|
||||
|
||||
## What we learned
|
||||
|
||||
**The CLI is the API for AI agents.** What started as a human convenience tool became the primary interface for agents. Every session starts with `tstudio identity`. The agent's "hands and eyes" — screenshots, vision, health checks — are subcommands of the same binary.
|
||||
**The CLI is the API for AI agents.** What started as a human convenience tool became the primary interface for agents. Every session starts with `tinqs identity`. The agent's "hands and eyes" — screenshots, vision, health checks — are subcommands of the same binary.
|
||||
|
||||
**One binary beats ten scripts.** Scripts rot. They have different shells, different PATH assumptions, different error handling. A compiled binary either works or it doesn't. It ships with dependencies baked in. It doesn't care if your Python is 3.9 or 3.12.
|
||||
|
||||
|
||||
@@ -0,0 +1,111 @@
|
||||
---
|
||||
title: "Why Voice Is the Missing Input for Game Development"
|
||||
slug: voice-missing-input-game-dev
|
||||
date: "2026-06-10"
|
||||
description: "Speaking a bug while you're looking at the screen beats typing it from memory ten minutes later. Voice-to-agent pipelines collapse the gap between noticing a problem and tracking it — and game dev is the perfect use case."
|
||||
og_description: "Voice is the missing input for game dev — speak bugs while you play, let agents file them."
|
||||
og_image: "https://www.tinqs.com/img/og-cover.jpg"
|
||||
excerpt: "Speaking a bug while looking at the screen beats typing it from memory ten minutes later. Here's how voice-to-agent pipelines work, why game dev is the ideal use case, and what changes when you stop typing bug reports."
|
||||
author: "Ozan Bozkurt"
|
||||
author_initials: "OB"
|
||||
author_role: "CTO & Developer, Tinqs"
|
||||
---
|
||||
Every game developer knows this moment. You're playtesting, running through the world, and you see something wrong — a tree floating two meters above the terrain, a UI element clipping, an animation that stutters on frame 14. You make a mental note. Ten minutes later, back at the editor, you try to file it. The coordinates are fuzzy. The exact reproduction steps are gone. You type something vague like "tree floating on west beach maybe" and hope you remember more tomorrow.
|
||||
|
||||
Voice changes this entirely. Speak the bug while you're looking at it, and an agent turns your words into a structured issue — with a screenshot, a vision-model description, coordinates, and a severity estimate. No keyboard. No context switch. No memory loss.
|
||||
|
||||
## The latency that kills bug reports
|
||||
|
||||
The distance between seeing a bug and filing it is a memory decay curve. Every second that passes, your recollection loses precision:
|
||||
|
||||
| Elapsed time | What you remember |
|
||||
|---|---|
|
||||
| 0 seconds | Exact position, camera angle, what you were doing, what's on screen |
|
||||
| 30 seconds | "There was a tree... somewhere west... maybe floating?" |
|
||||
| 5 minutes | "I think there was a rendering issue? Or was it yesterday?" |
|
||||
|
||||
Typed bug reports are reconstructions from decaying memory. Voice bug reports are real-time captures. The difference in quality isn't marginal: it's the difference between a report you can act on immediately and a ticket that sits in the backlog for three months while someone tries to reproduce it.
|
||||
|
||||
## The pipeline: voice → text → structured issue
|
||||
|
||||
Here's what actually happens when you speak a bug during playtesting:
|
||||
|
||||
```
|
||||
1. You speak: "There's a tree floating two meters above the terrain
|
||||
on the west beach, near the big rock formation. Happens after
|
||||
the vegetation culling pass kicks in around sunset."
|
||||
|
||||
2. Microphone → transcription (Whisper, local or API, ~500ms)
|
||||
|
||||
3. Transcription → agent context window (~100ms)
|
||||
|
||||
4. Agent parses the raw text and extracts:
|
||||
- What: tree floating above terrain
|
||||
- Where: west beach, near rock formation (camera coordinates auto-captured)
|
||||
- When: after vegetation culling, sunset
|
||||
- Severity: medium (visual, not blocking)
|
||||
- Screenshot: captured from the running game engine
|
||||
|
||||
5. Agent files a structured issue with all of the above,
|
||||
tags the rendering engineer, and posts the digest to team chat.
|
||||
|
||||
Total latency: under 2 seconds. You keep playing.
|
||||
```
|
||||
|
||||
This isn't theoretical. The pipeline runs on our own game project, and it's caught bugs that would have slipped through playtesting entirely — the ones you see, make a mental note about, and forget by the time you alt-tab.
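
As a rough illustration of step 5, the structured issue can be thought of as a small record that gets rendered into an issue body. This is a minimal sketch rather than our production code: the `BugReport` fields, the `render_issue` helper, and the sample values are hypothetical, and the extraction in step 4 is assumed to have been done by the agent already.

```python
from dataclasses import dataclass


@dataclass
class BugReport:
    """Hypothetical record of what the agent extracted in step 4."""
    what: str
    where: str
    when: str
    severity: str
    position: tuple[float, float, float]  # world coordinates from the engine
    screenshot: str  # path or URL of the captured frame


def render_issue(report: BugReport) -> str:
    """Render the report as a markdown issue body."""
    x, y, z = report.position
    return "\n".join([
        f"## {report.what}",
        "",
        f"- **Where:** {report.where} (x={x}, y={y}, z={z})",
        f"- **When:** {report.when}",
        f"- **Severity:** {report.severity}",
        "",
        f"![screenshot]({report.screenshot})",
    ])


report = BugReport(
    what="Tree floating above terrain",
    where="west beach, near rock formation",
    when="after vegetation culling, sunset",
    severity="medium",
    position=(-412.0, 3.5, 88.0),  # made-up sample values
    screenshot="screenshots/floating-tree.png",
)
print(render_issue(report))
```

Keeping the record separate from the rendering means the same report can be posted to an issue tracker and summarized in team chat without re-parsing the transcript.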
|
||||
|
||||
## Why game dev is the perfect voice use case
|
||||
|
||||
**You're already looking at the screen.** Voice input doesn't require switching windows or breaking flow. You're playtesting — your hands are on the controller or WASD, your eyes are on the game. Speaking is the only input channel that doesn't interrupt the thing you're actually doing.
|
||||
|
||||
**Game bugs are spatial and visual.** "The crafting UI text overflows on items with names longer than 20 characters" is something you see, not something you calculate. Describing it verbally while looking at it produces a far richer bug report than typing from memory.
|
||||
|
||||
**Reproduction is half the battle.** When you speak the bug at the moment of occurrence, you naturally include the context: what you were doing, what just happened, what the game state was. You don't have to reconstruct it later.
|
||||
|
||||
**Voice scales to the whole team.** Artists see visual bugs. Designers see balance issues. Producers see UX friction. Not everyone on a game team is a fast typist or comfortable with issue trackers. Everyone can speak.
|
||||
|
||||
## What the agent adds beyond transcription
|
||||
|
||||
Raw transcription is useful — it's a notepad you don't have to type. But the agent layer is what makes voice input a pipeline rather than a dictation tool:
|
||||
|
||||
**Screenshot coordination.** The agent calls the game engine's HTTP API, captures the current frame, and attaches it to the issue. You don't take screenshots. The agent does.
|
||||
|
||||
**Vision model description.** The screenshot goes through a vision model that writes a text description of what's on screen. Future-you searching the issue tracker for "floating tree" finds it even if the transcription was garbled.
|
||||
|
||||
**Coordinates and context.** The game engine provides the player's world position, camera angle, and current game state. The agent bakes these into the issue. A developer can teleport directly to the bug location.
|
||||
|
||||
**Severity and routing.** The agent estimates severity from context ("floating" is visual, "crash" is critical) and tags the right team member. An artist doesn't get pinged for a shader bug. A rendering engineer doesn't get pinged for a UI text overflow.
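
The simplest possible version of this is a keyword lookup. The sketch below is only an illustration of the idea, assuming hypothetical keyword tables and team names; an agent in practice would more likely lean on the model's own judgment plus project-specific rules.

```python
import re

# Hypothetical keyword tables, chosen for illustration only.
SEVERITY_KEYWORDS = {
    "critical": {"crash", "crashes", "freeze", "freezes"},
    "medium": {"floating", "clipping", "stutters", "overflows"},
}
ROUTES = {
    "rendering": {"floating", "shader", "culling", "clipping"},
    "ui": {"ui", "text", "overflows", "menu"},
}


def words(transcript: str) -> set[str]:
    """Lowercase the transcript and split it into whole words."""
    return set(re.findall(r"[a-z]+", transcript.lower()))


def estimate_severity(transcript: str) -> str:
    """Return the highest severity whose keywords appear, else 'low'."""
    found = words(transcript)
    for severity in ("critical", "medium"):
        if found & SEVERITY_KEYWORDS[severity]:
            return severity
    return "low"


def route(transcript: str) -> str:
    """Pick the team with the most keyword hits, or 'triage' if none match."""
    found = words(transcript)
    scores = {team: len(found & keywords) for team, keywords in ROUTES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "triage"
```

Matching on whole words rather than substrings avoids false hits such as "ui" inside "build". Anything that matches no route falls back to a `triage` bucket instead of pinging the wrong person.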
|
||||
|
||||
## The numbers
|
||||
|
||||
| Method | Time from observation to filed issue | Information loss |
|
||||
|---|---|---|
|
||||
| Mental note → type later | 5-30 minutes | High (positions, steps, context) |
|
||||
| Alt-tab → type immediately | 30-60 seconds | Medium (screenshots missed, flow broken) |
|
||||
| Voice → agent pipeline | 2 seconds | Low (screenshot + position captured automatically) |
|
||||
|
||||
The throughput difference compounds. A 30-minute playtest session with keyboard-only bug filing might yield 3-4 issues, half of them vague. The same session with voice-to-agent produces 10-15 issues, all with screenshots, positions, and reproduction context.
|
||||
|
||||
## Setup is simpler than you think
|
||||
|
||||
You need three things, all of which you probably already have:
|
||||
|
||||
1. **A microphone.** The one in your headset is fine. Transcription models handle suboptimal audio surprisingly well.
|
||||
2. **Transcription.** Whisper runs locally and is free. Cloud APIs are sub-cent per minute. Both work.
|
||||
3. **An agent that speaks your game engine's API.** If your engine has an HTTP interface for screenshots and game state, the agent can wire the rest together. If it doesn't — add one. It's a weekend project.
|
||||
|
||||
The agent itself doesn't need to be custom-built. Any coding agent with tool access can be told "watch the game, transcribe voice input, file issues in the tracker." It's a skill file, not a product.
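
For the third item, the engine-side contract can be very small. The following is a minimal sketch assuming a hypothetical HTTP interface with `/screenshot` and `/state` endpoints; those paths, the `capture` function, and the JSON shape of the state are assumptions, so substitute whatever your engine actually exposes.

```python
import json
import urllib.request
from typing import Callable


def http_get(url: str, timeout: float = 2.0) -> bytes:
    """Fetch a URL and return the raw response body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def capture(base_url: str, fetch: Callable[[str], bytes] = http_get) -> tuple[bytes, dict]:
    """Grab the current frame and game state from the engine.

    The /screenshot and /state paths are hypothetical placeholders.
    """
    frame = fetch(f"{base_url}/screenshot")
    state = json.loads(fetch(f"{base_url}/state"))
    return frame, state
```

Passing `fetch` as a parameter keeps the capture logic testable without a running game: a stub can stand in for the engine during development.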
|
||||
|
||||
## What changes when you stop typing bugs
|
||||
|
||||
The most surprising effect isn't the speed. It's the coverage. When filing a bug costs two seconds of speaking, you file bugs you would have previously ignored. The minor visual glitch. The slight animation hitch. The UI element that's two pixels misaligned.
|
||||
|
||||
Individually these are low-priority. Collectively they're the difference between a game that feels polished and one that feels rough. And they only get caught when the cost of reporting approaches zero.
|
||||
|
||||
The second effect is that playtesting becomes a primary input channel. Instead of structured QA sessions with checklists and forms, you just play the game. The agent captures everything. When you're done, you have a list of filed issues with screenshots and context — generated from your spoken observations in real time.
|
||||
|
||||
Voice isn't a gimmick for game development. It's the input channel that matches the way we actually work: looking at the screen, noticing things, and talking about them. The tools exist. The latency is under two seconds. The cost is negligible. The only thing missing is the habit.
|
||||
|
||||
---
|
||||
|
||||
*We build [Tinqs Studio](https://tinqs.com) — a game dev platform with built-in AI agents, git hosting, and creative pipelines. [Ariki](https://arikigame.com) is the survival colony sim we're building with every tool described here.*
|
||||
+5
-5
@@ -265,9 +265,9 @@
|
||||
<p class="post__lead">Every AI agent session starts the same way: cold. The agent doesn't know what project this is, who's asking, what tools are available, or what happened yesterday. You spend the first five minutes re-explaining context.</p>
|
||||
|
||||
<div class="post__body">
|
||||
<p>Our CLI solves this in 100ms. One command — <code>tstudio identity</code> — and the agent knows everything. The binary is 15MB, has zero runtime dependencies, and runs on every machine in the studio.</p>
|
||||
<p>Our CLI solves this in 100ms. One command — <code>tinqs identity</code> — and the agent knows everything. The binary is 15MB, has zero runtime dependencies, and runs on every machine in the studio.</p>
|
||||
<h2>The identity command (100ms)</h2>
|
||||
<p>When an agent starts, the first thing it calls is <code>tstudio identity</code>. The output:</p>
|
||||
<p>When an agent starts, the first thing it calls is <code>tinqs identity</code>. The output:</p>
|
||||
<ul>
|
||||
<li><strong>Soul file</strong> — the agent's persistent identity, values, operating principles</li>
|
||||
<li><strong>Company context</strong> — team members, roles, what the company does</li>
|
||||
@@ -276,13 +276,13 @@
|
||||
<li><strong>Service status</strong> — which URLs are live and reachable</li>
|
||||
</ul>
|
||||
<p>This data lives in markdown files in the docs repo. Any machine on the network can read it. The agent goes from blank to fully contextual in under a second.</p>
|
||||
<p>This started as a convenience tool for humans. It became the single most important function in our stack. Every agent session — Cursor, Claude Code, Pi — starts with <code>tstudio identity</code>. Without it, every conversation begins with "let me explain the project." With it, the agent already knows.</p>
|
||||
<p>This started as a convenience tool for humans. It became the single most important function in our stack. Every agent session — Cursor, Claude Code, Pi — starts with <code>tinqs identity</code>. Without it, every conversation begins with "let me explain the project." With it, the agent already knows.</p>
|
||||
<h2>Screenshots and cloud vision</h2>
|
||||
<p>The CLI can capture any window from outside the process. No in-game overlay, no rendering pipeline integration. OS-level capture — GDI+ on Windows, screencapture on Mac.</p>
|
||||
<p>A <code>photo</code> command sends the screenshot to a cloud vision model. The agent says "take a photo of the game" and gets back: "The player character is standing near a half-built hut. Three palm trees to the left. The terrain has a visible seam between two biomes."</p>
|
||||
<p>This is how you file bugs without typing. Look at the game, tell the agent what's wrong. It takes a screenshot, describes what it sees, and creates an issue with both the description and the image attached. Keyboard-free bug reporting.</p>
|
||||
<h2>Health checks</h2>
|
||||
<p><code>tstudio doctor</code> runs a comprehensive check:</p>
|
||||
<p><code>tinqs doctor</code> runs a comprehensive check:</p>
|
||||
<ul>
|
||||
<li>Is the git platform reachable and authenticated?</li>
|
||||
<li>Is the game server running?</li>
|
||||
@@ -294,7 +294,7 @@
|
||||
<p>Go compiles to a single static binary. No Python virtualenvs, no Node.js version managers, no DLL hell on Windows. The same binary runs on a gaming PC, a designer's MacBook, and a CI runner in AWS.</p>
|
||||
<p>Cross-compilation is trivial. We build Windows, Mac (arm64 + amd64), and Linux binaries from a single CI workflow. Push a tag, CI builds all three, uploads to S3. The binary is 15MB, starts in under 100ms, has zero runtime dependencies.</p>
|
||||
<h2>What we learned</h2>
|
||||
<p><strong>The CLI is the API for AI agents.</strong> What started as a human convenience tool became the primary interface for agents. Every session starts with <code>tstudio identity</code>. The agent's "hands and eyes" — screenshots, vision, health checks — are subcommands of the same binary.</p>
|
||||
<p><strong>The CLI is the API for AI agents.</strong> What started as a human convenience tool became the primary interface for agents. Every session starts with <code>tinqs identity</code>. The agent's "hands and eyes" — screenshots, vision, health checks — are subcommands of the same binary.</p>
|
||||
<p><strong>One binary beats ten scripts.</strong> Scripts rot. They have different shells, different PATH assumptions, different error handling. A compiled binary either works or it doesn't. It ships with dependencies baked in. It doesn't care if your Python is 3.9 or 3.12.</p>
|
||||
<p><strong>Cloud vision is underrated for game dev.</strong> Sending a screenshot to a vision model sounds gimmicky. In practice, it's the fastest way to document visual bugs. "The tree is floating 2m above the terrain" is much faster to communicate when the AI is looking at the same screen.</p>
|
||||
<p><strong>Agent cold starts are the real problem.</strong> Without the identity system, every session starts with the agent asking "what project is this?" With it, the agent knows everything in 100ms. That's the difference between an AI assistant and an AI team member.</p>
|
||||
|
||||
|
|
@@ -0,0 +1,343 @@
|
||||
<!DOCTYPE html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
|
||||
<title>Why Voice Is the Missing Input for Game Development — Tinqs Blog</title>
|
||||
<meta name="description" content="Speaking a bug while you're looking at the screen beats typing it from memory ten minutes later. Voice-to-agent pipelines collapse the gap between noticing a problem and tracking it — and game dev is the perfect use case.">
|
||||
<meta name="robots" content="index, follow">
|
||||
<link rel="canonical" href="https://www.tinqs.com/blog/voice-missing-input-game-dev">
|
||||
|
||||
<meta property="og:type" content="article">
|
||||
<meta property="og:url" content="https://www.tinqs.com/blog/voice-missing-input-game-dev">
|
||||
<meta property="og:title" content="Why Voice Is the Missing Input for Game Development">
|
||||
<meta property="og:description" content="Voice is the missing input for game dev — speak bugs while you play, let agents file them.">
|
||||
<meta property="og:image" content="https://www.tinqs.com/img/og-cover.jpg">
|
||||
|
||||
<meta name="twitter:card" content="summary_large_image">
|
||||
<meta name="twitter:title" content="Why Voice Is the Missing Input for Game Development">
|
||||
<meta name="twitter:description" content="Voice is the missing input for game dev — speak bugs while you play, let agents file them.">
|
||||
<meta name="twitter:image" content="https://www.tinqs.com/img/og-cover.jpg">
|
||||
|
||||
<script type="application/ld+json">
|
||||
{
|
||||
"@context": "https://schema.org",
|
||||
"@type": "BlogPosting",
|
||||
"headline": "Why Voice Is the Missing Input for Game Development",
|
||||
"datePublished": "2026-06-10",
|
||||
"author": {
|
||||
"@type": "Person",
|
||||
"name": "Ozan Bozkurt"
|
||||
},
|
||||
"publisher": {
|
||||
"@type": "Organization",
|
||||
"name": "Tinqs Limited",
|
||||
"url": "https://www.tinqs.com"
|
||||
},
|
||||
"description": "Speaking a bug while you're looking at the screen beats typing it from memory ten minutes later. Voice-to-agent pipelines collapse the gap between noticing a problem and tracking it — and game dev is the perfect use case."
|
||||
}
|
||||
</script>
|
||||
|
||||
<style>
|
||||
/* ── Self-contained post styles (Studio provides site chrome) ── */
|
||||
|
||||
:root {
|
||||
--c-accent: #c9935a;
|
||||
--c-accent-l: #d4a87c;
|
||||
--c-bg: #0d1117;
|
||||
--c-text: #e6edf3;
|
||||
--c-muted: #9aa7b4;
|
||||
--c-border: #2a3340;
|
||||
--c-blue: #38bdf8;
|
||||
--c-purple: #a855f7;
|
||||
--c-gold: #f59e0b;
|
||||
--c-code-bg: #1c2230;
|
||||
--c-pre-bg: #0a0e14;
|
||||
}
|
||||
|
||||
*, *::before, *::after { box-sizing: border-box; }
|
||||
|
||||
body {
|
||||
margin: 0;
|
||||
padding: 0;
|
||||
background: transparent;
|
||||
color: var(--c-text);
|
||||
font-family: system-ui, -apple-system, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif;
|
||||
line-height: 1.6;
|
||||
-webkit-font-smoothing: antialiased;
|
||||
}
|
||||
|
||||
/* ── Post container ── */
|
||||
.post {
|
||||
max-width: 720px;
|
||||
margin: 0 auto;
|
||||
padding: 40px 24px 48px;
|
||||
}
|
||||
|
||||
/* ── Back link ── */
|
||||
.post__back {
|
||||
color: var(--c-blue);
|
||||
text-decoration: none;
|
||||
font-size: 0.9rem;
|
||||
display: inline-block;
|
||||
margin-bottom: 24px;
|
||||
}
|
||||
.post__back:hover { color: var(--c-purple); }
|
||||
|
||||
/* ── Gradient title ── */
|
||||
.post__title {
|
||||
background: linear-gradient(90deg, #c9935a, #f59e0b 40%, #38bdf8);
|
||||
-webkit-background-clip: text;
|
||||
background-clip: text;
|
||||
color: transparent;
|
||||
font-weight: 800;
|
||||
font-size: 2.2rem;
|
||||
line-height: 1.25;
|
||||
margin: 0 0 16px;
|
||||
}
|
||||
|
||||
/* ── Date pill ── */
|
||||
.post__date {
|
||||
display: inline-block;
|
||||
font-family: ui-monospace, 'SF Mono', 'Cascadia Code', Consolas, monospace;
|
||||
font-size: 0.72rem;
|
||||
letter-spacing: 0.22em;
|
||||
text-transform: uppercase;
|
||||
color: var(--c-blue);
|
||||
border: 1px solid rgba(147, 140, 129, 0.25);
|
||||
border-radius: 999px;
|
||||
padding: 4px 14px;
|
||||
margin-bottom: 16px;
|
||||
}
|
||||
|
||||
/* ── Lead ── */
|
||||
.post__lead {
|
||||
color: var(--c-muted);
|
||||
font-size: 1.08rem;
|
||||
line-height: 1.7;
|
||||
}
|
||||
|
||||
/* ── Body ── */
|
||||
.post__body { font-size: 1rem; line-height: 1.7; }
|
||||
|
||||
.post__body p { margin: 14px 0; }
|
||||
|
||||
.post__body h2 {
|
||||
font-size: 1.7rem;
|
||||
margin: 54px 0 6px;
|
||||
padding-left: 16px;
|
||||
border-left: 4px solid var(--c-accent);
|
||||
line-height: 1.3;
|
||||
}
|
||||
|
||||
.post__body h3 {
|
||||
color: var(--c-purple);
|
||||
font-size: 1.18rem;
|
||||
margin: 30px 0 4px;
|
||||
}
|
||||
|
||||
.post__body h4, .post__body h5, .post__body h6 {
|
||||
margin: 20px 0 4px;
|
||||
}
|
||||
|
||||
/* ── Inline code ── */
|
||||
.post__body code {
|
||||
font-family: ui-monospace, 'SF Mono', 'Cascadia Code', Consolas, monospace;
|
||||
font-size: 0.86em;
|
||||
background: var(--c-code-bg);
|
||||
color: #9fe6c0;
|
||||
padding: 2px 6px;
|
||||
border-radius: 5px;
|
||||
border: 1px solid var(--c-border);
|
||||
}
|
||||
|
||||
/* ── Code blocks ── */
|
||||
.post__body pre {
|
||||
background: var(--c-pre-bg);
|
||||
border: 1px solid var(--c-border);
|
||||
border-radius: 10px;
|
||||
padding: 16px 18px;
|
||||
overflow-x: auto;
|
||||
margin: 14px 0;
|
||||
font-family: ui-monospace, 'SF Mono', 'Cascadia Code', Consolas, monospace;
|
||||
font-size: 0.85rem;
|
||||
line-height: 1.55;
|
||||
color: var(--c-text);
|
||||
}
|
||||
|
||||
.post__body pre code {
|
||||
background: transparent;
|
||||
padding: 0;
|
||||
border: none;
|
||||
font-size: inherit;
|
||||
color: inherit;
|
||||
border-radius: 0;
|
||||
}
|
||||
|
||||
/* ── Blockquote ── */
|
||||
.post__body blockquote {
|
||||
background: rgba(245, 158, 11, 0.08);
|
||||
border: 1px solid rgba(245, 158, 11, 0.25);
|
||||
border-left: 4px solid var(--c-gold);
|
||||
border-radius: 0 12px 12px 0;
|
||||
padding: 16px 18px;
|
||||
margin: 18px 0;
|
||||
color: #f4e3c4;
|
||||
font-size: 0.94rem;
|
||||
}
|
||||
|
||||
/* ── Links ── */
|
||||
.post__body a { color: var(--c-blue); }
|
||||
.post__body a:hover { color: var(--c-purple); }
|
||||
|
||||
/* ── Strong ── */
|
||||
.post__body strong { color: var(--c-gold); }
|
||||
|
||||
/* ── HR ── */
|
||||
.post__body hr {
|
||||
border: none;
|
||||
border-top: 1px solid var(--c-border);
|
||||
margin: 32px 0;
|
||||
}
|
||||
|
||||
/* ── Figures ── */
|
||||
.post__body figure { margin: 20px 0; }
|
||||
.post__body figure img {
|
||||
max-width: 100%;
|
||||
border-radius: 12px;
|
||||
border: 1px solid var(--c-border);
|
||||
}
|
||||
|
||||
.post__body figcaption {
|
||||
color: var(--c-muted);
|
||||
font-size: 0.85rem;
|
||||
margin-top: 6px;
|
||||
}
|
||||
|
||||
/* ── Lists ── */
|
||||
.post__body ul, .post__body ol { padding-left: 1.5em; margin: 10px 0; }
|
||||
.post__body li { margin: 4px 0; }
|
||||
|
||||
/* ── Author ── */
|
||||
.post__author {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 14px;
|
||||
margin-top: 48px;
|
||||
padding-top: 24px;
|
||||
border-top: 1px solid var(--c-border);
|
||||
}
|
||||
|
||||
.post__author-avatar {
|
||||
width: 48px;
|
||||
height: 48px;
|
||||
border-radius: 50%;
|
||||
background: var(--c-accent);
|
||||
color: var(--c-bg);
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
font-weight: 700;
|
||||
font-size: 0.85rem;
|
||||
flex-shrink: 0;
|
||||
}
|
||||
|
||||
.post__author-info {
|
||||
font-size: 0.85rem;
|
||||
color: var(--c-muted);
|
||||
line-height: 1.4;
|
||||
}
|
||||
|
||||
.post__author-name {
|
||||
color: var(--c-text);
|
||||
font-weight: 600;
|
||||
}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
|
||||
<!-- POST -->
|
||||
<article class="post">
|
||||
<a href="/blog/" class="post__back">← All Posts</a>
|
||||
<span class="post__date">10 June 2026</span>
|
||||
<h1 class="post__title">Why Voice Is the Missing Input for Game Development</h1>
|
||||
<p class="post__lead">Every game developer knows this moment. You're playtesting, running through the world, and you see something wrong — a tree floating two meters above the terrain, a UI element clipping, an animation that stutters on frame 14. You make a mental note. Ten minutes later, back at the editor, you try to file it. The coordinates are fuzzy. The exact reproduction steps are gone. You type something vague like "tree floating on west beach maybe" and hope you remember more tomorrow.</p>
|
||||
|
||||
<div class="post__body">
|
||||
<p>Voice changes this entirely. Speak the bug while you're looking at it, and an agent turns your words into a structured issue — with a screenshot, a vision-model description, coordinates, and a severity estimate. No keyboard. No context switch. No memory loss.</p>
|
||||
<h2>The latency that kills bug reports</h2>
|
||||
<p>The distance between seeing a bug and filing it is a memory decay curve. Every second that passes, your recollection loses precision:</p>
|
||||
<table>
<thead>
<tr><th>Elapsed time</th><th>What you remember</th></tr>
</thead>
<tbody>
<tr><td>0 seconds</td><td>Exact position, camera angle, what you were doing, what's on screen</td></tr>
<tr><td>30 seconds</td><td>"There was a tree... somewhere west... maybe floating?"</td></tr>
<tr><td>5 minutes</td><td>"I think there was a rendering issue? Or was it yesterday?"</td></tr>
</tbody>
</table>
|
||||
<p>Typed bug reports are reconstructions from decaying memory. Voice bug reports are real-time captures. The difference in quality isn't marginal: it's the difference between a report you can act on immediately and a ticket that sits in the backlog for three months while someone tries to reproduce it.</p>
|
||||
<h2>The pipeline: voice → text → structured issue</h2>
|
||||
<p>Here's what actually happens when you speak a bug during playtesting:</p>
|
||||
<pre><code>1. You speak: "There's a tree floating two meters above the terrain
|
||||
on the west beach, near the big rock formation. Happens after
|
||||
the vegetation culling pass kicks in around sunset."
|
||||
|
||||
2. Microphone → transcription (Whisper, local or API, ~500ms)
|
||||
|
||||
3. Transcription → agent context window (~100ms)
|
||||
|
||||
4. Agent parses the raw text and extracts:
|
||||
- What: tree floating above terrain
|
||||
- Where: west beach, near rock formation (camera coordinates auto-captured)
|
||||
- When: after vegetation culling, sunset
|
||||
- Severity: medium (visual, not blocking)
|
||||
- Screenshot: captured from the running game engine
|
||||
|
||||
5. Agent files a structured issue with all of the above,
|
||||
tags the rendering engineer, and posts the digest to team chat.
|
||||
|
||||
Total latency: under 2 seconds. You keep playing.</code></pre>
|
||||
<p>This isn't theoretical. The pipeline runs on our own game project, and it's caught bugs that would have slipped through playtesting entirely — the ones you see, make a mental note about, and forget by the time you alt-tab.</p>
|
||||
<h2>Why game dev is the perfect voice use case</h2>
|
||||
<p><strong>You're already looking at the screen.</strong> Voice input doesn't require switching windows or breaking flow. You're playtesting — your hands are on the controller or WASD, your eyes are on the game. Speaking is the only input channel that doesn't interrupt the thing you're actually doing.</p>
|
||||
<p><strong>Game bugs are spatial and visual.</strong> "The crafting UI text overflows on items with names longer than 20 characters" is something you see, not something you calculate. Describing it verbally while looking at it produces a far richer bug report than typing from memory.</p>
|
||||
<p><strong>Reproduction is half the battle.</strong> When you speak the bug at the moment of occurrence, you naturally include the context: what you were doing, what just happened, what the game state was. You don't have to reconstruct it later.</p>
|
||||
<p><strong>Voice scales to the whole team.</strong> Artists see visual bugs. Designers see balance issues. Producers see UX friction. Not everyone on a game team is a fast typist or comfortable with issue trackers. Everyone can speak.</p>
|
||||
<h2>What the agent adds beyond transcription</h2>
|
||||
<p>Raw transcription is useful — it's a notepad you don't have to type. But the agent layer is what makes voice input a pipeline rather than a dictation tool:</p>
|
||||
<p><strong>Screenshot coordination.</strong> The agent calls the game engine's HTTP API, captures the current frame, and attaches it to the issue. You don't take screenshots. The agent does.</p>
|
||||
<p><strong>Vision model description.</strong> The screenshot goes through a vision model that writes a text description of what's on screen. Future-you searching the issue tracker for "floating tree" finds it even if the transcription was garbled.</p>
|
||||
<p><strong>Coordinates and context.</strong> The game engine provides the player's world position, camera angle, and current game state. The agent bakes these into the issue. A developer can teleport directly to the bug location.</p>
|
||||
<p><strong>Severity and routing.</strong> The agent estimates severity from context ("floating" is visual, "crash" is critical) and tags the right team member. An artist doesn't get pinged for a shader bug. A rendering engineer doesn't get pinged for a UI text overflow.</p>
|
||||
<h2>The numbers</h2>
|
||||
<table>
<thead>
<tr><th>Method</th><th>Time from observation to filed issue</th><th>Information loss</th></tr>
</thead>
<tbody>
<tr><td>Mental note → type later</td><td>5-30 minutes</td><td>High (positions, steps, context)</td></tr>
<tr><td>Alt-tab → type immediately</td><td>30-60 seconds</td><td>Medium (screenshots missed, flow broken)</td></tr>
<tr><td>Voice → agent pipeline</td><td>2 seconds</td><td>Low (screenshot + position captured automatically)</td></tr>
</tbody>
</table>
|
||||
<p>The throughput difference compounds. A 30-minute playtest session with keyboard-only bug filing might yield 3-4 issues, half of them vague. The same session with voice-to-agent produces 10-15 issues, all with screenshots, positions, and reproduction context.</p>
|
||||
<h2>Setup is simpler than you think</h2>
|
||||
<p>You need three things, all of which you probably already have:</p>
|
||||
<ol>
<li><strong>A microphone.</strong> The one in your headset is fine. Transcription models handle suboptimal audio surprisingly well.</li>
<li><strong>Transcription.</strong> Whisper runs locally and is free. Cloud APIs are sub-cent per minute. Both work.</li>
<li><strong>An agent that speaks your game engine's API.</strong> If your engine has an HTTP interface for screenshots and game state, the agent can wire the rest together. If it doesn't, add one. It's a weekend project.</li>
</ol>
|
||||
<p>The agent itself doesn't need to be custom-built. Any coding agent with tool access can be told "watch the game, transcribe voice input, file issues in the tracker." It's a skill file, not a product.</p>
|
||||
<h2>What changes when you stop typing bugs</h2>
|
||||
<p>The most surprising effect isn't the speed. It's the coverage. When filing a bug costs two seconds of speaking, you file bugs you would have previously ignored. The minor visual glitch. The slight animation hitch. The UI element that's two pixels misaligned.</p>
|
||||
<p>Individually these are low-priority. Collectively they're the difference between a game that feels polished and one that feels rough. And they only get caught when the cost of reporting approaches zero.</p>
|
||||
<p>The second effect is that playtesting becomes a primary input channel. Instead of structured QA sessions with checklists and forms, you just play the game. The agent captures everything. When you're done, you have a list of filed issues with screenshots and context — generated from your spoken observations in real time.</p>
|
||||
<p>Voice isn't a gimmick for game development. It's the input channel that matches the way we actually work: looking at the screen, noticing things, and talking about them. The tools exist. The latency is under two seconds. The cost is negligible. The only thing missing is the habit.</p>
|
||||
<hr>
|
||||
<p><em>We build <a href="https://tinqs.com" style="color: var(--c-accent-l);">Tinqs Studio</a>, a game dev platform with built-in AI agents, git hosting, and creative pipelines. <a href="https://arikigame.com" style="color: var(--c-accent-l);">Ariki</a> is the survival colony sim we're building with every tool described here.</em></p>
|
||||
|
||||
</div>
|
||||
|
||||
<div class="post__author">
|
||||
<div class="post__author-avatar">OB</div>
|
||||
<div class="post__author-info">
|
||||
<span class="post__author-name">Ozan Bozkurt</span><br>
|
||||
CTO & Developer, Tinqs
|
||||
</div>
|
||||
</div>
|
||||
</article>
|
||||
|
||||
</body>
|
||||
</html>
|
||||
|