<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">

<title>Zero-CPU Crowd Animation: How We Made 1,000 Animals Animate Without a Single Skeleton — Tinqs Blog</title>
<meta name="description" content="Yesterday we shipped a GPU herd renderer that used one live skeleton per animal state. Today we ripped out every live skeleton and made the GPU drive all animation itself — 1,000 agents at 60 FPS, zero per-frame CPU cost, each with its own clip, speed, and phase.">
<meta name="robots" content="index, follow">
<link rel="canonical" href="https://www.tinqs.com/blog/gpu-driven-crowd-animation">

<meta property="og:type" content="article">
<meta property="og:url" content="https://www.tinqs.com/blog/gpu-driven-crowd-animation">
<meta property="og:title" content="Zero-CPU Crowd Animation: How We Made 1,000 Animals Animate Without a Single Skeleton">
<meta property="og:description" content="1,000 animated agents, zero live skeletons, zero per-frame CPU. Our GPU-driven crowd animation platform in the Tinqs Engine fork.">
<meta property="og:image" content="https://www.tinqs.com/img/og-cover.jpg">

<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:title" content="Zero-CPU Crowd Animation: How We Made 1,000 Animals Animate Without a Single Skeleton">
<meta name="twitter:description" content="1,000 animated agents, zero live skeletons, zero per-frame CPU. Our GPU-driven crowd animation platform in the Tinqs Engine fork.">
<meta name="twitter:image" content="https://www.tinqs.com/img/og-cover.jpg">

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "Zero-CPU Crowd Animation: How We Made 1,000 Animals Animate Without a Single Skeleton",
  "datePublished": "2026-06-15",
  "author": {
    "@type": "Person",
    "name": "Ozan Bozkurt"
  },
  "publisher": {
    "@type": "Organization",
    "name": "Tinqs Limited",
    "url": "https://www.tinqs.com"
  },
  "description": "Yesterday we shipped a GPU herd renderer that used one live skeleton per animal state. Today we ripped out every live skeleton and made the GPU drive all animation itself — 1,000 agents at 60 FPS, zero per-frame CPU cost, each with its own clip, speed, and phase."
}
</script>

<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Space+Grotesk:wght@400;500;600;700&family=JetBrains+Mono:wght@400;500&family=Inter:wght@400;500;600;700&display=swap" rel="stylesheet">

<style>
/* ── Tinqs Studio brand — post styles ── */

:root {
  /* Studio near-black base */
  --c-bg: #0B0C0E;
  --c-bg-raised: #15171A;
  /* Foreground */
  --c-fg: #ECEEF1;
  --c-muted: #8A95A3;
  /* Family accents */
  --c-lime: #B6FF3C;
  --c-violet: #7C5CFF;
  /* Borders */
  --c-border: rgba(255,255,255,.07);
  --c-border-strong: rgba(255,255,255,.12);
}

*, *::before, *::after { box-sizing: border-box; }

html { background: var(--c-bg); }

body {
  margin: 0;
  padding: 0;
  background: var(--c-bg);
  color: var(--c-fg);
  font-family: 'Inter', system-ui, -apple-system, sans-serif;
  font-size: 16px;
  line-height: 1.6;
  -webkit-font-smoothing: antialiased;
}

/* ── Post container ── */
.post {
  background: var(--c-bg);
  max-width: 720px;
  margin: 0 auto;
  padding: 48px 24px 60px;
}

/* ── Back link ── */
.post__back {
  color: var(--c-muted);
  text-decoration: none;
  font-size: 0.875rem;
  display: inline-block;
  margin-bottom: 24px;
  transition: color 0.15s;
}
.post__back:hover { color: var(--c-lime); }

/* ── Gradient title — lime → violet ── */
.post__title {
  font-family: 'Space Grotesk', system-ui, -apple-system, sans-serif;
  background: linear-gradient(90deg, var(--c-lime), var(--c-violet));
  -webkit-background-clip: text;
  background-clip: text;
  color: transparent;
  font-weight: 700;
  font-size: 2.2rem;
  line-height: 1.2;
  margin: 0 0 16px;
}

/* ── Date pill ── */
.post__date {
  display: inline-block;
  font-family: 'JetBrains Mono', ui-monospace, 'SF Mono', Consolas, monospace;
  font-size: 0.72rem;
  letter-spacing: 0.18em;
  text-transform: uppercase;
  color: var(--c-muted);
  border: 1px solid var(--c-border);
  border-radius: 999px;
  padding: 4px 14px;
  margin-bottom: 16px;
}

/* ── Lead ── */
.post__lead {
  color: var(--c-muted);
  font-size: 1.08rem;
  line-height: 1.7;
}

/* ── Body ── */
.post__body { font-size: 1rem; line-height: 1.7; }

.post__body p { margin: 14px 0; }

.post__body h2 {
  font-family: 'Space Grotesk', system-ui, -apple-system, sans-serif;
  font-weight: 600;
  font-size: 1.6rem;
  margin: 54px 0 6px;
  padding-left: 16px;
  border-left: 4px solid var(--c-lime);
  line-height: 1.3;
}

.post__body h3 {
  font-family: 'Space Grotesk', system-ui, -apple-system, sans-serif;
  font-weight: 500;
  color: var(--c-violet);
  font-size: 1.15rem;
  margin: 30px 0 4px;
}

.post__body h4, .post__body h5, .post__body h6 {
  margin: 20px 0 4px;
}

/* ── Inline code ── */
.post__body code {
  font-family: 'JetBrains Mono', ui-monospace, 'SF Mono', Consolas, monospace;
  font-size: 0.84em;
  background: var(--c-bg-raised);
  color: var(--c-lime);
  padding: 2px 6px;
  border-radius: 4px;
  border: 1px solid var(--c-border);
}

/* ── Code blocks ── */
.post__body pre {
  background: var(--c-bg);
  border: 1px solid var(--c-border);
  border-radius: 8px;
  padding: 16px 18px;
  overflow-x: auto;
  margin: 14px 0;
  font-family: 'JetBrains Mono', ui-monospace, 'SF Mono', Consolas, monospace;
  font-size: 0.83rem;
  line-height: 1.55;
  color: var(--c-fg);
}

.post__body pre code {
  background: transparent;
  padding: 0;
  border: none;
  font-size: inherit;
  color: inherit;
  border-radius: 0;
}

/* ── Blockquote ── */
.post__body blockquote {
  background: rgba(124, 92, 255, 0.06);
  border: 1px solid rgba(124, 92, 255, 0.15);
  border-left: 4px solid var(--c-violet);
  border-radius: 0 8px 8px 0;
  padding: 16px 18px;
  margin: 18px 0;
  color: var(--c-fg);
  font-size: 0.94rem;
}

/* ── Links ── */
.post__body a { color: var(--c-lime); text-decoration: underline; text-underline-offset: 3px; }
.post__body a:hover { color: var(--c-violet); }

/* ── Strong ── */
.post__body strong { color: var(--c-lime); font-weight: 600; }

/* ── HR ── */
.post__body hr {
  border: none;
  border-top: 1px solid var(--c-border);
  margin: 32px 0;
}

/* ── Figures ── */
.post__body figure { margin: 20px 0; }
.post__body figure img {
  max-width: 100%;
  border-radius: 12px;
  border: 1px solid var(--c-border);
}

.post__body figcaption {
  color: var(--c-muted);
  font-size: 0.85rem;
  margin-top: 6px;
}

/* ── Lists ── */
.post__body ul, .post__body ol { padding-left: 1.5em; margin: 10px 0; }
.post__body li { margin: 4px 0; }

/* ── Author ── */
.post__author {
  display: flex;
  align-items: center;
  gap: 14px;
  margin-top: 48px;
  padding-top: 24px;
  border-top: 1px solid var(--c-border);
}

.post__author-avatar {
  width: 48px;
  height: 48px;
  border-radius: 50%;
  background: var(--c-violet);
  color: #fff;
  display: flex;
  align-items: center;
  justify-content: center;
  font-weight: 700;
  font-size: 0.85rem;
  flex-shrink: 0;
}

.post__author-info {
  font-size: 0.85rem;
  color: var(--c-muted);
  line-height: 1.4;
}

.post__author-name {
  color: var(--c-fg);
  font-weight: 600;
}
</style>
</head>
<body>

<!-- POST -->
<article class="post">
<a href="/blog/" class="post__back">← All Posts</a>
<span class="post__date">15 June 2026</span>
<h1 class="post__title">Zero-CPU Crowd Animation: How We Made 1,000 Animals Animate Without a Single Skeleton</h1>
<p class="post__lead">Yesterday we <a href="gpu-skinned-herds" style="color: var(--c-lime);">shipped a GPU herd renderer</a> that draws 1,000 skinned animals in a handful of draw calls. It worked — 25 crocodiles confirmed, 1,000 animals projected. But it had a quiet cost: <strong>one live skeleton per animal state per type.</strong> For 30 types with 5 states each, that's 150 <code>Skeleton3D</code> nodes — each with an <code>AnimationPlayer</code>, each pushing bone matrices to the GPU every frame. The GPU was fast, but the CPU was doing real work.</p>

<div class="post__body">
<p>Today we ripped out every live skeleton. The CPU now does <strong>zero per-frame animation work.</strong> 1,000 animals at 60 FPS. Each plays its own clip at its own speed and phase — no lockstep, no copy-paste poses. Here's how.</p>
<h2>The problem: lockstep costs CPU</h2>
<p>The original <code>agent_skinned</code> module worked by <strong>sharing a live skeleton.</strong> One driver <code>Skeleton3D</code> animated, and its pose was pushed to every instance in the herd. For variation across states (walking vs idle vs attacking), you needed one herd per state — each with its own driver skeleton.</p>
<pre><code>30 animal types × 5 states = 150 live skeletons on the CPU</code></pre>
<p>Each skeleton has to compute <code>global_pose</code> for every bone, run an <code>AnimationPlayer.process()</code>, push matrices into the data plane, and upload the dirty texture region. The cost tracked <strong>herd count</strong>, not instance count. At 1,000 animals we measured ~25 FPS. We never ran 10,000 on this path, but the trend was clear: the system would crumble.</p>
<p>The fix sounds obvious in retrospect: <strong>the GPU should compute the poses, not the CPU.</strong> Bake every animation frame into a texture once, and let each instance's vertex shader figure out which frame to sample.</p>
<h2>The bake: one texture per character type, done once</h2>
<p>At load time, the <code>skinned_herd.gd</code> backend plays every animation clip on a temporary <code>Skeleton3D</code> and records the bone matrices for every frame into the data plane. A Goat with 9 clips at 30 fps produces 496 frames. Each frame is one row in the bone-matrix texture:</p>
<pre><code>Goat: 53 bones × 496 frames = 26,288 bone matrices
Texture: 212 × 496 pixels, RGBA32F
VRAM: 212 × 496 × 16 bytes = 1.6 MB</code></pre>
<p>That's the ENTIRE animation data for a Goat: every frame of all 9 clips (walk, run, idle, attack, death, eat, sleep and the rest) in 1.6 MB. The bake takes a few milliseconds. After that, the skeleton is destroyed. It never runs again.</p>
<p>For 30 animal types: ~48 MB total. Compare this to vertex animation textures (VAT): the same Goat would need 2,500 vertices × 496 frames × 12 bytes (one three-float position per vertex) = <strong>14.2 MB per type, 426 MB total.</strong> Bone-matrix is about 9× smaller because bones ≪ vertices.</p>
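<p>The budget arithmetic is easy to re-run for other rigs. The Python sketch below is an illustration, not engine code: the function names are hypothetical, and it assumes four RGBA32F texels (16 bytes each) per bone matrix and 12 bytes per VAT vertex, as in the figures above. Sizes are printed in MiB (2<sup>20</sup> bytes), the unit that reproduces the 1.6 and 14.2 figures.</p>
<pre><code class="language-python">MIB = 1024 * 1024

def palette_bytes(bones, frames, texels_per_bone=4, bytes_per_texel=16):
    """Bone-matrix palette: one row per baked frame, four RGBA32F texels per bone."""
    width = bones * texels_per_bone
    return width * frames * bytes_per_texel

def vat_bytes(vertices, frames, bytes_per_vertex=12):
    """Vertex animation texture: one three-float position per vertex per frame."""
    return vertices * frames * bytes_per_vertex

goat_palette = palette_bytes(bones=53, frames=496)
goat_vat = vat_bytes(vertices=2500, frames=496)

print(f"palette: {goat_palette / MIB:.1f} MiB")    # palette: 1.6 MiB
print(f"VAT:     {goat_vat / MIB:.1f} MiB")        # VAT:     14.2 MiB
print(f"ratio:   {goat_vat / goat_palette:.1f}x")  # ratio:   8.8x</code></pre>
<p>Passing <code>bytes_per_texel=8</code> models an RGBA16F palette, which halves the footprint.</p>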
<h2>The GPU: per-instance playback, zero CPU</h2>
<p>Each MultiMesh instance carries 4 numbers in <code>INSTANCE_CUSTOM</code>:</p>
<table>
<thead>
<tr><th>Channel</th><th>Meaning</th></tr>
</thead>
<tbody>
<tr><td><code>.x</code></td><td>Which clip (start row in the palette)</td></tr>
<tr><td><code>.y</code></td><td>How many frames in this clip</td></tr>
<tr><td><code>.z</code></td><td>Playback rate (baked-fps × ground speed)</td></tr>
<tr><td><code>.w</code></td><td>Phase offset (0..1, golden-ratio spread)</td></tr>
</tbody>
</table>
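<p>As a CPU-side illustration of how those four channels could be filled once at spawn time, here is a minimal Python sketch. It is not the engine's code: the clip names and frame counts are made up, <code>build_clip_table</code> and <code>pack_instance</code> are hypothetical helpers, and it assumes that "golden-ratio spread" means the fractional part of the instance index times 0.618…, a common way to scatter values evenly over 0..1.</p>
<pre><code class="language-python">import math

GOLDEN_RATIO_CONJUGATE = (math.sqrt(5.0) - 1.0) / 2.0  # 0.618...

def build_clip_table(clip_frames):
    """Map clip name to (start_row, frame_count), clips stacked in bake order."""
    table, row = {}, 0
    for name, frames in clip_frames.items():
        table[name] = (row, frames)
        row += frames
    return table

def pack_instance(index, clip, ground_speed, table, bake_fps=30.0):
    """Return the (x, y, z, w) tuple for one instance's INSTANCE_CUSTOM."""
    start_row, frame_count = table[clip]
    rate = bake_fps * ground_speed                   # .z: palette rows per second
    phase = (index * GOLDEN_RATIO_CONJUGATE) % 1.0   # .w: 0..1, scattered per instance
    return (float(start_row), float(frame_count), rate, phase)

# Hypothetical frame counts; a real table would come out of the bake.
table = build_clip_table({"idle": 60, "walk": 30, "run": 24})
herd = [pack_instance(i, "walk", 1.0, table) for i in range(4)]
for custom in herd:
    print(custom)</code></pre>
<p>All four walkers share the same start row and frame count; only the phase differs, which is what breaks the lockstep.</p>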
<p>The vertex shader derives each instance's current frame from TIME:</p>
<pre><code class="language-glsl">// INSTANCE_CUSTOM = (clip start row, frame count, playback rate, phase)
float fcount = max(INSTANCE_CUSTOM.y, 1.0);  // guard against an empty clip
int start = int(INSTANCE_CUSTOM.x + 0.5);    // round to the nearest row
float fpos = mod(TIME * INSTANCE_CUSTOM.z + INSTANCE_CUSTOM.w * fcount, fcount);

int f0 = int(fpos);
int f1 = int(mod(float(f0) + 1.0, fcount));  // wraps back to the clip's first frame
float fr = fpos - float(f0);

// Blend between two adjacent baked frames for smooth playback at low bake fps
int r0 = start + f0;
int r1 = start + f1;

// px is the first of this bone's four texels; px, weight and skin belong to
// the surrounding per-bone loop, which is omitted here.
mat4 m0 = mat4(
    texelFetch(bone_matrices_tex, ivec2(px+0, r0), 0),
    texelFetch(bone_matrices_tex, ivec2(px+1, r0), 0),
    texelFetch(bone_matrices_tex, ivec2(px+2, r0), 0),
    texelFetch(bone_matrices_tex, ivec2(px+3, r0), 0));
mat4 m1 = mat4(
    texelFetch(bone_matrices_tex, ivec2(px+0, r1), 0),
    texelFetch(bone_matrices_tex, ivec2(px+1, r1), 0),
    texelFetch(bone_matrices_tex, ivec2(px+2, r1), 0),
    texelFetch(bone_matrices_tex, ivec2(px+3, r1), 0));

skin += (m0 * (1.0 - fr) + m1 * fr) * weight;</code></pre>
<p>That's it. The CPU does nothing per frame. No skeletons. No <code>AnimationPlayer</code>. No per-instance push. Every instance computes its own frame from TIME + its custom data. A walking Boar, a running Boar, and an idle Boar all share the same baked palette — they just point at different rows.</p>
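<p>To sanity-check the frame math without a GPU, the same selection logic can be mirrored on the CPU. This Python sketch is a reference model written for this post (the function name is hypothetical), with <code>time</code> standing in for the shader's <code>TIME</code>. It matches GLSL <code>mod</code> only for non-negative inputs, which is all the shader feeds it.</p>
<pre><code class="language-python">import math

def sample_rows(time, start, frame_count, rate, phase):
    """Mirror of the shader math: return (row0, row1, blend) for one instance."""
    fcount = max(float(frame_count), 1.0)
    fpos = math.fmod(time * rate + phase * fcount, fcount)
    f0 = int(fpos)
    f1 = int(math.fmod(f0 + 1.0, fcount))   # wraps to the clip's first frame
    blend = fpos - f0
    return start + f0, start + f1, blend

# Two instances on the same 30-frame clip (rows 60..89), half a clip apart.
print(sample_rows(time=0.25, start=60, frame_count=30, rate=30.0, phase=0.0))  # (67, 68, 0.5)
print(sample_rows(time=0.25, start=60, frame_count=30, rate=30.0, phase=0.5))  # (82, 83, 0.5)

# Halfway through the last frame of a 32-frame clip: blends back into row 0.
print(sample_rows(time=0.984375, start=0, frame_count=32, rate=32.0, phase=0.0))  # (31, 0, 0.5)</code></pre>
<p>The second call shows the point of the phase channel: same clip, same clock, different rows.</p>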
<h2>What changed in the engine</h2>
<p>The shader needed one critical change: the bone-matrix texture went from being indexed by <code>INSTANCE_ID</code> (one row per instance) to being indexed by a <strong>pose slot</strong> computed from <code>INSTANCE_CUSTOM</code> (one row per baked frame). The old code:</p>
<pre><code class="language-glsl">int inst = INSTANCE_ID; // row = instance index → lockstep</code></pre>
<p>Became:</p>
<pre><code class="language-glsl">int r0 = start + f0; // row = palette row from clip + frame → per-instance variety</code></pre>
<p>This is a 40-line shader change in the engine's <code>multi_skinned_instance_3d.cpp</code>. It's backward-compatible — slot 0 still works for the old lockstep path (which airborne bird flocks use intentionally — synchronized flapping is a feature, not a bug).</p>
<p>Engine version bumped from 4.6.4 to <strong>4.6.5</strong>.</p>
<h2>The numbers (measured, not projected)</h2>
<p>On an M1 Pro MacBook Pro (integrated GPU):</p>
<table>
<thead>
<tr><th>Agent count</th><th>Old lockstep (4.6.4)</th><th>GPU-driven palette (4.6.5)</th></tr>
</thead>
<tbody>
<tr><td>100</td><td>~40 FPS</td><td><strong>60 FPS</strong></td></tr>
<tr><td>500</td><td>31–39 FPS</td><td><strong>60 FPS</strong></td></tr>
<tr><td>1,000</td><td>~25 FPS</td><td><strong>60 FPS</strong></td></tr>
<tr><td>10,000</td><td>untested</td><td>8 FPS (unoptimized)</td></tr>
</tbody>
</table>
<p>The 10,000 number is low because we haven't done the one-herd-per-type optimization yet — 292 herds vs the planned ~30. And our distance culling still runs on the CPU (MultiMesh has no built-in culling). Both are in the roadmap.</p>
<p><strong>VRAM:</strong> 1.6 MB per animal type. 30 types = 48 MB total. A Steam Deck, which by default reserves 1 GB of its shared memory as VRAM, handles this comfortably. The VAT alternative would need 426 MB, about nine times more.</p>
<p><strong>Draw calls:</strong> Currently ~158 (one per type × state, the lockstep holdover). After collapsing to one herd per type: ~30. After sharing palettes for rig-reuse animals: even fewer.</p>
<h2>The bug that made everything invisible</h2>
<p>The first build rendered nothing. Animals were "visible" (instance count correct), custom data correct, shader compiled, texture valid — but the screen was empty. FPS was 60 because it was drawing nothing.</p>
<p>Root cause: a <code>renderer.refresh()</code> call during setup raced the renderer's own <code>NOTIFICATION_READY</code> handler, which re-bound the shader's <code>bone_matrices_tex</code> uniform, overwriting our baked texture with an unbound (default white) one. The shader sampled white → every bone matrix came out as all ones, a degenerate transform rather than a pose → the mesh collapsed into zero-area triangles → invisible.</p>
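<p>It helps to see why a white texture hides the mesh instead of freezing it in its bind pose. Every texel of a white texture reads as (1, 1, 1, 1), so each fetched bone matrix is all ones. The small Python check below is illustrative only, with made-up vertex positions, and applies such a matrix to a few points:</p>
<pre><code class="language-python">def transform(matrix, vertex):
    """Multiply a 4x4 matrix (list of rows) by the point (x, y, z, 1)."""
    x, y, z = vertex
    column = (x, y, z, 1.0)
    return tuple(sum(m * c for m, c in zip(row, column)) for row in matrix)

WHITE_MATRIX = [[1.0] * 4 for _ in range(4)]   # what a white texture decodes to

# Hypothetical mesh vertices.
vertices = [(0.0, 1.0, 0.0), (0.5, 0.25, -1.0), (-2.0, 0.0, 3.0)]
for v in vertices:
    print(v, "->", transform(WHITE_MATRIX, v))</code></pre>
<p>Every output has four equal components, so all vertices land on one diagonal line (or on a single point after dividing by w). Triangles with zero area rasterize to nothing. An identity matrix, by contrast, would have left the mesh visible in its rest pose.</p>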
<p>Fix: bind the texture once on the <strong>first <code>_process</code> frame</strong>, after all nodes have had their <code>_ready</code> called. Then never touch it again. One deferred bind, zero per-frame cost. This is a classic Godot <code>_ready</code> sequencing gotcha.</p>
<h2>Where this puts us vs AAA</h2>
<p>The technique — baking bone matrices into a texture and letting the GPU drive per-instance animation — is the same architecture used by Assassin's Creed Unity, Total War: Warhammer, and Hitman for their crowd systems. We're using the same core idea, in a Godot fork, targeting a fraction of the VRAM.</p>
<p>What AAA has that we don't (yet):</p>
<ul>
<li><strong>LOD tiers</strong> — far agents become 2D impostors (billboard quads with a sprite atlas). Same <code>(clip, frame, speed, phase)</code> packet drives all tiers.</li>
<li><strong>Hero rigs</strong> — the nearest few agents get real <code>Skeleton3D</code> + <code>AnimationTree</code> + IK + ragdoll. Smooth gait blends, foot-lock, look-at.</li>
<li><strong>Offline bake pipeline</strong> — precompute palettes in the asset build, not at load time.</li>
<li><strong>GPU compute culling</strong> — frustum + distance + LOD classification on the GPU, no CPU cull loop.</li>
</ul>
<p>These are planned and designed (the platform doc is at <code>ariki-sim/wiki/plans/crowd-animation-platform-2026-06-15.md</code>), but not built yet. The foundation — the GPU-driven baked palette — is what makes all of them possible.</p>
<h2>The fork question</h2>
<p>Every time we change the engine, someone asks: "couldn't you do this without a fork?" For this feature, the answer is no — not without significant compromises. The alternatives:</p>
<ul>
<li><strong>VAT (vertex animation textures) with a Godot plugin:</strong> Works in stock Godot, but VRAM is 9× larger. For 30 animal types: 426 MB vs our 48 MB. For 5 colonist looks: 620 MB, which doesn't fit on a Steam Deck. A basic VAT setup also hard-cuts between baked frames unless the shader interpolates them itself, and it lights incorrectly unless normals and tangents are baked into extra textures, which grows the footprint further.</li>
<li><strong>Phase-offset drivers only:</strong> Keep the live skeletons but stagger their phases. Gives some variety, but still has N live skeletons on the CPU. Doesn't scale to thousands of colonists.</li>
<li><strong>Don't do crowds:</strong> The simplest answer. But Ariki needs animals and colonists. The architecture decision was made: we forked Godot to own the renderer, and this is exactly the kind of feature that justifies the fork.</li>
</ul>
<h2>What's next</h2>
<p>The 4-item immediate roadmap:</p>
<ol>
<li><strong>One herd per type</strong> — collapse ~158 herds to ~30 (remove the per-state batching from the lockstep era)</li>
<li><strong>Distance LOD</strong> — CPU-side cull plus a cheaper shader for far instances</li>
<li><strong>RGBA16F + offline bake</strong> — half the VRAM, zero load-time hitch</li>
<li><strong>Hero rigs</strong> — real <code>AnimationTree</code> + IK + ragdoll for the nearest few animals</li>
</ol>
<p>The far horizon: animated 2D impostors and GPU compute-cull, both designed and parked. We will bring them forward when the load demands them.</p>
<p>The engine source lives in <a href="https://tinqs.com/tinqs/engine" style="color: var(--c-lime);"><code>tinqs/engine</code></a> (private). Pre-built editor binaries at <a href="https://tinqs.com/tinqs/builds" style="color: var(--c-lime);"><code>tinqs/builds</code></a>. The Ariki game is at <a href="https://www.arikigame.com" style="color: var(--c-lime);">arikigame.com</a>.</p>
<hr>
<p><strong>Related:</strong> <a href="gpu-skinned-herds" style="color: var(--c-lime);">GPU-Skinned Herds</a> — the original herd renderer (yesterday's post). <a href="fork-dont-build" style="color: var(--c-lime);">Fork, Don't Build</a> — why we modify existing platforms instead of building new ones. <a href="godot-optimisation" style="color: var(--c-lime);">Streaming a 12km Archipelago in Godot 4</a> — the terrain and vegetation layers that work alongside this.</p>

</div>

<div class="post__author">
<div class="post__author-avatar">OB</div>
<div class="post__author-info">
<span class="post__author-name">Ozan Bozkurt</span><br>
CTO & Developer, Tinqs
</div>
</div>
</article>

</body>
</html>