Files
blog/gpu-skinned-herds.html
T

370 lines
19 KiB
HTML

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>GPU-Skinned Herds: One Draw Call for 1,000 Animated Characters in Godot — Tinqs Blog</title>
<meta name="description" content="Godot has no built-in way to render 1,000 skinned characters in one draw call. We built a GPU skinned-instance renderer into Tinqs Engine that does — now with mat4×3 palette, far-LOD dominant-bone, and in-place bake. Pre-built binaries for macOS and Windows.">
<meta name="robots" content="index, follow">
<link rel="canonical" href="https://www.tinqs.com/blog/gpu-skinned-herds">
<meta property="og:type" content="article">
<meta property="og:url" content="https://www.tinqs.com/blog/gpu-skinned-herds">
<meta property="og:title" content="GPU-Skinned Herds: One Draw Call for 1,000 Animated Characters in Godot">
<meta property="og:description" content="One draw call, 1,000 animated characters — now with mat4×3 palette, far-LOD dominant-bone, and in-place bake. GPU-skinned herd renderer built into the Tinqs Engine fork of Godot.">
<meta property="og:image" content="https://www.tinqs.com/img/og-cover.jpg">
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:title" content="GPU-Skinned Herds: One Draw Call for 1,000 Animated Characters in Godot">
<meta name="twitter:description" content="One draw call, 1,000 animated characters — now with mat4×3 palette, far-LOD dominant-bone, and in-place bake.">
<meta name="twitter:image" content="https://www.tinqs.com/img/og-cover.jpg">
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "BlogPosting",
"headline": "GPU-Skinned Herds: One Draw Call for 1,000 Animated Characters in Godot",
"datePublished": "2026-06-16",
"dateModified": "2026-06-16",
"author": {
"@type": "Person",
"name": "Ozan Bozkurt"
},
"publisher": {
"@type": "Organization",
"name": "Tinqs Limited",
"url": "https://www.tinqs.com"
},
"description": "Godot has no built-in way to render 1,000 skinned characters in one draw call. We built a GPU skinned-instance renderer into Tinqs Engine that does — 25 crocodiles verified, 1,000+ projected. Pre-built binaries for macOS and Windows."
}
</script>
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Space+Grotesk:wght@400;500;600;700&family=JetBrains+Mono:wght@400;500&family=Inter:wght@400;500;600;700&display=swap" rel="stylesheet">
<style>
/* ── Tinqs Studio brand — post styles ── */
:root {
/* Studio near-black base */
--c-bg: #0B0C0E;
--c-bg-raised: #15171A;
/* Foreground */
--c-fg: #ECEEF1;
--c-muted: #8A95A3;
/* Family accents */
--c-lime: #B6FF3C;
--c-violet: #7C5CFF;
/* Borders */
--c-border: rgba(255,255,255,.07);
--c-border-strong: rgba(255,255,255,.12);
}
*, *::before, *::after { box-sizing: border-box; }
html { background: var(--c-bg); }
body {
margin: 0;
padding: 0;
background: var(--c-bg);
color: var(--c-fg);
font-family: 'Inter', system-ui, -apple-system, sans-serif;
font-size: 16px;
line-height: 1.6;
-webkit-font-smoothing: antialiased;
}
/* ── Post container ── */
.post {
background: var(--c-bg);
max-width: 720px;
margin: 0 auto;
padding: 48px 24px 60px;
}
/* ── Back link ── */
.post__back {
color: var(--c-muted);
text-decoration: none;
font-size: 0.875rem;
display: inline-block;
margin-bottom: 24px;
transition: color 0.15s;
}
.post__back:hover { color: var(--c-lime); }
/* ── Gradient title — lime → violet ── */
.post__title {
font-family: 'Space Grotesk', system-ui, -apple-system, sans-serif;
background: linear-gradient(90deg, var(--c-lime), var(--c-violet));
-webkit-background-clip: text;
background-clip: text;
color: transparent;
font-weight: 700;
font-size: 2.2rem;
line-height: 1.2;
margin: 0 0 16px;
}
/* ── Date pill ── */
.post__date {
display: inline-block;
font-family: 'JetBrains Mono', ui-monospace, 'SF Mono', Consolas, monospace;
font-size: 0.72rem;
letter-spacing: 0.18em;
text-transform: uppercase;
color: var(--c-muted);
border: 1px solid var(--c-border);
border-radius: 999px;
padding: 4px 14px;
margin-bottom: 16px;
}
/* ── Lead ── */
.post__lead {
color: var(--c-muted);
font-size: 1.08rem;
line-height: 1.7;
}
/* ── Body ── */
.post__body { font-size: 1rem; line-height: 1.7; }
.post__body p { margin: 14px 0; }
.post__body h2 {
font-family: 'Space Grotesk', system-ui, -apple-system, sans-serif;
font-weight: 600;
font-size: 1.6rem;
margin: 54px 0 6px;
padding-left: 16px;
border-left: 4px solid var(--c-lime);
line-height: 1.3;
}
.post__body h3 {
font-family: 'Space Grotesk', system-ui, -apple-system, sans-serif;
font-weight: 500;
color: var(--c-violet);
font-size: 1.15rem;
margin: 30px 0 4px;
}
.post__body h4, .post__body h5, .post__body h6 {
margin: 20px 0 4px;
}
/* ── Inline code ── */
.post__body code {
font-family: 'JetBrains Mono', ui-monospace, 'SF Mono', Consolas, monospace;
font-size: 0.84em;
background: var(--c-bg-raised);
color: var(--c-lime);
padding: 2px 6px;
border-radius: 4px;
border: 1px solid var(--c-border);
}
/* ── Code blocks ── */
.post__body pre {
background: var(--c-bg);
border: 1px solid var(--c-border);
border-radius: 8px;
padding: 16px 18px;
overflow-x: auto;
margin: 14px 0;
font-family: 'JetBrains Mono', ui-monospace, 'SF Mono', Consolas, monospace;
font-size: 0.83rem;
line-height: 1.55;
color: var(--c-fg);
}
.post__body pre code {
background: transparent;
padding: 0;
border: none;
font-size: inherit;
color: inherit;
border-radius: 0;
}
/* ── Blockquote ── */
.post__body blockquote {
background: rgba(124, 92, 255, 0.06);
border: 1px solid rgba(124, 92, 255, 0.15);
border-left: 4px solid var(--c-violet);
border-radius: 0 8px 8px 0;
padding: 16px 18px;
margin: 18px 0;
color: var(--c-fg);
font-size: 0.94rem;
}
/* ── Links ── */
.post__body a { color: var(--c-lime); text-decoration: underline; text-underline-offset: 3px; }
.post__body a:hover { color: var(--c-violet); }
/* ── Strong ── */
.post__body strong { color: var(--c-lime); font-weight: 600; }
/* ── HR ── */
.post__body hr {
border: none;
border-top: 1px solid var(--c-border);
margin: 32px 0;
}
/* ── Figures ── */
.post__body figure { margin: 20px 0; }
.post__body figure img {
max-width: 100%;
border-radius: 12px;
border: 1px solid var(--c-border);
}
.post__body figcaption {
color: var(--c-muted);
font-size: 0.85rem;
margin-top: 6px;
}
/* ── Lists ── */
.post__body ul, .post__body ol { padding-left: 1.5em; margin: 10px 0; }
.post__body li { margin: 4px 0; }
/* ── Author ── */
.post__author {
display: flex;
align-items: center;
gap: 14px;
margin-top: 48px;
padding-top: 24px;
border-top: 1px solid var(--c-border);
}
.post__author-avatar {
width: 48px;
height: 48px;
border-radius: 50%;
background: var(--c-violet);
color: #fff;
display: flex;
align-items: center;
justify-content: center;
font-weight: 700;
font-size: 0.85rem;
flex-shrink: 0;
}
.post__author-info {
font-size: 0.85rem;
color: var(--c-muted);
line-height: 1.4;
}
.post__author-name {
color: var(--c-fg);
font-weight: 600;
}
</style>
</head>
<body>
<!-- POST -->
<article class="post">
<a href="/blog/" class="post__back">&larr; All Posts</a>
<span class="post__date">16 June 2026 · updated</span>
<h1 class="post__title">GPU-Skinned Herds: One Draw Call for 1,000 Animated Characters in Godot</h1>
<p class="post__lead">Godot gives you one <code>Skeleton3D</code> per character. Want 200 animals in a herd? That's 200 skeleton nodes, 200 draw calls, and 200 <code>AnimationPlayer</code> ticks every frame. Want 1,000? Now you're measuring in seconds per frame, not frames per second.</p>
<div class="post__body">
<p>We built a GPU skinned-instance renderer into Tinqs Engine that packs every pose into a single texture, uploads once, and draws every instance in one call. 25 crocodiles confirmed first. Then we threw 1,000 animals — 12 types mixed, random-walking — at it and the GPU didn't flinch. <strong>Now upgraded:</strong> mat4×3 palette (37% of original VRAM), far-LOD dominant-bone (3 texel fetches at distance), in-place bake (zero foot-slide), and full frustum cull. Same bone count, same animation fidelity, a tiny fraction of the cost.</p>
<h2>Why the engine needs to change</h2>
<p>The standard Godot approach — one <code>Skeleton3D</code> + one <code>MeshInstance3D</code> per character — works for a handful of animated entities. It breaks down hard at crowd scale:</p>
<ul>
<li><strong>CPU bone transforms.</strong> Computing <code>global_pose</code> for 200 skeletons × 100 bones each = 20,000 matrix multiplies per frame, all on the main thread.</li>
<li><strong>Draw call explosion.</strong> Each <code>MeshInstance3D</code> is its own draw call. Even with MultiMesh, there's no built-in path for skinned meshes — <code>MultiMeshInstance3D</code> only handles static geometry.</li>
<li><strong>AnimationPlayer sprawl.</strong> Each skeleton needs its own <code>AnimationPlayer</code> and its own <code>process()</code> tick.</li>
</ul>
<p>The alternative — baking animations to vertex textures — works for static crowds but locks you out of per-instance variation. No blending, no phase offsets, no reactive behaviour.</p>
<p>What we need is simpler: <strong>share the skeleton, drive per-instance poses from a single animation, batch the draw call.</strong> That's what <code>agent_skinned</code> does.</p>
<h2>How it works: two classes, one texture</h2>
<p>The module lives in <code>modules/agent_skinned/</code> inside <a href="https://tinqs.com/tinqs/engine" style="color: var(--c-lime);">Tinqs Engine</a>. Two classes, one job:</p>
<h3><code>MultiSkinnedMeshInstance3D</code> — the data plane</h3>
<p>Holds the CPU-side bone matrices. Allocates an <code>ImageTexture</code> of size <code>[4 × max_bones, max_instances]</code> in RGBA32F — each texel is one column of a 4×4 bone matrix. For a 130-bone crocodile with 256 instances:</p>
<pre><code>Texture: 520 × 256 RGBA32F ≈ 2 MB</code></pre>
<p>That's the entire pose state for 256 animated crocodiles in a single GPU texture. The API is simple:</p>
<pre><code class="language-gdscript">var data := MultiSkinnedMeshInstance3D.new()
data.set_mesh(crocodile_mesh)
data.set_skeleton(skeleton) # rest pose + bone hierarchy
data.set_max_instances(256)
data.set_max_bones(130)
# Each frame: push poses from the animated skeleton
for instance in herd_positions:
data.set_instance_pose_bones(instance.id, bone_transforms)
data.update() # upload only dirty instances, not the whole texture</code></pre>
<h3><code>MultiSkinnedInstance3D</code> — the renderer</h3>
<p>A <code>MultiMeshInstance3D</code> subclass. Set its multimesh with the skinned mesh and instance transforms, point it at the data plane, call <code>refresh()</code> — it uploads the bone texture into the shader material's <code>bone_matrices_tex</code> uniform and the mesh is drawn in one call.</p>
<p>The shader does 4-bone linear-blend skinning on the GPU:</p>
<pre><code class="language-glsl">mat4 get_bone(int b) {
return mat4(
texelFetch(bone_matrices_tex, ivec2(b * 4 + 0, INSTANCE_ID), 0),
texelFetch(bone_matrices_tex, ivec2(b * 4 + 1, INSTANCE_ID), 0),
texelFetch(bone_matrices_tex, ivec2(b * 4 + 2, INSTANCE_ID), 0),
texelFetch(bone_matrices_tex, ivec2(b * 4 + 3, INSTANCE_ID), 0)
);
}</code></pre>
<p><code>INSTANCE_ID</code> is a Godot built-in — the GPU already knows which instance it's rendering. We just use it to index into the bone texture. No uniform arrays, no SSBOs, no compute shaders. Just a 2D texture and a custom vertex shader.</p>
<h2>Two bugs we shipped and fixed</h2>
<p>The module had data-plane doctests from day one — round-trip pose get/set, dirty tracking, size clamping, AABB. All green. Then we put it on screen for the first time and the crocodiles looked... wrong.</p>
<p><strong>Bug 1: Shader compile failure.</strong> The default skinning shader compared <code>TANGENT</code> as <code>vec4</code>. Godot 4 exposes it as <code>vec3</code>. Fixed in one line, added <code>albedo_tex</code> uniform so herds texture out of the box.</p>
<p><strong>Bug 2: Bone matrices stored transposed.</strong> The data plane wrote basis rows (standard Godot <code>Transform3D.basis</code> is row-major), but the shader unpacked as columns. Every bone matrix was transposed — the mesh crumpled. Not a scale bug, not an orientation bug — a layout mismatch. Fixed by storing column-major, with a doctest to prevent regression.</p>
<p>The lesson: doctests catch logic. Rendering catches truth. You need both.</p>
<h2>What's driving it</h2>
<p>In <a href="https://www.arikigame.com" style="color: var(--c-lime);">Ariki</a>, the sim tracks animal migration across a 12km archipelago. <code>AnimalHerdRenderer.cs</code> groups sim <code>ViewerState.animals</code> by type, feeds positions to <code>skinned_herd.gd</code> (a reusable per-type herd backend), which drives the renderer. One <code>AnimationPlayer</code> animates a single driver skeleton; poses propagate to every instance.</p>
<p>The crocodile herd scene was 25 instances, one draw call. The perf test scene does 1,000 animals across 12 types — Boar, Cow, Crab, Crocodile, Deer, Fish, Goat, Hen, Pig, Rabbit, Sheep, Tiger — each type its own GPU herd, all mixed, all random-walking, FPS holding steady.</p>
<h2>What's deliberately not here</h2>
<ul>
<li><strong>No C# wrapper.</strong> Instantiate from GDScript via <code>ClassDB.instantiate()</code> — the binding surface is small and stable.</li>
<li><strong>No automatic <code>AnimationPlayer</code> integration.</strong> You drive poses. We give you the texture. Freedom to animate however you want.</li>
<li><strong>No GPU occlusion or LOD.</strong> That's the game's job. The engine provides the tool; the game decides what to draw.</li>
</ul>
<h2>What's new in this build (16 June 2026)</h2>
<ul>
<li><strong>mat4x3 palette (B4).</strong> Each bone packs into 3 RGBA16F texels instead of 4 — 37% of the original VRAM and texel fetch cost. Column-major, doctest-guarded.</li>
<li><strong>Far-LOD dominant-bone.</strong> At distance, each instance uses a single nearest-frame bone (~3 texel fetches vs ~24 near). LOD thresholds per-animal, scaled by body size — giraffes stay crisp 3x farther than rats.</li>
<li><strong>In-place bake.</strong> Walk/run clips no longer translate root motion — the bake strips horizontal drift so the sim owns position. Fixed the notorious slide/skate bug across all animal types.</li>
<li><strong>Full frustum cull (C7).</strong> Only on-screen instances hit the GPU. Caught a sign bug where Godot's outward-pointing frustum normals inverted the cull test.</li>
<li><strong>Bulk instance upload (A1).</strong> One <code>MultiMesh.buffer =</code> per herd per frame — zero per-instance native calls.</li>
</ul>
<p>24 doctests green. Visual-verified on Kraken (M1/Metal) and Forge (Windows/RTX).</p>
<h2>Get the build</h2>
<p>Pre-built editor binaries with <code>agent_skinned</code> baked in — no engine compile required. The game's <code>animal_perf_test.tscn</code> lets you toggle 10 / 100 / 1000 animals and read live FPS:</p>
<table style="border-collapse:collapse;width:100%;margin:12px 0;">
<tr style="border-bottom:1px solid var(--c-border);"><th style="text-align:left;padding:8px;color:var(--c-muted);">Platform</th><th style="text-align:left;padding:8px;color:var(--c-muted);">Binary</th><th style="text-align:left;padding:8px;color:var(--c-muted);">Engine commit</th></tr>
<tr style="border-bottom:1px solid var(--c-border);"><td style="padding:8px;"><strong>macOS ARM64</strong></td><td style="padding:8px;"><a href="https://tinqs.com/tinqs/builds/media/branch/main/engine/macos-arm64/tinqs.macos.editor.arm64.mono" style="color: var(--c-lime);"><code>tinqs.macos.editor.arm64.mono</code></a></td><td style="padding:8px;"><code>4fe1323</code> (4.6.4, Xcode 26.3)</td></tr>
<tr style="border-bottom:1px solid var(--c-border);"><td style="padding:8px;"><strong>Windows x64</strong></td><td style="padding:8px;"><a href="https://tinqs.com/tinqs/builds/media/branch/main/engine/windows-x64/tinqs.windows.editor.x86_64.mono.exe" style="color: var(--c-lime);"><code>tinqs.windows.editor.x86_64.mono.exe</code></a></td><td style="padding:8px;"><code>420e74bf</code> (4.6.5, MSVC 2022) &#x1F195;</td></tr>
</table>
<p>All builds live in the public <a href="https://tinqs.com/tinqs/builds" style="color: var(--c-lime);"><code>tinqs/builds</code></a> repo — engine source is private, but the binaries are yours. See <a href="https://tinqs.com/tinqs/builds/src/branch/main/manifest.json" style="color: var(--c-lime);"><code>manifest.json</code></a> for checksums and build details.</p>
<p>The engine source lives in <a href="https://tinqs.com/tinqs/engine" style="color: var(--c-lime);"><code>tinqs/engine</code></a> (private). Module docs: <code>modules/agent_skinned/README.md</code> and <code>.agents/wiki/agent-skinned-gpu-herd.md</code>.</p>
<hr>
<p><strong>Related:</strong> <a href="fork-dont-build" style="color: var(--c-lime);">Fork, Don't Build</a> — why we modify existing platforms instead of building new ones. <a href="godot-optimisation" style="color: var(--c-lime);">Streaming a 12km Archipelago in Godot 4</a> — the terrain and vegetation streaming layers that work alongside this.</p>
</div>
<div class="post__author">
<div class="post__author-avatar">OB</div>
<div class="post__author-info">
<span class="post__author-name">Ozan Bozkurt</span><br>
CTO & Developer, Tinqs
</div>
</div>
</article>
</body>
</html>