---
title: "GPU-Skinned Herds: One Draw Call for 1,000 Animated Characters in Godot"
slug: gpu-skinned-herds
date: "2026-06-14"
description: "Godot has no built-in way to render 1,000 skinned characters in one draw call. We built a GPU skinned-instance renderer into Tinqs Engine that does: 25 crocodiles verified, 1,000+ projected. Pre-built binaries for macOS and Windows."
og_description: "One draw call, 1,000 animated characters. GPU-skinned herd renderer built into the Tinqs Engine fork of Godot."
og_image: "https://www.tinqs.com/img/og-cover.jpg"
excerpt: "Godot can't batch-render 1,000 animated characters. We built a GPU skinned-instance herd renderer into the engine itself, already driving crocodile herds in Ariki. Pre-built editor binaries for macOS and Windows."
author: "Ozan Bozkurt"
author_initials: "OB"
author_role: "CTO & Developer, Tinqs"
---

Godot gives you one `Skeleton3D` per character. Want 200 animals in a herd? That's 200 skeleton nodes, 200 draw calls, and 200 `AnimationPlayer` ticks every frame. Want 1,000? Now you're measuring in seconds per frame, not frames per second.

We built a GPU skinned-instance renderer into Tinqs Engine that packs every pose into a single texture, uploads once, and draws every instance in one call. 25 crocodiles on screen right now. 1,000+ projected. Same bone count, same animation fidelity, at a tiny fraction of the cost.

## Why the engine needs to change

The standard Godot approach (one `Skeleton3D` plus one `MeshInstance3D` per character) works for a handful of animated entities. It breaks down hard at crowd scale:

- **CPU bone transforms.** Computing `global_pose` for 200 skeletons × 100 bones each = 20,000 matrix multiplies per frame, all on the main thread.
- **Draw call explosion.** Each `MeshInstance3D` is its own draw call. Even with MultiMesh, there's no built-in path for skinned meshes: `MultiMeshInstance3D` only handles static geometry.
- **AnimationPlayer sprawl.** Each skeleton needs its own `AnimationPlayer` and its own `_process()` tick.

The alternative, baking animations to vertex textures, works for static crowds but locks you out of per-instance variation. No blending, no phase offsets, no reactive behaviour.

What we need is simpler: **share the skeleton, drive per-instance poses from a single animation, batch the draw call.**

That's what `agent_skinned` does.

## How it works: two classes, one texture

The module lives in `modules/agent_skinned/` inside [Tinqs Engine](https://tinqs.com/tinqs/engine). Two classes, one job:

### `MultiSkinnedMeshInstance3D` — the data plane

Holds the CPU-side bone matrices. Allocates an `ImageTexture` of size `[4 × max_bones, max_instances]` in RGBA32F, where each texel is one column of a 4×4 bone matrix. For a 130-bone crocodile with 256 instances:

```
Texture: 520 × 256 RGBA32F ≈ 2 MB
```

That's the entire pose state for 256 animated crocodiles in a single GPU texture.

The API is simple:

```gdscript
var data := MultiSkinnedMeshInstance3D.new()
data.set_mesh(crocodile_mesh)
data.set_skeleton(skeleton)  # rest pose + bone hierarchy
data.set_max_instances(256)
data.set_max_bones(130)

# Each frame: push poses from the animated skeleton
for instance in herd_positions:
    data.set_instance_pose_bones(instance.id, bone_transforms)
data.update()  # upload only dirty instances, not the whole texture
```

### `MultiSkinnedInstance3D` — the renderer

A `MultiMeshInstance3D` subclass. Set its multimesh with the skinned mesh and instance transforms, point it at the data plane, and call `refresh()`. It uploads the bone texture into the shader material's `bone_matrices_tex` uniform, and the mesh is drawn in one call.

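
To make the data-plane layout concrete, here is a minimal GDScript sketch of the packing it implies: four texels per bone along X, one row per instance along Y, each texel holding one matrix column. This is an illustration, not the module's code. The function names are hypothetical, and the use of `Image.set_pixel` is an assumption made for readability; the real data plane may well write a raw float buffer instead.

```gdscript
# Illustrative only: these helpers are hypothetical, not agent_skinned's API.
extends RefCounted

const BYTES_PER_TEXEL := 16  # RGBA32F = 4 channels x 4 bytes each

# Size in bytes of a [4 * max_bones, max_instances] RGBA32F texture.
static func texture_bytes(max_bones: int, max_instances: int) -> int:
	return 4 * max_bones * max_instances * BYTES_PER_TEXEL

# Allocate a CPU-side image: four texels per bone, one row per instance.
static func make_pose_image(max_bones: int, max_instances: int) -> Image:
	return Image.create(4 * max_bones, max_instances, false, Image.FORMAT_RGBAF)

# Write one bone as four column texels. In GDScript, basis.x / .y / .z are the
# basis axes, which are the columns a shader would pass to mat4(...).
static func write_bone(img: Image, instance: int, bone: int, t: Transform3D) -> void:
	var x := bone * 4
	img.set_pixel(x + 0, instance, Color(t.basis.x.x, t.basis.x.y, t.basis.x.z, 0.0))
	img.set_pixel(x + 1, instance, Color(t.basis.y.x, t.basis.y.y, t.basis.y.z, 0.0))
	img.set_pixel(x + 2, instance, Color(t.basis.z.x, t.basis.z.y, t.basis.z.z, 0.0))
	img.set_pixel(x + 3, instance, Color(t.origin.x, t.origin.y, t.origin.z, 1.0))
```

For the 130-bone, 256-instance case, `texture_bytes(130, 256)` works out to 520 × 256 × 16 = 2,129,920 bytes, which is the roughly 2 MB quoted above. Writing the basis axes and origin as separate texels keeps the layout column-major, so a shader can rebuild the matrix from four consecutive fetches.
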
The shader does 4-bone linear-blend skinning on the GPU:

```glsl
mat4 get_bone(int b) {
    return mat4(
        texelFetch(bone_matrices_tex, ivec2(b * 4 + 0, INSTANCE_ID), 0),
        texelFetch(bone_matrices_tex, ivec2(b * 4 + 1, INSTANCE_ID), 0),
        texelFetch(bone_matrices_tex, ivec2(b * 4 + 2, INSTANCE_ID), 0),
        texelFetch(bone_matrices_tex, ivec2(b * 4 + 3, INSTANCE_ID), 0)
    );
}
```

`INSTANCE_ID` is a Godot built-in: the GPU already knows which instance it's rendering. We just use it to index into the bone texture. No uniform arrays, no SSBOs, no compute shaders. Just a 2D texture and a custom vertex shader.

## Two bugs we shipped and fixed

The module had data-plane doctests from day one: round-trip pose get/set, dirty tracking, size clamping, AABB. All green. Then we put it on screen for the first time and the crocodiles looked... wrong.

**Bug 1: Shader compile failure.** The default skinning shader treated `TANGENT` as a `vec4`. Godot 4 exposes it as a `vec3`. Fixed in one line, and we added an `albedo_tex` uniform so herds texture out of the box.

**Bug 2: Bone matrices stored transposed.** The data plane wrote basis rows (standard Godot `Transform3D.basis` is row-major), but the shader unpacked them as columns. Every bone matrix was transposed, and the mesh crumpled. Not a scale bug, not an orientation bug: a layout mismatch. Fixed by storing column-major, with a doctest to prevent regression.

The lesson: doctests catch logic. Rendering catches truth. You need both.

## What's driving it

In [Ariki](https://www.arikigame.com), the sim tracks animal migration across a 12 km archipelago. `AnimalHerdRenderer.cs` groups sim `ViewerState.animals` by type and feeds positions to `skinned_herd.gd` (a reusable per-type herd backend), which drives the renderer. One `AnimationPlayer` animates a single driver skeleton; poses propagate to every instance.

The crocodile herd scene is 25 instances, one draw call. The same pipeline projects to 200–1,000 before the GPU budget even notices.

## What's deliberately not here

- **No C# wrapper.** Instantiate from GDScript via `ClassDB.instantiate()`. The binding surface is small and stable.
- **No automatic `AnimationPlayer` integration.** You drive poses. We give you the texture. Freedom to animate however you want.
- **No GPU occlusion or LOD.** That's the game's job. The engine provides the tool; the game decides what to draw.

## Get the build

Pre-built editor binaries with `agent_skinned` baked in, no engine compile required:

| Platform | Binary | Engine commit |
|----------|--------|---------------|
| **macOS ARM64** | [`tinqs.macos.editor.arm64.mono`](https://tinqs.com/tinqs/builds/media/branch/main/engine/macos-arm64/tinqs.macos.editor.arm64.mono) | `4fe1323` (4.6.4, Xcode 26.3) |
| **Windows x64** | [`tinqs.windows.editor.x86_64.mono.exe`](https://tinqs.com/tinqs/builds/media/branch/main/engine/windows-x64/tinqs.windows.editor.x86_64.mono.exe) | `64fb5cc` (4.6.4, MSVC 2022) |

All builds live in the public [`tinqs/builds`](https://tinqs.com/tinqs/builds) repo. The engine source is private, but the binaries are yours. See [`manifest.json`](https://tinqs.com/tinqs/builds/src/branch/main/manifest.json) for checksums and build details.

The engine source lives in [`tinqs/engine`](https://tinqs.com/tinqs/engine) (private). Module docs: `modules/agent_skinned/README.md` and `.agents/wiki/agent-skinned-gpu-herd.md`.

---

**Related:** [Fork, Don't Build](fork-dont-build): why we modify existing platforms instead of building new ones. [Streaming a 12km Archipelago in Godot 4](godot-optimisation): the terrain and vegetation streaming layers that work alongside this.