14 June 2026

GPU-Skinned Herds: One Draw Call for 1,000 Animated Characters in Godot

Godot gives you one Skeleton3D per character. Want 200 animals in a herd? That's 200 skeleton nodes, 200 draw calls, and 200 AnimationPlayer ticks every frame. Want 1,000? Now you're measuring in seconds per frame, not frames per second.

We built a GPU skinned-instance renderer into Tinqs Engine that packs every pose into a single texture, uploads once, and draws every instance in one call. 25 crocodiles on screen right now. 1,000+ projected. Same bone count, same animation fidelity — a tiny fraction of the cost.

Why the engine needs to change

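The standard Godot approach (one Skeleton3D plus one MeshInstance3D per character) works for a handful of animated entities. It breaks down hard at crowd scale, because every character adds its own skeleton node, its own draw call, and its own AnimationPlayer tick.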

The alternative — baking animations to vertex textures — works for static crowds but locks you out of per-instance variation. No blending, no phase offsets, no reactive behaviour.

What we need is simpler: share the skeleton, drive per-instance poses from a single animation, batch the draw call. That's what agent_skinned does.

How it works: two classes, one texture

The module lives in modules/agent_skinned/ inside Tinqs Engine. Two classes, one job:

MultiSkinnedMeshInstance3D — the data plane

Holds the CPU-side bone matrices. Allocates an ImageTexture of size [4 × max_bones, max_instances] in RGBA32F — each texel is one column of a 4×4 bone matrix. For a 130-bone crocodile with 256 instances:

Texture: 520 × 256 RGBA32F ≈ 2 MB
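The size figure follows from simple arithmetic: four texels per bone across, one row per instance down, and 16 bytes per RGBA32F texel. A minimal sketch of that calculation, where the helper names are hypothetical and not part of agent_skinned:

```gdscript
# Hypothetical helpers, not part of agent_skinned: they only size the bone texture.
const TEXELS_PER_BONE := 4   # one RGBA texel per column of the 4x4 matrix
const BYTES_PER_TEXEL := 16  # RGBA32F = 4 channels x 4 bytes

static func bone_texture_size(max_bones: int, max_instances: int) -> Vector2i:
    # Width holds every bone's four columns; each row is one instance.
    return Vector2i(TEXELS_PER_BONE * max_bones, max_instances)

static func bone_texture_bytes(max_bones: int, max_instances: int) -> int:
    var size := bone_texture_size(max_bones, max_instances)
    return size.x * size.y * BYTES_PER_TEXEL
```

For 130 bones and 256 instances this gives 520 × 256 texels and 2,129,920 bytes, which is where the roughly 2 MB figure comes from.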

That's the entire pose state for 256 animated crocodiles in a single GPU texture. The API is simple:

var data := MultiSkinnedMeshInstance3D.new()
data.set_mesh(crocodile_mesh)
data.set_skeleton(skeleton)       # rest pose + bone hierarchy
data.set_max_instances(256)
data.set_max_bones(130)

# Each frame: push poses from the animated skeleton
for instance in herd_positions:
    data.set_instance_pose_bones(instance.id, bone_transforms)
data.update()   # upload only dirty instances, not the whole texture
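How the module tracks dirty instances internally is not shown in this post. As a rough illustration of the idea only, the bookkeeping could look like the sketch below: remember which instance rows changed, then coalesce them into contiguous row ranges before uploading. Every name here is hypothetical.

```gdscript
# Hypothetical bookkeeping sketch; the module's real dirty tracking may differ.
# Instance ids whose pose changed since the last upload (Dictionary used as a set).
var _dirty := {}

func mark_dirty(instance_id: int) -> void:
    _dirty[instance_id] = true

# Returns [first_row, row_count] pairs covering the dirty rows, then clears the set.
func take_dirty_ranges() -> Array:
    var ids: Array = _dirty.keys()
    ids.sort()
    _dirty.clear()
    var ranges: Array = []
    for id in ids:
        if not ranges.is_empty():
            var last: Array = ranges[-1]
            if last[0] + last[1] == id:
                last[1] += 1  # id extends the current run of rows
                continue
        ranges.append([id, 1])  # id starts a new run
    return ranges
```

Since each instance occupies exactly one texture row, a range of instance ids maps directly onto a horizontal band of the texture.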

MultiSkinnedInstance3D — the renderer

A MultiMeshInstance3D subclass. Set its multimesh with the skinned mesh and instance transforms, point it at the data plane, call refresh() — it uploads the bone texture into the shader material's bone_matrices_tex uniform and the mesh is drawn in one call.
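The multimesh half of that setup is stock Godot. A minimal sketch of building one from a list of instance transforms, using only the standard MultiMesh API (the function name is hypothetical, and the module-specific steps of pointing at the data plane and calling refresh() are left out):

```gdscript
# Stock Godot 4 MultiMesh setup; the module's own wiring is not shown here.
static func build_herd_multimesh(mesh: Mesh, transforms: Array[Transform3D]) -> MultiMesh:
    var mm := MultiMesh.new()
    # The transform format has to be chosen while instance_count is still zero.
    mm.transform_format = MultiMesh.TRANSFORM_3D
    mm.mesh = mesh
    mm.instance_count = transforms.size()
    for i in transforms.size():
        mm.set_instance_transform(i, transforms[i])
    return mm
```

The per-instance transform places each animal in the world; the bone texture only carries the pose, so the two can be updated independently.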

The shader does 4-bone linear-blend skinning on the GPU:

mat4 get_bone(int b) {
    return mat4(
        texelFetch(bone_matrices_tex, ivec2(b * 4 + 0, INSTANCE_ID), 0),
        texelFetch(bone_matrices_tex, ivec2(b * 4 + 1, INSTANCE_ID), 0),
        texelFetch(bone_matrices_tex, ivec2(b * 4 + 2, INSTANCE_ID), 0),
        texelFetch(bone_matrices_tex, ivec2(b * 4 + 3, INSTANCE_ID), 0)
    );
}

INSTANCE_ID is a Godot built-in — the GPU already knows which instance it's rendering. We just use it to index into the bone texture. No uniform arrays, no SSBOs, no compute shaders. Just a 2D texture and a custom vertex shader.
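For checking shader output against known values, it can help to have the same maths on the CPU. The sketch below is an illustrative GDScript reference for 4-bone linear-blend skinning, not code from the module. It assumes each entry in bones is already a skinning transform (current pose relative to bind pose) and that the four weights sum to 1.

```gdscript
# CPU reference for 4-bone linear-blend skinning (illustrative, not module code).
# bones[i] is the skinning transform for bone i; weights are expected to sum to 1.
static func skin_vertex(v: Vector3, bones: Array[Transform3D],
        indices: PackedInt32Array, weights: PackedFloat32Array) -> Vector3:
    var out := Vector3.ZERO
    for k in 4:
        # Transform3D * Vector3 applies rotation, scale, and translation.
        out += weights[k] * (bones[indices[k]] * v)
    return out
```

Blending the four transformed positions gives the same result as blending the four matrices first and transforming once, because the operation is linear in the weights.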

Two bugs we shipped and fixed

The module had data-plane doctests from day one — round-trip pose get/set, dirty tracking, size clamping, AABB. All green. Then we put it on screen for the first time and the crocodiles looked... wrong.

Bug 1: Shader compile failure. The default skinning shader treated TANGENT as a vec4, but Godot 4 exposes it as a vec3. Fixed in one line, and we added an albedo_tex uniform so herds are textured out of the box.

Bug 2: Bone matrices stored transposed. The data plane wrote basis rows (standard Godot Transform3D.basis is row-major), but the shader unpacked as columns. Every bone matrix was transposed — the mesh crumpled. Not a scale bug, not an orientation bug — a layout mismatch. Fixed by storing column-major, with a doctest to prevent regression.
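To make the layout concrete, here is a sketch of a column-major write using the stock Godot 4 Image API. It is an illustration under assumptions, not the module's code: the function name is hypothetical, and the image is assumed to be in FORMAT_RGBAF with four texels per bone. The buggy variant would have written basis rows such as (b.x.x, b.y.x, b.z.x) into each texel instead.

```gdscript
# Illustrative packer using stock Godot 4 Image calls; not the module's code.
# Assumes img was created with Image.FORMAT_RGBAF and is wide enough for the bone.
static func write_bone(img: Image, instance_id: int, bone: int, xf: Transform3D) -> void:
    var b := xf.basis
    var x0 := bone * 4
    # In GDScript, Basis.x / .y / .z are the matrix columns (the transformed axes).
    img.set_pixel(x0 + 0, instance_id, Color(b.x.x, b.x.y, b.x.z, 0.0))
    img.set_pixel(x0 + 1, instance_id, Color(b.y.x, b.y.y, b.y.z, 0.0))
    img.set_pixel(x0 + 2, instance_id, Color(b.z.x, b.z.y, b.z.z, 0.0))
    # The translation is the fourth column, with w = 1.
    img.set_pixel(x0 + 3, instance_id, Color(xf.origin.x, xf.origin.y, xf.origin.z, 1.0))
```

Written this way, the four texels line up with the four column arguments that the shader's mat4(...) constructor expects.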

The lesson: doctests catch logic. Rendering catches truth. You need both.

What's driving it

In Ariki, the sim tracks animal migration across a 12 km archipelago. AnimalHerdRenderer.cs groups the sim's ViewerState.animals by type and feeds positions to skinned_herd.gd (a reusable per-type herd backend), which drives the renderer. One AnimationPlayer animates a single driver skeleton; poses propagate to every instance.
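As a rough sketch of the driver-skeleton step, the per-bone skinning transforms can be read off a Skeleton3D with standard Godot calls. This is illustrative only: the function name is hypothetical, it assumes the mesh's bind pose equals the skeleton's rest pose, and the exact array type that set_instance_pose_bones expects is not specified in this post.

```gdscript
# Illustrative only: assumes the mesh's bind pose equals the skeleton's rest pose.
static func collect_skinning_transforms(driver: Skeleton3D) -> Array[Transform3D]:
    var out: Array[Transform3D] = []
    for i in driver.get_bone_count():
        # Skinning transform = current global pose * inverse of the global rest pose.
        var pose := driver.get_bone_global_pose(i)
        var rest := driver.get_bone_global_rest(i)
        out.append(pose * rest.affine_inverse())
    return out
```

Because every instance shares the driver's pose, this runs once per herd type per frame rather than once per animal.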

The crocodile herd scene is 25 instances, one draw call. The same pipeline projects to 200–1,000 before the GPU budget even notices.

What's deliberately not here

Get the build

Pre-built editor binaries with agent_skinned baked in — no engine compile required:

| Platform | Binary | Engine commit (version, toolchain) |
|----------|--------|------------------------------------|
| macOS ARM64 | tinqs.macos.editor.arm64.mono | 4fe1323 (4.6.4, Xcode 26.3) |
| Windows x64 | tinqs.windows.editor.x86_64.mono.exe | 64fb5cc (4.6.4, MSVC 2022) |

All builds live in the public tinqs/builds repo — engine source is private, but the binaries are yours. See manifest.json for checksums and build details.

The engine source lives in tinqs/engine (private). Module docs: modules/agent_skinned/README.md and .agents/wiki/agent-skinned-gpu-herd.md.


Related: Fork, Don't Build (why we modify existing platforms instead of building new ones) and Streaming a 12km Archipelago in Godot 4 (the terrain and vegetation streaming layers that work alongside this).