rewrite: refresh all blog posts for public audience
Merged overlapping posts:
- forking-gitea + fork-dont-build → one post about the fork philosophy
- fal-image-generation + image-generation-fal → one post about AI art pipeline

Rewrote all posts with external/public voice:
- Stronger hooks, concrete examples, punchier language
- agentic-workflow: restructured around soul files + skills + numbers
- agent-harness: clearer framing of 'what an agent harness is'
- cloud-harness: tighter narrative about overnight agents
- godot-optimisation: same depth, sharper opening
- pre-commit-agent: clearer architecture, cost breakdown
- studio-cli: reframed around identity/cold-start problem
- blog-visual-upgrade: tightened the restyle story

10 posts total (9 markdown + 1 hand-authored HTML)
+49
-71
@@ -2,115 +2,93 @@
title: "Streaming a 12km Archipelago in Godot 4"
slug: godot-optimisation
date: "2026-05-22"
description: "How we built four streaming layers, async resource loading, and memory-safe caches to run a 12km open world in Godot 4 with C#."
og_description: "Four streaming layers, async loading, and zero memory leaks --- optimising Godot for a large open world."
description: "Godot 4 has no built-in asset streaming. We built four layers — terrain regions, vegetation chunks, async loading, and entity rendering — to run a 12km open world with 9 islands, 155 vegetation types, and 2,000 crowd instances."
og_description: "Four streaming layers, async loading, and zero memory leaks — running a 12km open world in Godot 4."
og_image: "https://www.tinqs.com/img/og-cover.jpg"
excerpt: "Four streaming layers, async resource loading, memory-safe caches, and zero leaks. How we built a 12km open world in Godot 4 with C#."
excerpt: "Godot has no built-in asset streaming. We built four layers to run a 12km archipelago with 9 islands, 155 vegetation types, and 2,000 crowd instances — on an RTX 3060."
author: "Ozan Bozkurt"
author_initials: "OB"
author_role: "CTO & Developer, Tinqs"
---
Godot has no built-in asset streaming. Our game is a 12km x 12km archipelago with 9 islands, thousands of trees, hundreds of buildings, and an ocean that never ends. Here's how we made it run.
Godot 4 has no terrain streaming, no asset LOD pipeline, and no distance-based loading. Our game is a 12km × 12km archipelago with 9 islands, 155 vegetation prototypes, and 2,000 simulated colonists. If you load everything at startup, you run out of VRAM before the player sees the main menu.
## The Problem
Here's how we built four streaming layers on top of Godot, all in C#, to make it work.
We're building a survival colony sim set across 9 islands. The total world is roughly 12km x 12km. Each island is 4km across with its own terrain heightmap, biome textures, vegetation prototypes, and building grids. The player can travel between islands by canoe.
## The scale problem
Godot 4 is a fantastic engine, but it wasn't designed for this scale. There's no terrain streaming, no asset LOD pipeline, no distance-based loading. If you load everything at startup, you run out of VRAM before the player sees the main menu. So we built four streaming layers on top of Godot, all in C#.
Each island is roughly 4km across with its own terrain heightmap, biome textures, vegetation, and building grids. The player travels between islands by canoe. At any given moment, only a small fraction of the world is visible — but Godot doesn't know that unless you tell it.
## Layer 1: Terrain Regions
We built four layers that teach Godot what to load, when to load it, and when to let it go.
We use **Terrain3D** for heightmaps --- a GDExtension that gives us a clipmap renderer with 7 LOD levels. Internally, Terrain3D divides each island into 512m x 512m regions. A 4km island has 64 regions. Across 9 islands, that's 576 regions total.
## Layer 1: Terrain regions (lazy instantiation)
The key insight: **don't create all 9 terrain nodes at startup.** Each node allocates a clipmap mesh, collision structures, and materials even when hidden. Our original code created all 9 in `_Ready()` and just toggled visibility. This wasted hundreds of megabytes on islands the player hadn't visited yet.
We use **Terrain3D** for heightmaps — a GDExtension that provides clipmap rendering with 7 LOD levels. Each island is split into 512m × 512m regions. A 4km island has 64 regions. Nine islands: 576 regions total.
The fix was lazy instantiation. We create the current island's terrain on startup and defer the rest. When the player gets in a canoe and sails to a new island, we create that island's terrain node on demand, import the heightmap, and start async texture loading --- all while a loading screen covers the transition.
The original code created all 9 terrain nodes in `_Ready()` and toggled visibility. This wasted hundreds of megabytes on islands the player hadn't visited. The fix: create the current island's terrain on startup, defer the rest. When the player sails to a new island, create that island's terrain node on demand, import the heightmap, start async texture loading — all behind a loading screen.
## Layer 2: Vegetation Chunks (128m Grid)
## Layer 2: Vegetation chunks (128m grid)
This is the main prop streaming system. Every island's vegetation --- trees, rocks, grasses, shrubs --- is divided into a spatial grid of 128m x 128m chunks.
The main prop streaming system. Every island's vegetation is divided into a spatial grid of 128m × 128m chunks.
The camera position is checked every 0.5 seconds. When it crosses a chunk boundary, we calculate which chunks should be active within a 400m radius (roughly 39 chunks in a circle), `QueueFree` chunks that fell out of range, and build new chunks that entered range.
The camera position is checked every 0.5 seconds. When it crosses a chunk boundary, we calculate which chunks should be active within a 400m radius (~39 chunks), destroy chunks that fell out of range, and build new ones that entered. Each chunk groups vegetation by prototype, creates a **MultiMesh** per group, and places instances using height queries. A chunk with 50 palm trees and 30 rocks becomes 2 MultiMesh draw calls — not 80 individual nodes.
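
A minimal sketch of the chunk selection step described above, assuming chunks are keyed by integer grid coordinates and count as in range when their centre lies within 400m of the camera. `ChunkSelector`, `Desired`, and `Diff` are hypothetical names, and the centre-distance test is an assumption, so the exact chunk count may differ from the ~39 quoted in the post.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: names and the centre-distance rule are assumptions.
public static class ChunkSelector
{
    public const float ChunkSize = 128f;    // metres, from the post
    public const float ActiveRadius = 400f; // metres, from the post

    // Integer grid coordinate of the chunk containing a world position.
    public static (int X, int Z) ChunkOf(float worldX, float worldZ) =>
        ((int)MathF.Floor(worldX / ChunkSize), (int)MathF.Floor(worldZ / ChunkSize));

    // All chunks whose centre lies within ActiveRadius of the camera.
    public static HashSet<(int X, int Z)> Desired(float camX, float camZ)
    {
        var result = new HashSet<(int X, int Z)>();
        var (cx, cz) = ChunkOf(camX, camZ);
        int reach = (int)MathF.Ceiling(ActiveRadius / ChunkSize);
        for (int x = cx - reach; x <= cx + reach; x++)
        {
            for (int z = cz - reach; z <= cz + reach; z++)
            {
                float dx = (x + 0.5f) * ChunkSize - camX;
                float dz = (z + 0.5f) * ChunkSize - camZ;
                if (dx * dx + dz * dz <= ActiveRadius * ActiveRadius)
                    result.Add((x, z));
            }
        }
        return result;
    }

    // Splits the change into chunks to build and chunks to free.
    public static (List<(int X, int Z)> ToBuild, List<(int X, int Z)> ToFree) Diff(
        HashSet<(int X, int Z)> active, HashSet<(int X, int Z)> desired)
    {
        var toBuild = new List<(int X, int Z)>();
        var toFree = new List<(int X, int Z)>();
        foreach (var c in desired) if (!active.Contains(c)) toBuild.Add(c);
        foreach (var c in active) if (!desired.Contains(c)) toFree.Add(c);
        return (toBuild, toFree);
    }
}
```

Running `Desired` only when `ChunkOf` returns a different cell than last time gives the early exit mentioned above; the 0.5 second throttle would sit in the caller.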
Each chunk groups vegetation instances by prototype, creates a **MultiMesh** per group, and places instances using height queries. A chunk with 50 palm trees and 30 rocks becomes 2 MultiMesh draw calls, not 80 individual nodes.
The cache problem: vegetation meshes and materials are cached in dictionaries keyed by prototype name. These caches are append-only by default — visit all 9 islands and you accumulate every mesh variant permanently. The fix is island-scoped eviction. When the player leaves an island, we clear vegetation caches. They reload from disk on return, behind a loading screen.
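
One way the island-scoped eviction could look, as a sketch only. `IslandScopedCache`, `GetOrLoad`, and `OnIslandLeft` are hypothetical names, and `TResource` stands in for a Godot `Mesh` or `Material` so the example runs without the engine.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: TResource stands in for a Godot Mesh or Material.
public sealed class IslandScopedCache<TResource>
{
    private readonly Dictionary<string, TResource> _byPrototype = new();

    public int Count => _byPrototype.Count;

    // Returns the cached resource, loading it from disk on first use.
    public TResource GetOrLoad(string prototypeName, Func<string, TResource> load)
    {
        if (!_byPrototype.TryGetValue(prototypeName, out var resource))
        {
            resource = load(prototypeName);
            _byPrototype[prototypeName] = resource;
        }
        return resource;
    }

    // Call when the player leaves an island. Dropping the references lets
    // the resources be released once nothing else holds them; in Godot C#
    // that may also depend on the managed wrapper being disposed or collected.
    public void OnIslandLeft() => _byPrototype.Clear();
}
```

Keeping the loader as a delegate means the same cache can wrap a synchronous load during a loading screen or hand back a mesh that was pre-warmed earlier.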
### The cache problem
## Layer 3: Async resource loading
Vegetation meshes and materials are cached in dictionaries keyed by prototype name. The problem: these caches are **append-only**. Visit all 9 islands and you accumulate every mesh and material variant permanently. With 155 unique prototypes across the archipelago, that's a lot of GPU memory that never gets freed.
Godot's `GD.Load()` is synchronous. It blocks the main thread. During gameplay, the frame freezes.
The fix is island-scoped eviction. When the player leaves an island, we clear the vegetation caches. Meshes and materials for the departed island are released. If the player returns, they reload from disk. The loading screen covers this cost.
We audited the entire codebase and found **26 resource load calls across 13 files** — only 1 was async. The worst offender was `GetMeshForProto()` in the vegetation grid. As the player walks across a new island, every new vegetation prototype triggers a synchronous load. With 155 prototypes, the first traversal stutters visibly.
## Layer 3: Async Resource Loading
Two fixes:
Godot's `GD.Load()` is synchronous. It blocks the main thread. During gameplay, the frame freezes. We audited the entire codebase and found **26 resource load calls across 13 files**, and only 1 was async.
- **Pre-warm during loading screens.** When an island is imported, kick off background loads for all known prototypes. By the time the player gains control, most meshes are cached.
- **Async texture loading.** Terrain textures use `ResourceLoader.LoadThreadedRequest()` with `_Process()` polling. The terrain renders immediately with autoshader colors; biome textures pop in when ready.
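
A sketch of the request-then-poll pattern both fixes rely on, using Godot 4's `ResourceLoader.LoadThreadedRequest`, `LoadThreadedGetStatus`, and `LoadThreadedGet`. The class name `AsyncTextureLoader` and the `Queue` and `ApplyTexture` methods are hypothetical; the post does not show the real terrain manager, and this only runs inside a Godot project.

```csharp
using System.Collections.Generic;
using Godot;

// Hypothetical node: class and method names are illustrative only.
public partial class AsyncTextureLoader : Node
{
    private readonly List<string> _pending = new();

    // Starts a background load and returns immediately.
    public void Queue(string path)
    {
        if (ResourceLoader.LoadThreadedRequest(path) == Error.Ok)
            _pending.Add(path);
    }

    public override void _Process(double delta)
    {
        // Iterate backwards so finished entries can be removed in place.
        for (int i = _pending.Count - 1; i >= 0; i--)
        {
            string path = _pending[i];
            var status = ResourceLoader.LoadThreadedGetStatus(path);
            if (status == ResourceLoader.ThreadLoadStatus.InProgress)
                continue;

            if (status == ResourceLoader.ThreadLoadStatus.Loaded)
                ApplyTexture(path, ResourceLoader.LoadThreadedGet(path) as Texture2D);
            else
                GD.PushWarning($"Async load failed: {path}");

            _pending.RemoveAt(i);
        }
    }

    private void ApplyTexture(string path, Texture2D texture)
    {
        // Placeholder: assign the texture to the terrain material here.
    }
}
```

Pre-warming could reuse the same queue: call `Queue` for every known prototype path while the loading screen is up, and keep the loaded resources in the vegetation cache instead of applying them to a material.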
The worst offender was `GetMeshForProto()` in the vegetation grid. As the player walks across an island for the first time, every new vegetation prototype triggers a synchronous load. With 155 prototypes, the first traversal stutters visibly.
The ResourceLoader trap: Godot maintains an internal resource cache. Every `GD.Load()` caches the result globally. If you load an FBX as a `PackedScene`, instantiate it to extract a mesh, then free the instance — the PackedScene **stays cached**. Rule: use `ResourceLoader.Load(path, "", CacheMode.Ignore)` for one-shot loads where you extract data and discard the container.
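
A sketch of that rule applied to mesh extraction, assuming the scene contains at least one `MeshInstance3D`. `MeshExtractor` and its methods are hypothetical names; `ResourceLoader.Load<T>` with `CacheMode.Ignore`, `Instantiate`, and `QueueFree` are standard Godot 4 C# calls. It only runs inside a Godot project.

```csharp
using Godot;

// Hypothetical helper: names are illustrative, not from the codebase.
public static class MeshExtractor
{
    // Loads a scene without caching it, returns the first mesh found,
    // and frees the temporary instance.
    public static Mesh ExtractFirstMesh(string scenePath)
    {
        var scene = ResourceLoader.Load<PackedScene>(
            scenePath, null, ResourceLoader.CacheMode.Ignore);
        if (scene == null) return null;

        Node instance = scene.Instantiate();
        Mesh mesh = FindMesh(instance);
        instance.QueueFree(); // the Mesh is a Resource, so it outlives the node
        return mesh;
    }

    // Depth-first search for the first MeshInstance3D that has a mesh.
    private static Mesh FindMesh(Node node)
    {
        if (node is MeshInstance3D meshInstance && meshInstance.Mesh != null)
            return meshInstance.Mesh;

        foreach (Node child in node.GetChildren())
        {
            Mesh found = FindMesh(child);
            if (found != null) return found;
        }
        return null;
    }
}
```

Resources that the scene references externally may still pass through Godot's cache, so it is worth confirming the effect with VRAM logging rather than assuming it.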
We fixed this in two ways:
## Layer 4: Entity rendering (event-driven)
- **Pre-warm during loading screens.** When an island is imported, we kick off background loads for all known prototypes. By the time the player gains control, most meshes are already cached.
- **Async loading for biome textures.** Terrain textures use `ResourceLoader.LoadThreadedRequest()` with `_Process()` polling. The terrain renders immediately with autoshader colours, and biome textures pop in when ready. The player never notices.
Dynamic entities — colonists, animals, buildings, VFX — update when the simulation pushes new state, not per frame.
### The ResourceLoader cache trap
On top of our own caches, Godot maintains an internal resource cache. Every `GD.Load()` call caches the result globally. There's no API to query the cache size or evict entries.
If you load an FBX as a `PackedScene`, instantiate it to extract a mesh, then free the instance --- the PackedScene **stays cached**. The mesh you extracted is fine (it's a Resource, not a Node), but the discarded scene wastes memory forever.
The rule: use `ResourceLoader.Load(path, "", CacheMode.Ignore)` for one-shot loads where you extract data and discard the container. Use `GD.Load()` only for things that should persist (shaders, shared textures).
## Layer 4: Entity Rendering
Dynamic entities --- colonists, animals, buildings, VFX --- are event-driven, not streamed. They update when the simulation pushes new state, not per frame.
- **Crowd rendering:** Single MultiMesh for up to 2000 colonists. Positions lerped per frame from pre-allocated arrays. Labels distance-culled, capped at 20. No individual nodes, no per-frame allocation.
- **Animals:** One MultiMesh per type. Max 500 per type. Updates only on state change, not per frame.
- **Buildings:** Tracked by ID from sim state. `QueueFree` when removed. Self-cleaning.
- **Crowd rendering:** Single MultiMesh for up to 2,000 colonists. Positions lerped per frame from pre-allocated arrays. Labels distance-culled, capped at 20.
- **Animals:** One MultiMesh per type. Max 500 per type. Updates only on state change.
- **Buildings:** Tracked by ID from sim state. `QueueFree` when removed.
- **VFX:** Capped at 50 active particle systems. Worst case: 10,000 GPU particles.
## Memory Safety: Zero Leaks
## Memory safety: the QueueFree audit
We audited every `QueueFree()` call in the codebase --- 47 calls across 17 files. **Zero `RemoveChild()` calls without a corresponding `QueueFree()`.** Three patterns we follow everywhere:
We audited every `QueueFree()` call — 47 calls across 17 files. **Zero `RemoveChild()` calls without a corresponding `QueueFree()`.** Three patterns we follow everywhere:
**Pattern 1: Chunk streaming** --- Deactivate out-of-range chunks by iterating the active dict, calling `QueueFree()`, collecting keys to remove, then removing them after iteration. Never modify a dictionary while iterating it.
1. **Chunk streaming:** Iterate active dict, call `QueueFree()`, collect keys to remove, then remove after iteration. Never modify a dictionary while iterating.
2. **Extract from PackedScene:** Instantiate, extract mesh, `QueueFree()` the temp instance. The mesh survives because it's a Resource, not a Node.
3. **UI rebuild:** `QueueFree()` all children, build new content. Safe because `QueueFree` is deferred — new children added in same frame before old ones freed.
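
The first pattern, collect keys during iteration and remove them afterwards, can be sketched without the engine. `ChunkCleanup` and `Deactivate` are hypothetical names, and the `free` delegate stands in for calling `QueueFree()` on a chunk node.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: `free` stands in for Node.QueueFree().
public static class ChunkCleanup
{
    // Frees every active chunk not in `keep`, then removes its key.
    // Returns the number of chunks that were deactivated.
    public static int Deactivate<TChunk>(
        Dictionary<(int X, int Z), TChunk> active,
        HashSet<(int X, int Z)> keep,
        Action<TChunk> free)
    {
        var toRemove = new List<(int X, int Z)>();

        // Pass 1: free out-of-range chunks, but only record their keys.
        foreach (var pair in active)
        {
            if (keep.Contains(pair.Key)) continue;
            free(pair.Value);
            toRemove.Add(pair.Key);
        }

        // Pass 2: mutate the dictionary once iteration has finished.
        foreach (var key in toRemove)
            active.Remove(key);

        return toRemove.Count;
    }
}
```

The temporary list allocates, which is acceptable here because this runs only when the camera crosses a chunk boundary, not every frame.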
**Pattern 2: Extract data from PackedScene** --- Instantiate a scene, extract the mesh, `QueueFree()` the temporary instance. The mesh survives because it's a Resource, not a Node.
## What runs every frame (and what doesn't)
**Pattern 3: UI rebuild** --- `QueueFree()` all children, then build new content. Safe because `QueueFree` is deferred --- new children are added in the same frame before old ones are freed.
`_Process()` is strictly limited:
## What Runs Every Frame
- Vegetation grid: camera chunk check (0.5s throttle, early-exit if same chunk)
- Terrain manager: poll async texture loads
- Crowd renderer: lerp 2,000 positions (math-only, pre-allocated arrays)
- Day/night: rotate sun
- Camera: follow + zoom
- Sim bridge: drain WebSocket message queue
We're strict about what goes in `_Process()`:
**No heap allocation in any of these.** Per-frame overhead is dominated by the crowd lerp and message queue drain.
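
A rough sketch of an allocation-free crowd lerp, assuming two arrays sized once at startup. `CrowdInterpolator` and the `speed` parameter are hypothetical, and `System.Numerics.Vector3` stands in for Godot's `Vector3` so the example runs outside the engine.

```csharp
using System;
using System.Numerics;

// Hypothetical sketch: System.Numerics.Vector3 stands in for Godot's Vector3.
public sealed class CrowdInterpolator
{
    private readonly Vector3[] _current; // rendered positions
    private readonly Vector3[] _target;  // latest positions from the simulation

    public int Count { get; private set; }

    // Both arrays are allocated once; 2,000 matches the crowd cap in the post.
    public CrowdInterpolator(int capacity = 2000)
    {
        _current = new Vector3[capacity];
        _target = new Vector3[capacity];
    }

    // Called when the simulation pushes new state, not per frame.
    public void SetTarget(int index, Vector3 position)
    {
        _target[index] = position;
        if (index >= Count)
        {
            _current[index] = position; // first sighting: snap, do not lerp
            Count = index + 1;
        }
    }

    // Called every frame: arithmetic over existing arrays, no allocation.
    public void Step(float deltaSeconds, float speed = 10f)
    {
        float t = Math.Clamp(deltaSeconds * speed, 0f, 1f);
        for (int i = 0; i < Count; i++)
            _current[i] = Vector3.Lerp(_current[i], _target[i], t);
    }

    public Vector3 Position(int index) => _current[index];
}
```

In the real renderer the interpolated positions would then be written into the crowd MultiMesh as per-instance transforms.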
- **Vegetation grid:** Camera chunk check (0.5s throttle, early-exits if same chunk)
- **Terrain manager:** Poll async texture loads (loop pending list, check status)
- **Crowd renderer:** Lerp 2000 colonist positions (math-only, pre-allocated arrays)
- **Day/night:** Rotate sun, adjust light energy
- **Camera:** Follow + zoom smoothing
- **Sim bridge:** Drain WebSocket message queue
Two shaders to watch: the ocean shader (4 Gerstner waves, depth reconstruction, caustics, foam — heaviest thing in the pipeline) and the wind sway shader (6 trig ops per vertex on every vegetation mesh within 400m). Future optimization: disable sway on distant chunks.
No heap allocation in any of these. Total per-frame overhead is dominated by the crowd lerp and the message queue drain.
## Target: RTX 3060, 8GB VRAM
## Shaders We Watch
- Main island + full vegetation < 4GB VRAM → ship it
- Approaching 6-8GB → implement lazy terrain nodes + cache eviction
- Exceeding 8GB → implement vegetation LOD and region-level streaming
Two custom shaders are performance-sensitive:
**Ocean shader** --- 4 Gerstner wave calculations in the vertex stage, applied to a 12,000m plane. Fragment stage does depth reconstruction, caustics, foam masking, and two normal map lookups. It's the heaviest thing in the render pipeline. We pre-warm it during the loading screen to avoid shader compilation stutter.
**Wind sway shader** --- 6 trig ops per vertex on every vegetation mesh within 400m. The sway is invisible beyond 100m but the shader runs at full cost regardless. Future optimisation: disable sway on distant chunks.
## The Target: RTX 3060
Our early access target is an RTX 3060 with 8GB VRAM:
- Main island + full vegetation < 4GB VRAM --- ship it, we have headroom
- Approaching 6--8GB --- implement lazy terrain nodes + cache eviction
- Exceeding 8GB --- implement vegetation LOD and region-level streaming
**Always measure before optimising.** We added VRAM logging before writing a single line of optimisation code. Half the "problems" we expected turned out to be non-issues. The other half were worse than expected. Profiling isn't optional.
**Always measure before optimizing.** We added VRAM logging before writing a single line of optimization code. Half the "problems" we expected were non-issues. The other half were worse than expected. Profiling isn't optional.
---
Godot 4 can handle open worlds at this scale, but it won't do it for you. You need to build streaming, manage your own caches, audit your resource loading, and be disciplined about what runs per frame. The engine gives you the primitives --- MultiMesh, `LoadThreadedRequest`, `QueueFree` --- and it's up to you to wire them into a system that scales.
Godot 4 can handle open worlds at this scale, but it won't do it for you. You need to build streaming, manage your own caches, audit resource loading, and be disciplined about what runs per frame. The engine gives you the primitives — MultiMesh, `LoadThreadedRequest`, `QueueFree`. It's up to you to wire them into a system that scales.
We're building with these systems and developing the game using [Tinqs Studio](https://tinqs.com). If you're building something large-scale in Godot, we hope this is useful.
*We're building [Ariki](https://arikigame.com), a survival colony sim, with these systems. The tools we use — git hosting, AI agents, creative pipelines — are part of [Tinqs Studio](https://tinqs.com).*