diff --git a/agent-harness.html b/agent-harness.html
index ea6bca6..711d43d 100644
--- a/agent-harness.html
+++ b/agent-harness.html
@@ -262,43 +262,74 @@ ← All Posts 25 May 2026

What an Agent Harness Is and Why Game Dev Needs One

-

Open Claude or ChatGPT right now and ask it to review your last PR. It'll say "I don't have access to your repository." Ask it to take a screenshot of your game. It'll say "I can't interact with your operating system." Ask it what you were working on yesterday. It'll say "I don't have memory of previous conversations."

+

Open Claude or ChatGPT right now and ask it to review your last PR. It'll say "I don't have access to your repository." Ask it to take a screenshot of your game. It'll say "I can't interact with your operating system." Ask it what you were working on yesterday. It'll say "I don't have memory of previous conversations."
+
+A raw AI model is a brain without hands, eyes, or memory. An agent harness is the layer that gives it all three — plus identity, tools, and guardrails. And game development needs one that understands binary assets, visual pipelines, and spatial systems.
+
+## What a harness provides
+
+Every agent harness, regardless of domain, needs five things:
+
+Identity. Who the agent is, what it values, how it should behave. Not "you are a helpful assistant" — that's generic and unmoored. A soul file that says "you're working on Ariki, a survival colony sim. The team is four people. Never push to main without review. Prefer existing conventions." Identity creates consistency across sessions.
+
+Memory. What happened last session. What decisions were made. What failed and why. Without memory, every conversation is a cold start — "let me explain the project..." Memory stored as markdown in git means it's version-controlled, diffable, and human-readable. When something goes wrong, you git log instead of debugging a vector database.
+
+Tools. What the agent can actually do beyond generating text. A CLI that takes screenshots, checks service health, and loads project context. API wrappers for git, CI, image generation. Without tools, the agent is a very articulate oracle that can't touch anything.
+
+Context. Which project this is. Who's asking. What machine they're on. What services are reachable. A single CLI call — tinqs identity — returns all of this in 100ms. No re-reading the README. No "what repo are we in?"
+
+Guardrails. What the agent must never do. No merging to main without review. No pushing to public repos without approval. No running destructive commands. The harness enforces these at the platform layer, not in the prompt. Prompts can be ignored. Platform gates cannot.
+
+## Why generic harnesses fail for game dev
+
+LangChain, CrewAI, and AutoGen are built for web apps. They assume text-in, text-out. Game development is different in ways that break those assumptions:
+
+Assets are binary. A web PR is a text diff. A game PR is a 150MB GLB file with textures, rigging, and animations. You can't review it without seeing it. Our harness renders 3D models in the browser during code review — rotate, zoom, check materials. The artist pushes, the lead inspects, no downloads required.
+
+The pipeline is visual. Concept art → 3D model → rigged character → in-engine asset. Each step uses different tools. The harness needs to orchestrate image generators, 3D modellers, auto-riggers, and game engines as a single workflow — not as five separate API calls the human has to stitch together.
+
+Scale is physical. A web app's complexity is in business logic. A game's complexity is in geometry — 12km worlds, 155 vegetation types, 2,000 crowd instances. The agent needs to understand spatial systems, GPU memory budgets, and frame timing. "Add more RAM" isn't an answer when you have 8GB of VRAM.
+
+The team is small and cross-functional. Four people. No dedicated DevOps, no dedicated artist, no dedicated PM. The harness fills all those gaps, not just one.
+
+## The toolchain that makes it work
+
+Our harness runs on Tinqs Studio, built on a Gitea fork with game-specific features. The key pieces:
+
+The CLI — a single Go binary. One command (tinqs identity) gives the agent full project context in 100ms. Screenshots, cloud vision, health checks — all subcommands of the same binary.
+
+The soul file — a markdown document in the repo root. The agent reads it on session start. It defines values, scope, and behavioural rules. The same soul file works in Cursor, Claude Code, or any tool that reads markdown.
+
+Skills — markdown playbooks for specific workflows. Image generation, concept art pipeline, 3D model creation, video generation. Each skill is a procedure the agent follows. Write once, use forever.
+
+3D preview — click a .glb file in a PR and rotate the model in your browser. 22 formats supported. This alone transformed our review process — nobody approves a binary diff blind anymore.
+
+Guardrails — agents can file issues, draft announcements, generate assets, and write code. They cannot merge, deploy, or push to public repos without human approval. Branch protection rules enforced at the git platform layer.
+
+## The cold-start problem, solved
+
+Every AI agent session starts blank. Most teams solve this with long system prompts — but when your context is 200 markdown files, 15 skills, and 3 years of project history, you can't paste all of that.
+
+The harness uses staged loading:
+
+1. CLI identity call (100ms) — soul file, company context, machine info, service status
+2. Memory file (instant) — cross-session context from the docs repo
+3. Skills (on demand) — loaded only when the task matches a skill name
+4. Repo context (on demand) — files read as needed, not all upfront
+
+Agent goes from cold to fully contextual in under a second. No "let me explain the project." No re-reading onboarding docs. Just start working.
+
+## The bet
+
+The gap between "I have an AI model" and "I have an AI team member" is infrastructure. Identity, memory, tools, context, guardrails. For game development, that infrastructure needs to understand binary assets, visual pipelines, and spatial systems.
+
+We're betting that specialised harnesses beat generic ones. A harness built for game dev — with 3D preview, LFS management, and creative pipelines — will outperform a general-purpose agent framework on game dev tasks. Not because the AI is smarter, but because it has the right hands, eyes, and memory for the job.
+
+—
+
+Tinqs Studio is an agent harness for game development — git hosting, AI agents, creative pipelines. Open for teams. We're building Ariki with the same tools.

-

A raw AI model is a brain without hands, eyes, or memory. An agent harness is the layer that gives it all three — plus identity, tools, and guardrails. And game development needs one that understands binary assets, visual pipelines, and spatial systems.

-

What a harness provides

-

Every agent harness, regardless of domain, needs five things:

-

Identity. Who the agent is, what it values, how it should behave. Not "you are a helpful assistant" — that's generic and unmoored. A soul file that says "you're working on Ariki, a survival colony sim. The team is four people. Never push to main without review. Prefer existing conventions." Identity creates consistency across sessions.

-

Memory. What happened last session. What decisions were made. What failed and why. Without memory, every conversation is a cold start — "let me explain the project..." Memory stored as markdown in git means it's version-controlled, diffable, and human-readable. When something goes wrong, you git log instead of debugging a vector database.

-

Tools. What the agent can actually do beyond generating text. A CLI that takes screenshots, checks service health, and loads project context. API wrappers for git, CI, image generation. Without tools, the agent is a very articulate oracle that can't touch anything.

-

Context. Which project this is. Who's asking. What machine they're on. What services are reachable. A single CLI call — tinqs identity — returns all of this in 100ms. No re-reading the README. No "what repo are we in?"

-

Guardrails. What the agent must never do. No merging to main without review. No pushing to public repos without approval. No running destructive commands. The harness enforces these at the platform layer, not in the prompt. Prompts can be ignored. Platform gates cannot.

-

Why generic harnesses fail for game dev

-

LangChain, CrewAI, and AutoGen are built for web apps. They assume text-in, text-out. Game development is different in ways that break those assumptions:

-

Assets are binary. A web PR is a text diff. A game PR is a 150MB GLB file with textures, rigging, and animations. You can't review it without seeing it. Our harness renders 3D models in the browser during code review — rotate, zoom, check materials. The artist pushes, the lead inspects, no downloads required.

-

The pipeline is visual. Concept art → 3D model → rigged character → in-engine asset. Each step uses different tools. The harness needs to orchestrate image generators, 3D modellers, auto-riggers, and game engines as a single workflow — not as five separate API calls the human has to stitch together.

-

Scale is physical. A web app's complexity is in business logic. A game's complexity is in geometry — 12km worlds, 155 vegetation types, 2,000 crowd instances. The agent needs to understand spatial systems, GPU memory budgets, and frame timing. "Add more RAM" isn't an answer when you have 8GB of VRAM.

-

The team is small and cross-functional. Four people. No dedicated DevOps, no dedicated artist, no dedicated PM. The harness fills all those gaps, not just one.

-

The toolchain that makes it work

-

Our harness runs on Tinqs Studio, built on a Gitea fork with game-specific features. The key pieces:

-

The CLI — a single Go binary. One command (tinqs identity) gives the agent full project context in 100ms. Screenshots, cloud vision, health checks — all subcommands of the same binary.

-

The soul file — a markdown document in the repo root. The agent reads it on session start. It defines values, scope, and behavioural rules. The same soul file works in Cursor, Claude Code, or any tool that reads markdown.

-

Skills — markdown playbooks for specific workflows. Image generation, concept art pipeline, 3D model creation, video generation. Each skill is a procedure the agent follows. Write once, use forever.

-

3D preview — click a .glb file in a PR and rotate the model in your browser. 22 formats supported. This alone transformed our review process — nobody approves a binary diff blind anymore.

-

Guardrails — agents can file issues, draft announcements, generate assets, and write code. They cannot merge, deploy, or push to public repos without human approval. Branch protection rules enforced at the git platform layer.

-

The cold-start problem, solved

-

Every AI agent session starts blank. Most teams solve this with long system prompts — but when your context is 200 markdown files, 15 skills, and 3 years of project history, you can't paste all of that.

-

The harness uses staged loading:

-

1. CLI identity call (100ms) — soul file, company context, machine info, service status

-

2. Memory file (instant) — cross-session context from the docs repo

-

3. Skills (on demand) — loaded only when the task matches a skill name

-

4. Repo context (on demand) — files read as needed, not all upfront

-

Agent goes from cold to fully contextual in under a second. No "let me explain the project." No re-reading onboarding docs. Just start working.
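The staged loading described above can be sketched roughly as follows. This is a minimal illustration, not the harness's actual code: `createContext` and the three loader functions are hypothetical names, and the loaders are stubs standing in for the `tinqs identity` call, the memory file read, and the skill file reads.

```javascript
// Illustrative staged loader. The loaders passed in are hypothetical stubs;
// in the post, stage 1 is a `tinqs identity` call and stage 2 a memory file.
function createContext({ loadIdentity, loadMemory, loadSkill }) {
  const skillCache = new Map();
  return {
    // Stages 1 and 2: always loaded at session start.
    async start() {
      const [identity, memory] = await Promise.all([loadIdentity(), loadMemory()]);
      return { identity, memory };
    },
    // Stage 3: a skill is read only the first time a task asks for it.
    async skill(name) {
      if (!skillCache.has(name)) skillCache.set(name, await loadSkill(name));
      return skillCache.get(name);
    },
  };
}
```

The point of the split is that the session start cost covers only the two small, always-needed stages; everything else is paid for when a task actually needs it.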

-

The bet

-

The gap between "I have an AI model" and "I have an AI team member" is infrastructure. Identity, memory, tools, context, guardrails. For game development, that infrastructure needs to understand binary assets, visual pipelines, and spatial systems.

-

We're betting that specialised harnesses beat generic ones. A harness built for game dev — with 3D preview, LFS management, and creative pipelines — will outperform a general-purpose agent framework on game dev tasks. Not because the AI is smarter, but because it has the right hands, eyes, and memory for the job.

-
-

Tinqs Studio is an agent harness for game development — git hosting, AI agents, creative pipelines. Open for teams. We're building Ariki with the same tools.

diff --git a/agentic-workflow.html b/agentic-workflow.html
index 3271d5a..be3fa61 100644
--- a/agentic-workflow.html
+++ b/agentic-workflow.html
@@ -286,10 +286,10 @@

Agents don't just have instructions. They have skills — markdown playbooks that teach specific workflows. When someone says "generate concept art for a character," the agent reads skills/image-generation.md and follows the procedure. No prompt engineering per session. No "let me try a different prompt."

We've open-sourced several skills:

Each skill took about 30 minutes to write. After six months, our agents have 15+ skills covering art generation, competitive research, video production, and project management. Skills compound — every playbook you write makes every future session more capable.

What the agents actually do, every day

@@ -317,7 +317,7 @@

Skills compound. One skill saves 15 minutes per session. Fifteen skills save hours per day across the whole team. The investment curve is absurdly favourable — 30 minutes of writing per skill, compounding returns forever.

We're four people. With agents doing the mechanical work, we operate like forty. Not because the AI is magic — because we gave it identity, memory, and the right playbooks, and then got out of its way.


-

We're building Ariki, a survival colony sim, using the same agent workflow described here. Everything runs on Tinqs Studio — a game dev platform with built-in AI agents, git hosting, and creative pipelines.

+

We're building Ariki, a survival colony sim, using the same agent workflow described here. Everything runs on Tinqs Studio — a game dev platform with built-in AI agents, git hosting, and creative pipelines.

diff --git a/blog-visual-upgrade.html b/blog-visual-upgrade.html
index 73282e8..61abd8a 100644
--- a/blog-visual-upgrade.html
+++ b/blog-visual-upgrade.html
@@ -303,7 +303,7 @@ Done.

Build systems make CSS changes safe. Because we never hand-edit .html, every style change is tested by regenerating all pages and grepping for the new selectors. If a rule doesn't ship, you know immediately.

Two gaps we'll fill later: blockquote support in build.js (the callout CSS is waiting) and ordered lists (same story). In the meantime, the blog already looks intentional — and it took two template files, one build step, and zero dependencies.


-

The blog is generated by build.js and served by Tinqs Studio. All styling is self-contained in the templates.

+

The blog is generated by build.js and served by Tinqs Studio. All styling is self-contained in the templates.

diff --git a/build.js b/build.js
index 4c249be..483b543 100644
--- a/build.js
+++ b/build.js
@@ -144,11 +144,11 @@ function inline(s) {
   s = s.replace(/\*(.+?)\*/g, "$1");
   // Inline code
   s = s.replace(/`([^`]+)`/g, "$1");
-  // Links
-  s = s.replace(/\[([^\]]+)\]\(([^)]+)\)/g, '$1');
-  // Em dash
+  // Em dash / en dash — before links so CSS var(--x) in style attr isn't mangled
   s = s.replace(/---/g, "—");
   s = s.replace(/--/g, "–");
+  // Links
+  s = s.replace(/\[([^\]]+)\]\(([^)]+)\)/g, '$1');
   return s;
 }
diff --git a/fal-image-generation.html b/fal-image-generation.html
index a0a632c..0ed7171 100644
--- a/fal-image-generation.html
+++ b/fal-image-generation.html
@@ -335,10 +335,10 @@ No extra fingers, no merged limbs, no floating accessories.

The design context block is worth more than the rest of the prompt combined. Without it, every image is a one-off. With it, every image belongs to the same game.

Never iterate on expensive models. Schnell at $0.003/image is for exploration. Flux 2 Pro at $0.03 is for final output. The cheap model does 90% of the creative work.

Aggregation beats loyalty. No single model is best at everything. Flux for art, Ideogram for text, Recraft for design, Nano Banana for edits, BiRefNet for masks. Use the right tool for each job.

-

Let the agent handle prompting. We encode the 4-layer pattern, art style guide, and model selection rules in an agent skill file. The AI writes the full prompt, generates images, displays them, and asks for scores. The human's job is creative direction.

+

Let the agent handle prompting. We encode the 4-layer pattern, art style guide, and model selection rules in an agent skill file. The AI writes the full prompt, generates images, displays them, and asks for scores. The human's job is creative direction.

AI art isn't magic and it isn't free. But at a penny per image, with the right prompt structure and model strategy, it eliminates the most expensive bottleneck in indie game development: the gap between "I know what this should look like" and "I have an asset I can actually use."


-

Image generation is built into Tinqs Studio. We've open-sourced the prompt engineering skill and concept art pipeline skill. We're building Ariki with these tools.

+

Image generation is built into Tinqs Studio. We've open-sourced the prompt engineering skill and concept art pipeline skill. We're building Ariki with these tools.

diff --git a/flows-are-sessions.html b/flows-are-sessions.html
new file mode 100644
index 0000000..3e4294e
--- /dev/null
+++ b/flows-are-sessions.html
@@ -0,0 +1,354 @@
+Flows Are Sessions, Not Pipelines: Why We Moved Our Agent Orchestrator from YAML to JavaScript — Tinqs Blog
+ ← All Posts + 11 June 2026 +

Flows Are Sessions, Not Pipelines: Why We Moved Our Agent Orchestrator from YAML to JavaScript

+

Our YAML flow engine had seven bespoke node types just to fake a while loop. We threw it out and rewrote everything in 200 lines of JavaScript. The flow engine is gone. The flow IS the session. Here's what we learned.

+ +
+

The YAML Was a Compiler for a Language Nobody Wanted

+

The old system was a static DAG. You defined nodes and edges in YAML, the engine walked them top-to-bottom, and when it finished it was done. No mid-run interaction. No branching. No retry. If you wanted a loop, you didn't use a while statement — you used an agent-loop-decision node type. In YAML.

+
# The old way: a "loop" was a bespoke node type
+steps:
+  - agent-task:   "generate document"
+  - agent-loop-decision:
+      condition:  "check if quality > 0.8"
+      if-true:    "continue"
+      if-false:   "repeat-step: agent-task"
+

That's not configuration. That's a compiler for a language nobody wanted to write. Every orchestrator ends up here — GitHub Actions expressions, GitLab CI rules, Airflow's BranchPythonOperator, all of them start as "simple YAML config" and grow node types until they're Turing-complete nightmares held together by schema patches.

+

We had seven: agent-task, agent-loop-decision, fork, conditional, agent-join, pipeline-stage, human-review. Each one existed because YAML can't express control flow. You weren't writing a flow. You were filing paperwork to describe a flow.

+

The moment we knew it was wrong: someone asked "can I retry this step three times if it fails?" and the honest answer was "we'd need a new node type." When your config format needs an RFC to add a for loop, you've built a programming language by accident. Delete it.

+

200 Lines of JavaScript Replaced Seven Node Types

+

A flow is now an ES module. You export default async (flow) => {} and the runtime calls it. The API surface is five calls:

+ +

Here's a real flow. It audits routes for missing auth with parallel researchers and a human gate. Ten lines.

+
// .pi/flows/flows/review-routes.flow.mjs
+export const meta = { name: "review-routes", description: "audit routes for missing auth" };
+export default async (flow) => {
+  const { agent, parallel, phase, human, task } = flow;
+  phase("find");
+  const findings = await parallel(["auth", "input"].map((d) => () =>
+    agent(`Review ${d}: ${task}`, { agent: "researcher", model: "@planning" })
+  ));
+  phase("review");
+  const gate = await human({ title: "Eyeball it", prompt: "anything to fix before I finish?" });
+  return { findings, gate };
+};
+

Why JavaScript wins is boring and fundamental: while, if, and try/catch are built into the language, and the flow runtime's parallel(thunks) covers fan-out. The things our YAML needed custom schema types to fake are just language keywords or a single function call. Bounded concurrency is a one-liner. Error recovery is a try block. A loop is while (!approved). No plugin, no RFC, no new node type.
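As a rough sketch of that claim, the "retry this step three times" request that would have needed a new node type becomes a plain loop with try/catch. Both names here are illustrative and not part of the flow API: `makeFlakyAgent` is a hypothetical stand-in for `agent()` that fails a set number of times, and `retry` is a small helper written for this example.

```javascript
// Hypothetical stand-in for flow.agent(): fails `failures` times, then succeeds.
function makeFlakyAgent(failures) {
  let calls = 0;
  return async (prompt) => {
    calls += 1;
    if (calls <= failures) throw new Error(`transient failure ${calls}`);
    return `ok after ${calls} calls: ${prompt}`;
  };
}

// Illustrative helper (not part of the flow API): run a step up to `attempts` times.
async function retry(step, attempts = 3) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await step();
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError; // every attempt failed
}

// Usage: the third attempt succeeds.
const flakyAgent = makeFlakyAgent(2);
retry(() => flakyAgent("generate document")).then((result) => console.log(result));
```

Inside a real flow the same helper would wrap a call to the destructured `agent`, with no change to the runtime.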

+

We migrated all flows YAML-to-JS on 2026-06-10. One-way conversion script, every flow reviewed and running in 24 hours. The YAML parser was deleted — not deprecated, not kept for backwards compatibility, deleted. There's no config path left to reach for.

+

The plan lives in code, not config. Config is for things that don't change. Agent orchestration changes every run.

+

A Flow IS a Session

+

The old model had a phantom problem. A flow was a card. A session was a separate card. The operator watched the flow run from a different window. There was a "New Flow" button that created a flow card, and a "Continue" button that attached a session to it, and the disconnect between "the thing running your work" and "the place you talk to it" was baked into the UI.

+

We killed that architecture. Every spawn is a session.

+

When you call POST /api/flows/spawn {cwd, task, flowName}, the session runs the flow inside itself with the flow_run tool. Steps stream inline into the chat — flow:steps injects progress into the session's own message stream. The session turns purple in the dashboard. It becomes the flow. One card. One identity. Persistent after the run finishes.
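A minimal sketch of building that spawn request, assuming the endpoint takes a JSON body. Only the path and the three field names come from the text; `buildSpawnRequest`, the base URL, and the example paths are hypothetical, and the response shape is not shown because the post does not describe it.

```javascript
// Builds the request for POST /api/flows/spawn.
// Assumption: the endpoint accepts a JSON body with these fields.
function buildSpawnRequest(baseUrl, { cwd, task, flowName }) {
  const body = { cwd, task };
  if (flowName) body.flowName = flowName; // omit for an interactive operator session
  return {
    url: new URL("/api/flows/spawn", baseUrl).toString(),
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Usage (hypothetical host and working directory):
const spawn = buildSpawnRequest("http://localhost:3000", {
  cwd: "/work/game-repo",
  task: "audit routes for missing auth",
  flowName: "review-routes",
});
// await fetch(spawn.url, spawn.init);
```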

+

No "New Flow" button. No "Continue" button. You spawn a session; with a flowName it runs that flow, without one it opens an interactive operator session that designs a flow with you first. The dashboard branding is tinqs Studio. The control surface is the host card, live agent cards, run history, and a chat to steer the run.

+

There's no phantom card. No disconnect between "the thing running your work" and "the place you talk to it." It's one session. You're in the room.

+

The Human Gate: Pause, Take Over, Approve, Continue

+

Agents make mistakes. They guess when they shouldn't. They take irreversible actions because the prompt said "proceed." The model gets stuck and you sit there wondering — do I wait or abort?

+

flow.human() is our answer.

+

When a flow hits a human gate, it stops. The dashboard shows the gate prompt: what to review, what to decide. The host session switches to takeover mode — coding tools are unblocked (normally the host is hands-off) and the system prompt becomes "flow is paused, help the operator finish this." You open files. You edit. You verify. You run commands.

+

To release the gate, reply approve or done or lgtm. The flow resumes. Any other message is a work instruction — the takeover session executes it but does not release the gate. The flow loops on notes until the human says go.

+
let approved = false;
+while (!approved) {
+  const { notes } = await human({
+    title: "Review before push",
+    prompt: "Check the diff. Approve or tell me what to fix.",
+  });
+  if (notes.match(/^(approve|done|lgtm|looks good)/i)) approved = true;
+  else await agent(`Fix: ${notes}`, { agent: "implementer" });
+}
+

Two patterns this enables. First: review-approve-before-push gates — nobody ships untested code because nobody set the auto-approve flag to true. Second: the "agent is stuck" hand-off — the flow pauses, you take over the exact same workspace, fix the problem, reply done, and the flow keeps going.

+

The flow waits instead of guessing. This isn't a feature. It's an admission that some decisions shouldn't be automated.

+

The Operator Is Your Co-Pilot

+

The old way was a one-shot generator. Paste an objective, click Generate, get a YAML blob, pray it's right, run it, discover it's wrong 20 minutes in with no way to steer. We'd watch flows fail and think "I could have told it that before it started."

+

The new flow operator doesn't write the flow for you and walk away. It designs it with you.

+

Spawn a session without a flowName. A DeepSeek session opens — the same model that powers the dashboard, cheap at $0.28/MTok, steerable in natural language. It proposes a draft .flow.mjs, shows you the agents and phases and any human gate, and explains why. You tell it what to change. It does NOT launch until you say go. When you approve, it writes the flow file and spawns a separate host session, then attaches to monitor and report progress.

+

The operator is still in the chat when the human gate fires. It's still there when you want to change the plan mid-run. It doesn't go away after launch. Co-pilot, not autopilot.

+

There are three runner faces for the same engine: pi/dashboard (DeepSeek, cheap, steerable — the default), Claude Code (Workflow tool, one-shot fan-out for heavy research), and a cloud agent (remote deploy, clone, AWS). Pick by granularity and cost. The flow file is the same in all three modes.

+

What We Learned

+

Numbers first. 43 out of 43 unit tests green. All flows migrated. The supervisor inbox — steering messages sent between steps — was silently dropping operator messages before 2026-06-10. You'd type "focus on the auth routes" and the flow never saw it. That's fixed now. Chat reaches the inbox, the inbox drains between agent() calls, and steering works.

+

The inbox rule. The supervisor inbox is applied between agent() calls, never mid-step. Steering mid-step is undefined behaviour. We learned this the hard way — early versions tried to inject mid-agent and got corrupted state, partial outputs, agents that forgot what they were doing. Between steps is the right boundary. Respect it.
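The rule can be modelled in a few lines. This is an illustrative sketch, not the runtime's implementation: `makeSupervisor` and `runStep` are hypothetical names, and the choice to append operator notes to the next prompt is an assumption made for the example.

```javascript
// Illustrative model of the inbox rule: steering is buffered while a step
// runs and applied only at the boundary before the next agent() call.
function makeSupervisor(runStep) {
  const inbox = [];
  return {
    // Chat may call this at any time, including mid-step.
    steer(message) {
      inbox.push(message);
    },
    async agent(prompt) {
      const notes = inbox.splice(0); // drain at the step boundary
      const full = notes.length
        ? `${prompt}\nOperator notes: ${notes.join("; ")}`
        : prompt;
      return runStep(full); // the running step never sees later messages
    },
  };
}
```

A message sent while step 1 is running sits in the queue and only shapes step 2, which is the boundary the paragraph above describes.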

+

Economics matter. DeepSeek V4 Pro at $0.28/MTok runs per-step. Per-step model override lets you swap in a premium reasoning model ($15/MTok) for the one critical call that needs it. $0.28 for routine, $15 for the hard parts. The three-tier strategy from our agent daemon applies at flow granularity too.
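To make the arithmetic concrete, here is a small estimate using the two prices quoted above. The step list and token counts are made up for illustration, and the prices are treated as a single blended rate per million tokens, which is a simplifying assumption.

```javascript
// Prices from the text, in dollars per million tokens.
const PRICE_PER_MTOK = { routine: 0.28, premium: 15 };

// Each step is { tier, tokens }. Token counts below are hypothetical.
function estimateCost(steps) {
  return steps.reduce(
    (sum, step) => sum + (step.tokens / 1e6) * PRICE_PER_MTOK[step.tier],
    0
  );
}

// Ten steps of 100k tokens each: nine routine, one premium.
const mixed = [
  ...Array.from({ length: 9 }, () => ({ tier: "routine", tokens: 100000 })),
  { tier: "premium", tokens: 100000 },
];
const allPremium = mixed.map((step) => ({ ...step, tier: "premium" }));

console.log(estimateCost(mixed).toFixed(2));      // "1.75"
console.log(estimateCost(allPremium).toFixed(2)); // "15.00"
```

Under these assumed numbers, routing only the one critical call to the premium model costs roughly a ninth of running every step on it.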

+

What's next. Richer on-card flow display — a pinned step strip so you can see progress without opening the session. Attachable asset and agent-structure viewers in the flow card. Run replay for finished sessions after a page reload (the session persists, but you can't rewatch the stream yet).

+

But the principle is settled. A flow isn't a pipeline. A pipeline runs blind and reports back later. A flow is a pair-programming session where one of the pair happens to be code.

+
+

Tinqs Studio is our agent-native development platform — git hosting, AI agents, and the flow engine described here. Ariki is the survival colony sim we're building with it.

+ +
+ +
+ + +
+
+
+
+
diff --git a/fork-dont-build.html b/fork-dont-build.html
index ce7a743..81bc66c 100644
--- a/fork-dont-build.html
+++ b/fork-dont-build.html
@@ -276,7 +276,7 @@

Across three forks, we've never touched more than 0.5% of upstream code. If your fork hits 1%, you're doing too much — either the upstream tool is wrong for the job, or you're not trusting it enough.

Fork 1: Gitea → Tinqs Studio

Gitea is a self-hosted git server. Single Go binary, MIT license, 45k GitHub stars. We used GitHub for two years. It was fine for docs. For the game repo — 12GB in LFS, growing weekly — it was untenable. LFS bandwidth limits, slow clones, $5/50GB pricing. And nobody on the team could see what changed. A PR modifying a .glb file showed a binary diff. No preview. The artist pushed, the developer approved blindly, and three days later someone noticed the normals were inverted.

-

We forked Gitea and built Tinqs Studio. Our changes:

+

We forked Gitea and built Tinqs Studio. Our changes:

3D asset preview. Click a .glb, .gltf, or .fbx file in a PR and rotate the model in your browser. 22 formats supported via O3DV. This alone transformed our review process — the artist pushes, the lead inspects, nobody downloads anything.

HTML file preview. Sandboxed iframe rendering. Our internal docs and game design pages look like websites, not raw source.

Agent API. Six REST endpoints that let AI agents submit tasks, push code, check CI status, and open PRs. Agents are first-class users of the git platform, not bolt-on tools.

@@ -285,7 +285,7 @@

Total lines changed: about 2,000 out of Gitea's 500,000. We modify templates, add Go modules, tweak CSS. We never touch the database schema — upstream owns that, and we ride their migrations.

The alternative was building a git platform from scratch. Multi-year project, multi-million dollar budget. Or using GitHub/GitLab and accepting their limitations. Neither gives you the ability to embed agents directly into the platform.

Fork 2: Pi → Agent Runtime with Game Tools

-

Pi is an open-source coding agent by Mario Zechner. MIT license, TypeScript, minimal by design — four core tools (read, write, edit, bash) and an extension system. 51k stars.

+

Pi is an open-source coding agent by Mario Zechner. MIT license, TypeScript, minimal by design — four core tools (read, write, edit, bash) and an extension system. 51k stars.

We forked it and added four extensions, each a single TypeScript file: