title: Flows Are Sessions, Not Pipelines: Why We Moved Our Agent Orchestrator from YAML to JavaScript
slug: flows-are-sessions
date: 2026-06-11
description: We killed the static YAML DAG and rewrote our agent orchestration in 200 lines of JavaScript. Now a flow IS a session: you chat it, steer it, and it pauses for you at a human gate.
og_description: YAML DAGs are dead. We rewrote our agent orchestration in JavaScript, made every flow a live session, and added a human-in-the-loop gate. The operator is the co-pilot, not the babysitter.
og_image: https://www.tinqs.com/img/og-cover.jpg
excerpt: We killed the static YAML DAG and rewrote our agent orchestration in 200 lines of JavaScript. Now a flow IS a session: you chat it, steer it, and it pauses for you at a human gate.
author: Ozan Bozkurt
author_initials: OB
author_role: CTO & Developer, Tinqs

Our YAML flow engine had seven bespoke node types just to fake a while loop. We threw it out and rewrote everything in 200 lines of JavaScript. The flow engine is gone. The flow IS the session. Here's what we learned.

The YAML Was a Compiler for a Language Nobody Wanted

The old system was a static DAG. You defined nodes and edges in YAML, the engine walked them top-to-bottom, and when it finished it was done. No mid-run interaction, and no branching or retry without a dedicated node type. If you wanted a loop, you didn't use a while statement. You used an agent-loop-decision node type. In YAML.

# The old way: a "loop" was a bespoke node type
steps:
  - agent-task:   "generate document"
  - agent-loop-decision:
      condition:  "check if quality > 0.8"
      if-true:    "continue"
      if-false:   "repeat-step: agent-task"

That's not configuration. That's a compiler for a language nobody wanted to write. Most orchestrators end up here. GitHub Actions expressions and GitLab CI rules both started as "simple YAML config" and grew their own conditional syntax, and even Airflow, whose DAGs are written in Python, needs a dedicated BranchPythonOperator to branch. They keep adding node types until they're Turing-complete nightmares held together by schema patches.

We had seven: agent-task, agent-loop-decision, fork, conditional, agent-join, pipeline-stage, human-review. Each one existed because YAML can't express control flow. You weren't writing a flow. You were filing paperwork to describe a flow.

The moment we knew it was wrong: someone asked "can I retry this step three times if it fails?" and the honest answer was "we'd need a new node type." When your config format needs an RFC to add a for loop, you've built a programming language by accident. Delete it.

200 Lines of JavaScript Replaced Seven Node Types

A flow is now an ES module. You export default async (flow) => {} and the runtime calls it. The API surface is five calls:

agent(prompt, options) — run one agent with a task
parallel(thunks) — run many agents concurrently, await all
pipeline(items, ...stages) — push items through stages
phase(name) — label progress for the dashboard
human(config) — pause and wait for a person
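
To make the shape of those five calls concrete, here is a minimal stub of a flow object, the kind of thing you could use to dry-run a flow in a unit test. It is a sketch, not our runtime: makeStubFlow, the scripted replies array, the log field, and the exact semantics of pipeline (every item passes through the stages in order, items run concurrently) are assumptions made for illustration.

```javascript
// Hypothetical stub of the flow API for dry runs. Not the real runtime.
function makeStubFlow({ task = "", replies = [] } = {}) {
  const log = []; // records calls so a test can inspect what the flow did

  // agent: the real call runs a model; the stub just echoes the prompt.
  const agent = async (prompt, options = {}) => {
    log.push({ type: "agent", prompt, options });
    return `stub:${prompt}`;
  };

  // parallel: takes thunks (functions returning promises), starts all, awaits all.
  const parallel = (thunks) => Promise.all(thunks.map((thunk) => thunk()));

  // pipeline (assumed semantics): each item flows through the stages in order.
  const pipeline = (items, ...stages) =>
    Promise.all(
      items.map(async (item) => {
        let value = item;
        for (const stage of stages) value = await stage(value);
        return value;
      })
    );

  // phase: label progress; the stub only records the label.
  const phase = (name) => {
    log.push({ type: "phase", name });
  };

  // human: the real call pauses for a person; the stub pops a scripted reply.
  const human = async (config) => {
    log.push({ type: "human", title: config.title });
    return { notes: replies.shift() ?? "approve" };
  };

  return { agent, parallel, pipeline, phase, human, task, log };
}
```

A flow module's default export can then be called directly, for example `await reviewRoutes(makeStubFlow({ task: "audit /api", replies: ["lgtm"] }))`, and the test asserts on the returned value and on `log`.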

Here's a real flow. It audits routes for missing auth with parallel researchers and a human gate. Ten lines.

// .pi/flows/flows/review-routes.flow.mjs
export const meta = { name: "review-routes", description: "audit routes for missing auth" };
export default async (flow) => {
  const { agent, parallel, phase, human, task } = flow;
  phase("find");
  const findings = await parallel(["auth", "input"].map((d) => () =>
    agent(`Review ${d}: ${task}`, { agent: "researcher", model: "@planning" })
  ));
  phase("review");
  const gate = await human({ title: "Eyeball it", prompt: "anything to fix before I finish?" });
  return { findings, gate };
};

Why JavaScript wins is boring and fundamental: it has while, if, try/catch, and async/await built in, so parallel(thunks) is an ordinary function rather than a node type. The things our YAML needed custom schema types to fake are just language keywords. Bounded concurrency is a short helper. Error recovery is a try block. A loop is while (!approved). No plugin, no RFC, no new node type.
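
The "retry this step three times" request that would have needed a new node type is a few lines of plain JavaScript. A sketch, assuming agent() rejects when a step fails. withRetry and mapLimit are illustrative helpers, not part of the flow API.

```javascript
// Hypothetical helpers; neither is part of the flow API described above.

// Retry an async step up to `attempts` times, rethrowing the last error.
async function withRetry(step, attempts = 3) {
  let lastError;
  for (let i = 1; i <= attempts; i++) {
    try {
      return await step(i); // pass the attempt number in case the step wants it
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError;
}

// Bounded concurrency: run thunks with at most `limit` in flight at once.
async function mapLimit(thunks, limit) {
  const results = new Array(thunks.length);
  let next = 0;
  const worker = async () => {
    while (next < thunks.length) {
      const index = next++; // claim the next thunk before awaiting anything
      results[index] = await thunks[index]();
    }
  };
  const workers = Array.from({ length: Math.min(limit, thunks.length) }, worker);
  await Promise.all(workers);
  return results; // same order as the input thunks
}
```

Inside a flow this reads as `await withRetry(() => agent("Fix lint", { agent: "implementer" }))`, or `await mapLimit(thunks, 2)` in place of `parallel(thunks)` when only two agents should run at a time.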

We migrated all flows YAML-to-JS on 2026-06-10. One-way conversion script, every flow reviewed and running in 24 hours. The YAML parser was deleted — not deprecated, not kept for backwards compatibility, deleted. There's no config path left to reach for.

The plan lives in code, not config. Config is for things that don't change. Agent orchestration changes every run.

A Flow IS a Session

The old model had a phantom problem. A flow was a card. A session was a separate card. The operator watched the flow run from a different window. There was a "New Flow" button that created a flow card, and a "Continue" button that attached a session to it, and the disconnect between "the thing running your work" and "the place you talk to it" was baked into the UI.

We killed that architecture. Every spawn is a session.

When you call POST /api/flows/spawn {cwd, task, flowName}, the session runs the flow inside itself with the flow_run tool. Steps stream inline into the chat — flow:steps injects progress into the session's own message stream. The session turns purple in the dashboard. It becomes the flow. One card. One identity. Persistent after the run finishes.
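
A spawn call might look like the sketch below. Only the path and the three body fields come from the text above. The base URL, the JSON encoding, the injectable fetchImpl parameter, and the assumption that the response body is JSON are guesses for illustration.

```javascript
// Sketch of a client for POST /api/flows/spawn.
// Assumptions: JSON request and response bodies; baseUrl is a placeholder.
async function spawnFlow(
  { cwd, task, flowName },
  { baseUrl = "http://localhost:3000", fetchImpl = fetch } = {}
) {
  const body = { cwd, task };
  if (flowName) body.flowName = flowName; // leave out to spawn without a named flow

  const response = await fetchImpl(`${baseUrl}/api/flows/spawn`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  if (!response.ok) throw new Error(`spawn failed: HTTP ${response.status}`);
  return response.json();
}
```

Passing fetchImpl as a parameter keeps the function testable without a running dashboard: a test hands in a fake that records the request.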

No "New Flow" button. No "Continue" button. You spawn a session; with a flowName it runs that flow, without one it opens an interactive operator session that designs a flow with you first. The dashboard branding is tinqs Studio. The control surface is the host card, live agent cards, run history, and a chat to steer the run.

There's no phantom card. No disconnect between "the thing running your work" and "the place you talk to it." It's one session. You're in the room.

The Human Gate: Pause, Take Over, Approve, Continue

Agents make mistakes. They guess when they shouldn't. They take irreversible actions because the prompt said "proceed." The model gets stuck and you sit there wondering — do I wait or abort?

flow.human() is our answer.

When a flow hits a human gate, it stops. The dashboard shows the gate prompt: what to review, what to decide. The host session switches to takeover mode — coding tools are unblocked (normally the host is hands-off) and the system prompt becomes "flow is paused, help the operator finish this." You open files. You edit. You verify. You run commands.

To release the gate, reply approve or done or lgtm. The flow resumes. Any other message is a work instruction — the takeover session executes it but does not release the gate. The flow loops on notes until the human says go.

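// Loop on the gate until the operator approves; any other reply is a fix request.
let approved = false;
while (!approved) {
  const { notes } = await human({
    title: "Review before push",
    prompt: "Check the diff. Approve or tell me what to fix.",
  });
  // Require the whole reply to be a release word, so a note such as
  // "done with auth, now fix input" is treated as a fix request, not approval.
  if (/^\s*(approve|done|lgtm|looks good)\s*[.!]?\s*$/i.test(notes ?? "")) approved = true;
  else await agent(`Fix: ${notes}`, { agent: "implementer" });
}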
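
The release rule itself can be sketched as a small gate object: the flow side waits on a promise, and every operator message either releases it or is recorded as a work instruction. createGate, the RELEASE pattern, and the return values are hypothetical, and matching the whole message rather than a prefix is an assumption.

```javascript
// Hypothetical sketch of the gate rule: a release word resumes the flow,
// anything else is a work instruction and the gate stays closed.
const RELEASE = /^\s*(approve|done|lgtm)\s*[.!]?\s*$/i;

function createGate() {
  let release;
  const released = new Promise((resolve) => {
    release = resolve;
  });
  const instructions = [];

  return {
    // The flow side awaits this; it settles only when the operator releases.
    wait: () => released,

    // The chat side calls this for every operator message.
    reply(message) {
      if (RELEASE.test(message)) {
        release({ notes: message.trim(), instructions });
        return "released";
      }
      instructions.push(message); // left for the takeover session to execute
      return "instruction";
    },
  };
}
```

With this shape, `gate.reply("fix the typo in README")` returns "instruction" and leaves the flow paused, while `gate.reply("lgtm")` returns "released" and lets `await gate.wait()` continue.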

Two patterns this enables. First: review-approve-before-push gates. Code ships because a person explicitly approved it, not because someone set an auto-approve flag to true. Second: the "agent is stuck" hand-off. The flow pauses, you take over the exact same workspace, fix the problem, reply approve, and the flow keeps going.

The flow waits instead of guessing. This isn't a feature. It's an admission that some decisions shouldn't be automated.

The Operator Is Your Co-Pilot

The old way was a one-shot generator. Paste an objective, click Generate, get a YAML blob, pray it's right, run it, discover it's wrong 20 minutes in with no way to steer. We'd watch flows fail and think "I could have told it that before it started."

The new flow operator doesn't write the flow for you and walk away. It designs it with you.

Spawn a session without a flowName. A DeepSeek session opens: the same model that powers the dashboard, cheap at $0.28/MTok, steerable in natural language. It proposes a draft .flow.mjs, shows you the agents and phases and any human gate, and explains why. You tell it what to change. It does NOT launch until you say go. When you approve, it writes the flow file and spawns a separate host session, then attaches to monitor and report progress.

The operator is still in the chat when the human gate fires. It's still there when you want to change the plan mid-run. It doesn't go away after launch. Co-pilot, not autopilot.

There are three runner faces for the same engine: pi/dashboard (DeepSeek, cheap, steerable — the default), Claude Code (Workflow tool, one-shot fan-out for heavy research), and a cloud agent (remote deploy, clone, AWS). Pick by granularity and cost. The flow file is the same in all three modes.

What We Learned

Numbers first. 43 out of 43 unit tests green. All flows migrated. The supervisor inbox — steering messages sent between steps — was silently dropping operator messages before 2026-06-10. You'd type "focus on the auth routes" and the flow never saw it. That's fixed now. Chat reaches the inbox, the inbox drains between agent() calls, and steering works.

The inbox rule. The supervisor inbox is applied between agent() calls, never mid-step. Steering mid-step is undefined behaviour. We learned this the hard way — early versions tried to inject mid-agent and got corrupted state, partial outputs, agents that forgot what they were doing. Between steps is the right boundary. Respect it.
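
One way to picture the rule: steering messages can arrive at any time, but they are only read at the boundary before the next agent() call. The sketch below is hypothetical. createSupervisor, steer, and the way notes are appended to the prompt are assumptions, not the runtime's actual mechanism.

```javascript
// Hypothetical sketch of the inbox rule: messages queue up at any time but
// are applied only at the boundary before the next agent() call.
function createSupervisor(runAgent) {
  const inbox = [];

  return {
    // Called from the chat at any moment, even while a step is running.
    steer(message) {
      inbox.push(message);
    },

    // Wraps agent(): drain the inbox first, then run the step untouched.
    async agent(prompt, options = {}) {
      const notes = inbox.splice(0, inbox.length); // drain between steps
      const steered = notes.length
        ? `${prompt}\n\nOperator notes:\n- ${notes.join("\n- ")}`
        : prompt;
      return runAgent(steered, options);
    },
  };
}
```

A message sent while step one is running never touches step one. It shows up in step two's prompt, and the inbox is empty again by step three.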

Economics matter. DeepSeek V4 Pro at $0.28/MTok handles the routine steps. A per-step model override lets you swap in a premium reasoning model ($15/MTok) for the one critical call that needs it: $0.28 for routine, $15 for the hard parts. The three-tier strategy from our agent daemon applies at flow granularity too.
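
As a rough illustration of that arithmetic, the sketch below prices a ten-step flow two ways. The two prices are the ones quoted above. The token counts are invented for the example, and treating each price as a flat rate per million tokens (no input/output split) is a simplification.

```javascript
// Prices per million tokens, taken from the figures quoted above.
const PRICE_PER_MTOK = { routine: 0.28, premium: 15 };

// Cost in dollars of a list of steps, each shaped { tokens, tier }.
function flowCost(steps) {
  return steps.reduce(
    (total, { tokens, tier }) => total + (tokens / 1_000_000) * PRICE_PER_MTOK[tier],
    0
  );
}

// Invented token counts: nine routine steps and one premium reasoning call.
const steps = [
  ...Array.from({ length: 9 }, () => ({ tokens: 200_000, tier: "routine" })),
  { tokens: 100_000, tier: "premium" },
];

const mixed = flowCost(steps);
const allPremium = flowCost(steps.map((step) => ({ ...step, tier: "premium" })));
console.log(`mixed: $${mixed.toFixed(2)}, all premium: $${allPremium.toFixed(2)}`);
```

Under these made-up numbers the mixed run costs about $2.00 and the all-premium run about $28.50. In a flow, the override is the model option on a single agent() call, the same option the review-routes example sets with model: "@planning".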

What's next. Richer on-card flow display — a pinned step strip so you can see progress without opening the session. Attachable asset and agent-structure viewers in the flow card. Run replay for finished sessions after a page reload (the session persists, but you can't rewatch the stream yet).

But the principle is settled. A flow isn't a pipeline. A pipeline runs blind and reports back later. A flow is a pair-programming session where one of the pair happens to be code.

Tinqs Studio is our agent-native development platform — git hosting, AI agents, and the flow engine described here. Ariki is the survival colony sim we're building with it.
