Building a Cloud Agent Harness with DeepSeek V4 and Pi
We spent a few sessions building something that still barely exists elsewhere: a cloud agent harness where AI coding agents are first-class citizens of the platform, not bolt-on tools. The stack is a Pi fork for the brain, a Go orchestrator inside our Gitea fork for overnight work, and a browser dashboard merged into Pi for the daytime. Here is how it fits together.
The Problem
Every coding agent today — Claude Code, Codex, Pi, Aider — runs in your terminal. You watch it work. You close the laptop, it stops. There is no way to say "build these eight features overnight" and wake up to pull requests.
We wanted exactly that. Not a coding assistant. An autonomous workforce — with a UI when a human needs to be in the loop.
Why Not Just Use Claude Code or Codex?
Cost. Claude Code runs on Opus at $15/MTok output. Codex uses GPT 5.5. Running eight agents overnight on either would cost $50–200. DeepSeek V4 Flash costs $0.28/MTok output. Eight overnight tasks: about $0.80.
Control. Cloud tools are black boxes. We cannot add a Gitea API tool, a fal.ai image generator, or a guardrail that blocks aws ec2 terminate-instances. With our own harness, we add an extension and it is live.
Platform. We are building Tinqs Studio — a Gitea-based game development platform. Agents are not a feature we want to outsource. They are the product.
Pi — The Agent Brain
Pi is an open-source coding agent by Mario Zechner. MIT license, TypeScript, minimal by design — four core tools (read, write, edit, bash) and an extension system.
We forked it. Not to rewrite the core — to add first-party extensions:
- tinqs-provider — routes DeepSeek V4 Flash and Pro through our inference proxy
- tinqs-tools — Gitea REST API, fal.ai image generation, Amazon Nova Lite vision
- tinqs-ci — reads CI pipeline status, logs, and polls for completion
- tinqs-guardrail — 29 safety patterns that block dangerous operations
Each extension is a single TypeScript file. No extra npm dependencies on the extension side.
Pi has four output modes. The one that matters for automation is RPC: a headless process that exchanges JSON messages over stdin and stdout. That is how the orchestrator drives it.
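To make the RPC idea concrete, here is a minimal sketch of the framing a driver might use when talking to a headless process over pipes. It assumes one JSON object per line, which is an assumption of this sketch and not a confirmed detail of Pi's wire format. The names JsonLineBuffer and encodeMessage are hypothetical.

```typescript
// Hypothetical line framer: assumes the RPC process emits one JSON object per line.
class JsonLineBuffer {
  private pending = "";

  // Feed a raw stdout chunk; returns every complete JSON message it contained.
  push(chunk: string): unknown[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last element is an incomplete line (or "" if the chunk ended on "\n").
    this.pending = lines.pop() ?? "";
    return lines
      .map((line) => line.trim())
      .filter((line) => line.length > 0)
      .map((line) => JSON.parse(line));
  }
}

// Serialize one outgoing message for the process's stdin.
function encodeMessage(message: object): string {
  return JSON.stringify(message) + "\n";
}
```

A driver would write encodeMessage(...) to the child's stdin and pass each stdout chunk to push(). Buffering matters because a pipe can split a message across two chunks.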
DeepSeek V4 — The LLM
DeepSeek V4 Flash through our own inference proxy. OpenAI-compatible API, so Pi treats it like any other provider. The proxy adds:
- Redis job queue (10 concurrent workers)
- Per-user usage tracking
- System prompt injection for cache hit optimization
- Gitea PAT authentication (same token as git push)
Cost per task: $0.02–0.10 depending on complexity.
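As a quick sanity check, the per-task range can be scaled to a batch of eight tasks. The sketch below uses only the figures quoted in this article; the function name batchCost is hypothetical.

```typescript
// Per-task cost range quoted in the article, in US dollars.
const costPerTask = { low: 0.02, high: 0.10 };

function batchCost(tasks: number): { low: number; high: number } {
  // Round to cents to avoid floating-point noise in the products.
  const cents = (usd: number) => Math.round(usd * tasks * 100) / 100;
  return { low: cents(costPerTask.low), high: cents(costPerTask.high) };
}

const overnight = batchCost(8);
console.log(`8 tasks: $${overnight.low.toFixed(2)} to $${overnight.high.toFixed(2)}`);
// prints "8 tasks: $0.16 to $0.80"
```

The upper bound lines up with the figure of about $0.80 for eight overnight tasks given earlier.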
Go Orchestrator — Overnight Batch Work
Inside tinqs/studio we added modules/agents/ — a Go worker pool that:
- Spawns Pi with --mode rpc --no-session
- Tracks task lifecycle (pending → running → done)
- Streams events over SSE to any connected UI
- Enforces guardrails at the platform layer (worker limits, timeouts)
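The lifecycle and worker limit above can be sketched as a small state machine. This is an illustration in TypeScript, not the Go code in modules/agents/. The type and function names are hypothetical, and only the three states named above are modeled.

```typescript
type TaskState = "pending" | "running" | "done";

interface Task {
  id: string;
  state: TaskState;
}

// Allowed forward transitions; "done" is terminal.
const nextStates: Record<TaskState, TaskState[]> = {
  pending: ["running"],
  running: ["done"],
  done: [],
};

// Returns a copy of the task in the new state, or throws on an illegal move.
function advance(task: Task, to: TaskState): Task {
  if (!nextStates[task.state].includes(to)) {
    throw new Error(`illegal transition ${task.state} -> ${to} for task ${task.id}`);
  }
  return { ...task, state: to };
}

// Worker limit: a pending task may start only if a slot is free.
function canStart(running: number, maxWorkers: number): boolean {
  return running < maxWorkers;
}
```

Keeping transitions in a table means a future state (for example a failure state) is a one-line change and not a new branch in every handler.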
Six HTTP endpoints, same auth as git push:
POST /api/v1/agents/tasks — submit a task
GET /api/v1/agents/tasks — list all tasks
GET /api/v1/agents/tasks/{id} — get task details
DELETE /api/v1/agents/tasks/{id} — stop a task
GET /api/v1/agents/stream — SSE live events
GET /api/v1/agents/health — orchestrator status
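A client of the stream endpoint has to decode Server-Sent Events. The sketch below parses a complete chunk of SSE text following the standard field rules (event and data fields, comment lines, blank-line separators). The event names and JSON payloads the orchestrator actually emits are not documented here, so treat any specific event shape as an assumption.

```typescript
interface SseEvent {
  event: string; // defaults to "message" per the SSE specification
  data: string;
}

// Parses a buffer that contains only complete events.
function parseSse(text: string): SseEvent[] {
  const events: SseEvent[] = [];
  // Events are separated by a blank line.
  for (const block of text.split(/\r?\n\r?\n/)) {
    let event = "message";
    const data: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line === "" || line.startsWith(":")) continue; // skip comments and keep-alives
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
      if (field === "event") event = value;
      else if (field === "data") data.push(value); // multiple data lines are joined
    }
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return events;
}
```

A streaming client would buffer incoming text and pass only the part up to the last blank line to parseSse, keeping the remainder for the next chunk.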
We considered bolting on a separate orchestration SaaS and rejected it. The orchestrator lives in the same binary as git — same auth, no extra service to deploy.
The intended loop:
Orchestrator reads task brief
→ spawns pi --mode rpc
→ Pi writes code using DeepSeek V4
→ Pi pushes branch, calls ci_wait
→ CI green → Pi opens PR via gitea_api
→ CI red → Pi reads ci_logs, fixes, retries
→ Human reviews PR, merges
Git worktree integration and full push/PR automation are still being wired; the API and worker pool already run locally.
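The intended loop can be sketched as a bounded retry around CI. Every step is an injected function here because the real steps are Pi tool calls that are still being wired. The interface, the function names, and the maxAttempts limit are assumptions of this sketch.

```typescript
// Hypothetical hooks; in the harness these would map to Pi tool calls.
interface LoopSteps {
  writeCode(brief: string): Promise<void>;
  pushBranch(): Promise<void>;
  ciWait(): Promise<"green" | "red">;
  readCiLogs(): Promise<string>;
  fix(logs: string): Promise<void>;
  openPr(): Promise<string>; // resolves to a PR URL
}

async function runTask(
  brief: string,
  steps: LoopSteps,
  maxAttempts = 3,
): Promise<string | null> {
  await steps.writeCode(brief);
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    await steps.pushBranch();
    if ((await steps.ciWait()) === "green") {
      return steps.openPr(); // hand off to human review
    }
    // CI red: feed the logs back, unless this was the final attempt.
    if (attempt < maxAttempts) {
      await steps.fix(await steps.readCiLogs());
    }
  }
  return null; // gave up; the branch is left for a human to inspect
}
```

Bounding the retries keeps a task that can never go green from consuming a worker all night.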
Pi Dashboard — Browser UI (Shipped)
The cloud orchestrator is for batch work while you sleep. During the day you want to see agents, chat with them, and spawn sessions without living in a terminal.
We merged pi-agent-dashboard into the Pi monorepo — not as a second repo to install. One checkout, one command:
npm run dashboard:dev
Open http://localhost:33634. You get:
- Live session streaming — watch tool calls and model output in real time
- Interactive chat — send prompts, answer ask_user dialogs from the browser
- Session spawning — start Pi in any pinned project folder
- Cost tracking — per-session token usage when using Tinqs inference
- Plugins — flows, subagents, workspace helpers
The dashboard talks to Pi sessions over a WebSocket bridge on port 9999. Inference uses the same Tinqs proxy as the CLI — register a custom provider in ~/.pi/agent/providers.json and authenticate with your existing tstudio token. No separate LLM API keys.
Dashboard (localhost:33634)
↕ WebSocket (port 9999)
Pi sessions (interactive or headless)
↕ OpenAI-compatible API
Tinqs Studio proxy (tinqs.com/api/v1/ai)
↕ DeepSeek V4 Flash / Pro
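The middle hop of that diagram is an ordinary OpenAI-compatible request authenticated with the Studio token. The sketch below only builds the request and sends nothing. The https scheme, the /chat/completions path, the Bearer scheme, and the model identifier are all assumptions, not confirmed details of the Tinqs proxy.

```typescript
interface ChatRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Assumed values: none of these three constants is confirmed by the article.
const BASE_URL = "https://tinqs.com/api/v1/ai";
const CHAT_PATH = "/chat/completions"; // conventional OpenAI-compatible route
const MODEL_ID = "deepseek-v4-flash"; // hypothetical model identifier

function buildChatRequest(token: string, prompt: string): ChatRequest {
  return {
    url: BASE_URL + CHAT_PATH,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Assumption: the proxy accepts the token as a Bearer credential.
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify({
        model: MODEL_ID,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}
```

The result can be passed to fetch(request.url, request.init) in any runtime that provides fetch.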
When Studio runs locally with agents enabled, the dashboard can also talk to the orchestrator API on port 3000 — submit tasks and watch SSE events in the same UI.
One browser tab for daytime work; the orchestrator queue for overnight runs.
The Guardrail
Our biggest fear: an agent hallucinating instead of using tools, or running aws ec2 terminate-instances at 3 AM.
The guardrail extension monitors every agent turn:
- Hallucination detection: if the agent claims file contents without calling read, it gets corrected.
- No-tool drift: three consecutive turns without a tool call triggers a warning.
- Command blocking: 29 patterns covering destructive git, AWS teardown, process killing, and production API abuse.
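Two of these checks are simple enough to sketch. The patterns below are illustrative stand-ins, not the extension's actual 29, and the names isBlocked and DriftMonitor are hypothetical.

```typescript
// Illustrative patterns only; the real extension's list is not reproduced here.
const blockedPatterns: RegExp[] = [
  /\baws\s+ec2\s+terminate-instances\b/,
  /\bgit\s+push\s+.*--force\b/,
  /\bgit\s+reset\s+--hard\b/,
];

// True if the shell command matches any blocked pattern.
function isBlocked(command: string): boolean {
  return blockedPatterns.some((pattern) => pattern.test(command));
}

// Tracks consecutive turns without a tool call.
class DriftMonitor {
  private idleTurns = 0;
  private readonly limit: number;

  constructor(limit = 3) {
    this.limit = limit;
  }

  // Returns true when this turn should raise a warning.
  recordTurn(usedTool: boolean): boolean {
    this.idleTurns = usedTool ? 0 : this.idleTurns + 1;
    return this.idleTurns >= this.limit;
  }
}
```

Pattern matching on command strings is a coarse filter: it catches the obvious cases cheaply, but it should be paired with platform-level limits such as restricted credentials.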
What It Cost to Build
A few focused sessions: about 2,000 lines of Go, 900 lines of TypeScript extensions, 52 tests, plus merging the dashboard packages into the Pi monorepo. No new servers — Pi is a Node subprocess; the dashboard is another Node process on your machine.
What Is Next
| Piece | Status |
|-------|--------|
| Pi fork + tinqs extensions | Shipped |
| Dashboard merged into Pi monorepo | Shipped |
| Go orchestrator + REST/SSE API | MVP, running locally |
| Git worktree + push + PR loop | In progress |
| Domain routing (game / sim / platform tasks) | Designed |
Next we are promoting studio skills from IDE playbooks into orchestrator prompt packs — so the same Pi worker behaves like a game builder, sim maintainer, or platform engineer depending on the task. Specialized agents (planner, reviewer, asset pipeline) sit on top of this foundation.
The harness — inference proxy, guardrails, dashboard, orchestrator API — is in place. The work now is feeding it real tasks and hardening the git loop.
Tinqs Studio is an open platform for game development — git hosting, AI inference, asset generation, and autonomous agents. We are building Ariki, a survival colony sim, using the same tools we ship.