Agents That Code Overnight: Our $0.80 Cloud Harness with DeepSeek V4 and Pi
Every coding agent today (Claude Code, Codex, Cursor, Pi) has the same limitation: it runs in your terminal. You watch it work. You close the laptop, it stops. There's no way to say "build these eight features overnight" and wake up to pull requests.
We built exactly that: a Pi fork for the brain, a Go orchestrator inside our Gitea platform for overnight batch work, and a browser dashboard for daytime. Here's the stack.
The problem with terminal-only agents
Claude Code runs on Opus at $15/MTok output. Codex uses GPT-5.5. Running eight agents overnight on either would cost $50-200. That's not sustainable for a four-person studio.
DeepSeek V4 Flash costs $0.28/MTok output. Eight overnight tasks: about $0.80. The cost differential changes what's possible, from "I'll use this sparingly" to "run it on everything."
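The arithmetic behind those figures can be sketched roughly as follows. This is a back-of-the-envelope estimate, not a billing model: it counts output tokens only, and the workload of 350k output tokens per task is a hypothetical number chosen to land near the quoted $0.80.

```typescript
// Prices quoted above, in USD per million output tokens.
const DEEPSEEK_FLASH_PER_MTOK = 0.28;
const OPUS_PER_MTOK = 15;

// Cost of a batch, counting output tokens only (an assumption: input
// tokens and cache discounts are ignored in this sketch).
function batchCostUsd(tasks: number, outputTokensPerTask: number, pricePerMTok: number): number {
  const totalMTok = (tasks * outputTokensPerTask) / 1e6;
  return totalMTok * pricePerMTok;
}

// Hypothetical workload: 8 tasks at 350k output tokens each (2.8 MTok total).
const flash = batchCostUsd(8, 350000, DEEPSEEK_FLASH_PER_MTOK); // about $0.78
const opus = batchCostUsd(8, 350000, OPUS_PER_MTOK); // about $42
```

Under these assumptions the same batch is roughly 50 times cheaper on Flash, which is simply the ratio of the two output prices.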
But cost isn't the only issue. Cloud tools are black boxes. You can't add a Gitea API tool, a fal.ai image generator, or a guardrail that blocks destructive commands. With our own harness, you add an extension and it's live. Agents are not a feature to outsource; they're the product.
Pi: the agent brain
Pi is an open-source coding agent. MIT license, TypeScript, 51k stars. Four core tools (read, write, edit, bash) and an extension system. We forked it and added four extensions:
- tinqs-provider: routes DeepSeek V4 Flash/Pro through our inference proxy
- tinqs-tools: Gitea REST API, fal.ai image generation, vision model access
- tinqs-ci: reads CI pipeline status and logs, polls for completion
- tinqs-guardrail: 29 safety patterns blocking dangerous commands
Each extension is a single TypeScript file. No npm dependencies. The core Pi code is untouched — we only add files.
Pi's RPC mode is what makes overnight automation possible. It runs headless, accepting JSON on stdin and emitting JSON on stdout. The orchestrator spawns it as a subprocess, sends tasks, and receives results. No terminal, no editor UI.
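To make the stdin/stdout exchange concrete, here is a minimal sketch of the framing a parent process might use. It assumes newline-delimited JSON, and the message shapes (`RpcCommand`, `RpcEvent`) are hypothetical; Pi's actual RPC schema is not described here and may differ.

```typescript
// Hypothetical message shapes; Pi's real RPC schema may differ.
interface RpcCommand { type: "prompt"; message: string }
interface RpcEvent { type: string; [key: string]: unknown }

// Serialize one command as a single line for the child's stdin.
function encodeCommand(cmd: RpcCommand): string {
  return JSON.stringify(cmd) + "\n";
}

// Accumulates stdout chunks and returns one parsed event per complete line.
// Chunks from a pipe can split a JSON object anywhere, so partial lines
// are buffered until their newline arrives.
class LineDecoder {
  private buffer = "";

  push(chunk: string): RpcEvent[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or ""); keep it for later.
    this.buffer = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim().length > 0)
      .map((line) => JSON.parse(line) as RpcEvent);
  }
}
```

The orchestrator described in this post is written in Go; TypeScript is used here only to match the extensions. The same buffering idea applies in any language that reads a subprocess pipe.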
DeepSeek V4: the LLM
We run DeepSeek V4 Flash through our own inference proxy. It exposes an OpenAI-compatible API, so Pi treats it like any other provider. The proxy adds:
- Redis job queue (10 concurrent workers)
- Per-user usage tracking
- System prompt injection for cache-hit optimization
- Gitea PAT authentication, using the same token as git push
Cost per task: $0.02-0.10 depending on complexity.
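Because the proxy speaks the OpenAI-compatible protocol, a client request can be sketched as below. Several details are assumptions rather than documented behavior: the `https://` scheme, the `/chat/completions` suffix (the conventional OpenAI-compatible path), the model id passed by the caller, and sending the Gitea PAT as a bearer token.

```typescript
interface ChatMessage { role: "system" | "user" | "assistant"; content: string }

interface ProxyRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Proxy base path as given in this post; scheme and path suffix are assumed.
const PROXY_BASE = "https://tinqs.com/api/v1/ai";

// Build (but do not send) a chat completion request for the proxy.
function buildChatRequest(giteaPat: string, model: string, messages: ChatMessage[]): ProxyRequest {
  return {
    url: `${PROXY_BASE}/chat/completions`,
    headers: {
      "Content-Type": "application/json",
      // Assumption: the Gitea PAT is presented as a bearer token.
      Authorization: `Bearer ${giteaPat}`,
    },
    body: JSON.stringify({ model, messages }),
  };
}
```

Keeping request construction separate from the network call makes it easy to log or test exactly what would be sent before any tokens are spent.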
Go orchestrator: overnight batch work
Inside our Gitea fork we added modules/agents/, a Go worker pool that spawns Pi processes, tracks task lifecycle, and streams events over SSE to any connected UI. Six endpoints, same auth as git push:
POST /api/v1/agents/tasks: submit a task
GET /api/v1/agents/tasks: list all tasks
GET /api/v1/agents/tasks/{id}: get task details
DELETE /api/v1/agents/tasks/{id}: stop a task
GET /api/v1/agents/stream: SSE live events
GET /api/v1/agents/health: orchestrator status
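A UI consuming the stream endpoint has to split the SSE text into events. The sketch below follows the standard Server-Sent Events framing (events separated by a blank line, `event:` and `data:` fields, lines starting with `:` treated as comments). The event names and payloads the orchestrator actually emits are not specified here, so the sample values in the usage note are hypothetical.

```typescript
interface SseEvent { event: string; data: string }

// Parse a complete SSE text buffer into events. Events are separated by a
// blank line; multiple data: lines within one event are joined with "\n".
function parseSse(text: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of text.split(/\r?\n\r?\n/)) {
    let event = "message"; // default event type in the SSE format
    const data: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line === "" || line.charAt(0) === ":") continue; // blank or comment
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.charAt(0) === " ") value = value.slice(1); // strip one leading space
      if (field === "event") event = value;
      else if (field === "data") data.push(value);
    }
    // Blocks without any data field (for example keepalive comments) are skipped.
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return events;
}
```

For example, feeding it `event: task_started` followed by a `data:` line and a blank line yields a single event named `task_started`. In a browser, the built-in `EventSource` does this parsing for you; a hand-rolled parser is mainly useful when custom auth headers are needed.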
The orchestrator lives in the same binary as git: same auth, no extra service to deploy. The intended loop:
Orchestrator reads task brief
  → spawns pi --mode rpc
  → Pi writes code using DeepSeek V4
  → Pi pushes branch, calls ci_wait
  → CI green → Pi opens PR via gitea_api
  → CI red → Pi reads ci_logs, fixes, retries (≤3)
  → Human reviews PR, merges
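The control flow of that loop can be sketched as follows. The hooks are hypothetical stand-ins for the agent's real tools (push plus ci_wait, ci_logs plus a fix, PR creation via gitea_api), and they are synchronous here only to keep the sketch short. It also reads "retries (≤3)" as one initial CI run plus up to three fix-and-retry rounds, which is an interpretation.

```typescript
type CiStatus = "green" | "red";

// Hypothetical hooks standing in for the agent's real tools.
interface LoopHooks {
  pushAndWaitForCi: () => CiStatus; // push branch, block until CI finishes
  fixFromLogs: () => void;          // read CI logs and attempt a fix
  openPullRequest: () => string;    // returns a PR identifier
}

interface LoopResult {
  outcome: "pr_opened" | "gave_up";
  ciRuns: number;
  pr?: string;
}

// One initial CI run plus up to maxRetries fix-and-retry rounds.
function runTask(hooks: LoopHooks, maxRetries = 3): LoopResult {
  let ciRuns = 0;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    ciRuns++;
    if (hooks.pushAndWaitForCi() === "green") {
      return { outcome: "pr_opened", ciRuns, pr: hooks.openPullRequest() };
    }
    // Only attempt a fix if another CI run is still allowed.
    if (attempt < maxRetries) hooks.fixFromLogs();
  }
  return { outcome: "gave_up", ciRuns };
}
```

Bounding the retries matters for overnight runs: a task that cannot get CI green stops after a fixed number of attempts instead of spending tokens until morning.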
Browser dashboard: daytime UI
The orchestrator is for overnight batch work. During the day, you want to see agents, chat with them, and spawn sessions without living in a terminal.
We merged pi-agent-dashboard into the Pi monorepo. One command:
npm run dashboard:dev
Open localhost:33634 and you get live session streaming (watch tool calls and model output in real time), interactive chat, session spawning in any project folder, and per-session cost tracking. The dashboard talks to Pi sessions over WebSocket on port 9999. Inference uses the same Tinqs proxy as the CLI: one API key, one billing account.
Browser (localhost:33634)
  ↕ WebSocket (port 9999)
Pi sessions (interactive or headless)
  ↕ OpenAI-compatible API
Tinqs proxy (tinqs.com/api/v1/ai)
  ↕ DeepSeek V4 Flash / Pro
The guardrail
The biggest fear with autonomous agents: hallucination. An agent claiming it read a file without calling read. Three consecutive turns with no tool calls. Running aws ec2 terminate-instances at 3am.
The guardrail extension monitors every turn:
- Hallucination detection: claims without tool calls get corrected
- No-tool drift: three turns with zero tool calls triggers a warning
- Command blocking: 29 patterns covering destructive git, AWS teardown, process killing, production API abuse
Guardrails at the platform layer, not the prompt layer. Prompts can be ignored. Platform gates cannot.
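Two of those checks are simple enough to sketch. The patterns below are illustrative examples only, since the 29 real patterns are not listed here, and the class and function names are hypothetical rather than taken from the tinqs-guardrail extension.

```typescript
// Illustrative patterns only; the real extension's 29 patterns are not listed.
const BLOCKED_PATTERNS: RegExp[] = [
  /\bgit\s+push\s+.*--force\b/,
  /\bgit\s+reset\s+--hard\b/,
  /\baws\s+ec2\s+terminate-instances\b/,
  /\bkill\s+-9\b/,
];

// True if a shell command matches any blocked pattern.
function isBlocked(command: string): boolean {
  return BLOCKED_PATTERNS.some((pattern) => pattern.test(command));
}

// Tracks consecutive turns in which the agent made no tool calls.
class DriftMonitor {
  private idleTurns = 0;
  private readonly limit: number;

  constructor(limit = 3) {
    this.limit = limit;
  }

  // Record one turn; returns true once the idle streak reaches the limit.
  recordTurn(toolCalls: number): boolean {
    this.idleTurns = toolCalls === 0 ? this.idleTurns + 1 : 0;
    return this.idleTurns >= this.limit;
  }
}
```

A check like `isBlocked` would run before the bash tool executes anything, so a match stops the command regardless of what the model was prompted to do. Regex matching on command strings is easy to evade with shell indirection, so a sketch like this is a safety net rather than a sandbox.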
What it cost to build
About 2,000 lines of Go, 900 lines of TypeScript extensions, 52 tests, plus merging the dashboard into the Pi monorepo. No new servers: Pi is a Node subprocess; the dashboard is another Node process on your machine. The orchestrator is a Go module inside our existing Gitea binary, so it needs zero additional infrastructure.
The harness — inference proxy, guardrails, dashboard, orchestrator API — is in place. Agents code while you sleep for pocket change. And because everything runs on your own infrastructure, you control the models, the tools, and the safety rails.
Tinqs Studio is an open platform for game development: git hosting, AI inference, asset generation, and autonomous agents. We're building Ariki using the same tools.