---
title: "Running AI Agents from Your Browser"
slug: local-agent-dashboard
date: "2026-05-26"
description: "A local dashboard at localhost:33634 turns AI agent tasks into a job queue. Handoffs become jobs, agents code and test and deliver PRs, all visible from your browser."
og_description: "A local dashboard turns AI agent sessions into trackable jobs. Handoffs, PR delivery, and $0.02/task."
og_image: "https://www.tinqs.com/img/og-cover.jpg"
excerpt: "A local dashboard at localhost:33634 turns AI agent sessions into trackable jobs with handoffs, code generation, testing, and PR delivery — all visible from your browser."
author: "Ozan Bozkurt"
author_initials: "OB"
author_role: "CTO & Developer, Tinqs"
---

We built a local dashboard that turns AI agent sessions into a job queue you can watch from your browser. Open localhost:33634, type a task, and watch the agent plan, code, test, and deliver a PR --- while you work on something else.

## The Problem: Agent Sessions Are Ephemeral

When you use an AI coding agent in your terminal, you're staring at a scrolling wall of text. The agent thinks, calls tools, writes files, runs tests. If you look away for 30 seconds, you miss something. If you have three tasks queued up, you can't see which one is running, which one is blocked, and which one shipped a PR twenty minutes ago.

The terminal is great for interactive sessions --- pair-programming with the agent, asking questions, iterating on a design. It's terrible for **batch work**. When you have five issues to triage, two features to implement, and a PR to review, you don't want to shepherd each one through a terminal session. You want to hand them off and check back later.

So we built a dashboard.

## What the Dashboard Is

The dashboard runs locally at `localhost:33634`. It's a web UI that sits on top of your local coding agent, turning each task into a visible, trackable job. Open it in any browser, type a task, and the agent goes to work.

The dashboard shows:

- **Active sessions** --- which agent sessions are running, what they're doing right now, live streaming their output
- **Job queue** --- tasks you've submitted, in order, with status: queued, running, done, failed
- **Session history** --- past sessions with full logs, file diffs, and cost breakdowns
- **Git state** --- which branch each session is on, what files changed, whether there's uncommitted work

It's the difference between `ps aux | grep agent` and a proper job manager.

## How Handoffs Become Jobs

The most important design decision: **every handoff is a job.**

In terminal mode, you hand off to the agent by typing a prompt and watching. In dashboard mode, you type a prompt in the browser and the agent accepts it as a discrete job.

A handoff might be "fix the null pointer in `VegetationGrid.ActivateChunk()` and add a unit test." The dashboard creates a new session for that task, spawns the agent on a fresh git worktree, and lets it run. You don't need to watch. You get a notification when it's done.

Here's what happens under the hood:

1. You submit a task through the dashboard UI
2. The dashboard creates a new agent session with the task as the opening prompt
3. The agent reads the codebase, plans the change, and starts working
4. Tool calls --- file reads, writes, bash commands --- stream to the dashboard in real time
5. When the agent is done, it summarizes what changed and proposes next steps
6. The job status updates: done, failed, or needs human review

Multiple jobs can run in parallel on different git worktrees. One agent fixing a UI bug on `tree-ui-fix`, another implementing a save system on `tree-save-system`. No conflicts, no cross-contamination. Each job has its own branch, its own diff, its own results.

## The Agent's Job: Code, Test, Deliver

We've tuned the agent to follow a specific workflow on every task:

**Read first.** Before writing a single line, the agent reads the files it needs to understand.
It traces call chains, checks imports, reads test files. This sounds obvious, but most AI coding tools jump straight to writing code on incomplete context. Our agent is instructed to load context first, then act.

**Test as it goes.** After making a change, the agent runs the relevant tests immediately. Not "run all tests at the end." Not "assume it works and move on." Every change gets verified before the next one. If tests fail, the agent diagnoses and fixes before proceeding.

**Git discipline.** Every job creates a branch off main. Commits are small and descriptive. The agent writes proper commit messages --- not "fix stuff" but "fix null pointer in VegetationGrid.ActivateChunk when chunk is out of bounds." If the task spans multiple logical changes, the agent splits them into multiple commits.

**PR delivery.** When the agent is done, it pushes the branch and opens a pull request with a description of what changed, why, and what was tested. The human reviews and merges. The agent never merges its own code.

This workflow --- read, code, test, commit, PR --- runs end-to-end without human intervention. The human's job is to review the PR, not to babysit the agent.

## The Task Lifecycle

Every task in the dashboard moves through a defined lifecycle:

| Stage | What's happening |
|-------|-----------------|
| **Queued** | Task is in the queue, waiting for a free agent slot |
| **Planning** | Agent is reading the codebase, understanding the problem, forming a plan |
| **Working** | Agent is writing code, running tests, iterating on fixes |
| **Testing** | Agent is running the full test suite, verifying nothing broke |
| **Delivering** | Agent is committing, pushing, and opening a PR |
| **Done** | Task completed successfully, PR is open for review |
| **Failed** | Something went wrong --- tests didn't pass, agent couldn't fix, PR rejected |
| **Needs Human** | Agent hit an ambiguity and is asking for clarification |

You can see at a glance where every task is.
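
One way to read the lifecycle table is as a small state machine. The sketch below is illustrative only: the stage names come from the table, but the transition rules (a linear happy path, a loop from Testing back to Working, and Done and Failed as terminal) are assumptions, not the dashboard's actual implementation.

```python
from enum import Enum


class Stage(Enum):
    # Stage names taken from the lifecycle table.
    QUEUED = "Queued"
    PLANNING = "Planning"
    WORKING = "Working"
    TESTING = "Testing"
    DELIVERING = "Delivering"
    DONE = "Done"
    FAILED = "Failed"
    NEEDS_HUMAN = "Needs Human"


# Assumed transitions (hypothetical, not read from the dashboard's code).
ACTIVE = (Stage.PLANNING, Stage.WORKING, Stage.TESTING, Stage.DELIVERING)
HAPPY_PATH = {
    Stage.QUEUED: Stage.PLANNING,
    Stage.PLANNING: Stage.WORKING,
    Stage.WORKING: Stage.TESTING,
    Stage.TESTING: Stage.DELIVERING,
    Stage.DELIVERING: Stage.DONE,
}


def allowed_next(stage: Stage) -> set:
    """Return the stages a task may move to from `stage` under these assumptions."""
    if stage in (Stage.DONE, Stage.FAILED):
        return set()  # treated as terminal in this sketch
    if stage is Stage.NEEDS_HUMAN:
        # After clarification the task resumes in an active stage, or is abandoned.
        return set(ACTIVE) | {Stage.FAILED}
    nxt = {HAPPY_PATH[stage]}
    if stage in ACTIVE:
        nxt |= {Stage.FAILED, Stage.NEEDS_HUMAN}
    if stage is Stage.TESTING:
        nxt.add(Stage.WORKING)  # failing tests send the agent back to fix
    return nxt


def advance(current: Stage, target: Stage) -> Stage:
    """Move to `target`, rejecting transitions the sketch does not allow."""
    if target not in allowed_next(current):
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Under these assumptions, `advance(Stage.QUEUED, Stage.PLANNING)` succeeds, while `advance(Stage.QUEUED, Stage.DONE)` raises `ValueError`, which is the kind of check that keeps a job from appearing finished before it has run.
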
No scrolling through terminal output trying to figure out if the agent is still working or stuck in a loop.

## Costs: $0.02 to $0.10 per Task

The economics of AI agents change when you're running them in batch mode. We track cost per task obsessively, and the numbers are surprisingly low.

Most tasks cost between **$0.02 and $0.10** in API credits. Here's the breakdown for a typical implementation task:

- **Planning phase** (~3,000 tokens in, ~1,000 tokens out): $0.005
- **Code phase** (~8,000 tokens in, ~3,000 tokens out): $0.015
- **Test/fix iteration** (~5,000 tokens in, ~2,000 tokens out): $0.010
- **PR writeup** (~2,000 tokens in, ~1,000 tokens out): $0.005

Total: ~$0.035 for a small bug fix with a test. Larger features with multiple iterations run $0.08--0.15. Triage and investigation tasks (read-only, no code changes) are the cheapest at $0.01--0.02.

The key to keeping costs down: **use the right model for the right phase.** We run DeepSeek V4 for coding tasks --- it's fast, cheap, and excellent at structured code generation. For tasks that require visual understanding --- screenshots, UI review, diagram analysis --- we route to Gemini Flash, which handles images at a fraction of the cost of other vision-capable models.

Model selection is automatic. The dashboard routes based on the task type. You don't need to think about it.

## Setup in 5 Minutes

Getting the dashboard running takes longer to read about than to do:

1. **Install the agent CLI.** One command, one binary. Works on Mac, Windows, and Linux.
2. **Start the dashboard.** `agent dashboard` --- that's it. It binds to `localhost:33634`.
3. **Open your browser.** Navigate to the URL, and you'll see an empty job queue.
4. **Type your first task.** "Add input validation to the login form and write tests." Press enter.
5. **Watch it work.** The agent spawns, reads the codebase, and starts coding. You'll see tool calls stream in real time.

That's the whole setup.
No Docker, no Kubernetes, no cloud configuration, no API gateway. One binary, one port, one browser tab. If you already have an API key configured for your coding agent, the dashboard picks it up automatically.

Advanced configuration --- custom models, thinking levels, parallel job limits, git worktree locations --- is all done through a config file. But the defaults work for 90% of use cases. You can go from zero to running your first agent job in the time it takes to brew a coffee.

## What We've Learned

**The browser is a better job manager than the terminal.** This sounds obvious in retrospect, but it took us months to realize. The terminal is for interaction. The browser is for monitoring. When you have multiple agents running, you want tabs, not tmux panes.

**Batch mode changes how you think about agents.** In interactive mode, you ask the agent to do one thing and wait. In batch mode, you queue up everything you need and come back to a pile of open PRs. The mental model shifts from "assistant" to "worker pool."

**Cheap models are good enough for most work.** DeepSeek V4 handles 80% of our coding tasks with quality indistinguishable from premium models, at a tenth of the cost. The remaining 20% --- complex refactors, architectural decisions, anything requiring deep reasoning --- still benefit from premium models. But routing every task to the most expensive model is like commuting in a Formula 1 car.

**Visibility reduces trust issues.** When you can't see what the agent is doing, you don't trust the output. The dashboard's live streaming of tool calls --- every file read, every command run, every test result --- builds confidence. You can see the agent's reasoning, not just its conclusions.

**Git worktrees are the missing primitive.** Without them, you can only run one agent job at a time --- or you get file conflicts. With them, each job gets its own isolated workspace on a dedicated branch.
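
As a rough sketch of that isolation, the helper below builds the git commands a job runner might issue per job. The function name `worktree_commands`, the sibling `<repo>-worktrees/` directory layout, and the use of the job slug as the branch name are all hypothetical; only `git worktree add -b` and `git worktree remove` are standard git subcommands.

```python
from pathlib import Path


def worktree_commands(repo: Path, job_slug: str, base: str = "main") -> dict:
    """Build (but do not run) the git commands for one job's isolated workspace.

    Assumed layout: worktrees live in a sibling directory named
    "<repo>-worktrees/<job_slug>", and the branch is named after the job.
    """
    path = repo.parent / f"{repo.name}-worktrees" / job_slug
    return {
        # New branch off `base`, checked out in its own directory.
        "create": ["git", "-C", str(repo), "worktree", "add",
                   "-b", job_slug, str(path), base],
        # Remove the directory once the job's PR is open.
        "cleanup": ["git", "-C", str(repo), "worktree", "remove", str(path)],
    }


if __name__ == "__main__":
    cmds = worktree_commands(Path("/src/game"), "tree-ui-fix")
    for name, argv in cmds.items():
        print(name, " ".join(argv))
```

Each list could be passed to `subprocess.run(argv, check=True)` in a real repository; the sketch only constructs the commands so it can be read without touching any checkout.
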
Zero coordination overhead, zero merge conflicts between concurrent jobs. When the agent is done, the worktree is cleaned up automatically.

---

The dashboard is part of how we use AI agents in our own development. It's not a product we're selling --- it's the tool we built because we needed it, and it's available to anyone using the agent.

If you're running AI coding agents and you're still staring at terminal output for every task, give the dashboard a try. Open `localhost:33634`, queue up your backlog, and watch the PRs roll in.

We're building all of this as part of [Tinqs Studio](https://tinqs.com) --- a game development platform that brings together git hosting, AI agent tools, and creative workflows for game teams.