---
title: "A Pre-Commit Agent That Guards Your Secrets for $0.001"
slug: pre-commit-agent
date: "2026-05-25"
description: "We built a pre-commit hook that calls DeepSeek V4 Flash to review every commit. It catches leaked API keys, classified terms, broken URLs, and drafts announcements, for a tenth of a cent per commit."
og_description: "A DeepSeek-powered pre-commit hook that catches leaks for $0.001/commit."
og_image: "https://www.tinqs.com/img/og-cover.jpg"
excerpt: "Too many things to remember before hitting commit. Don't leak API keys. Don't reference classified codenames. Don't link to deleted repos. We built a two-layer pre-commit hook (regex + LLM) that catches all of it for $0.001."
author: "Ozan Bozkurt"
author_initials: "OB"
author_role: "CTO & Developer, Tinqs"
---

Every small team has the same problem: too many things to remember before `git commit`. Don't leak API keys. Don't reference the classified AI codename in public posts. Don't link to GitHub repos we deleted six months ago. Don't push a blog post with a 90-character title.

A checklist in the README doesn't work. Humans skip checklists. Code review catches some issues but not all: reviewers focus on logic, not on whether a URL points to a deleted org.

We built a pre-commit hook with two layers: a regex blocklist that's instant and free, and an LLM review that costs about $0.001. Together they catch the mistakes the checklist was supposed to prevent.

## Layer 1: Regex blocklist (0ms, $0.00)

A text file of patterns, each tagged with scope and message:

```
public|\b<internal-codename>\b|Classified codename: use the public-facing alias
all|github\.com/(tinqs-ltd|tinqs)/|GitHub repos deleted: use tinqs.com
all|sk-[a-zA-Z0-9]{20,}|Possible API key leaked
all|AKIA[A-Z0-9]{16}|AWS access key leaked
public|admin\.<internal-domain>|Internal admin URL in public content
```

The scope field controls where patterns apply. `all` means every file. `public` means only public-facing content: blog posts, website, marketing pages. We *want* classified codenames in internal architecture docs. We just don't want them in blog posts.

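One way to implement the scope split is a small path classifier. The sketch below is an assumption about how it could work, not the hook's actual code: the function name `classify_path` and the directory names (`blog/`, `website/`, `marketing/`) are hypothetical stand-ins for whatever counts as public-facing in a given repo.

```bash
# classify_path PATH
# Prints "public" for public-facing content and "internal" for everything else.
# The directory names are placeholders; adjust them to the repo layout.
classify_path() {
  case "$1" in
    blog/*|website/*|marketing/*) echo "public" ;;
    *) echo "internal" ;;
  esac
}

# Example with the staged files of a commit (run inside a git repo):
#   git diff --cached --name-only | while IFS= read -r path; do
#     printf '%s\t%s\n' "$(classify_path "$path")" "$path"
#   done
```

Keeping the classifier as a single `case` statement means the list of public directories lives in one place and costs nothing to evaluate.
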
The blocklist runs `grep` against the staged diff. No network call, no API, no latency. If a pattern matches, the commit is blocked immediately and the hook prints the file path and the explanation. This catches about 80% of issues before the LLM wakes up.

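A minimal sketch of such a scan, assuming the `scope|pattern|message` layout shown in the blocklist example. The function name `scan_blocklist`, the `BLOCKED:` prefix, and the handling of blank lines and `#` comments are illustrative choices, not taken from the real hook. Because a pattern can itself contain `|`, the line is split on its first and last separator rather than on every one.

```bash
# scan_blocklist BLOCKLIST_FILE FILE_SCOPE
# Reads the text to check on stdin (for example, the added lines of a staged
# diff). FILE_SCOPE is "public" or "internal". Prints one line per matching
# pattern and returns 1 if anything matched, 0 otherwise.
scan_blocklist() {
  local blocklist="$1" file_scope="$2"
  local text line scope rest pattern message hits=0
  text=$(cat)

  while IFS= read -r line || [ -n "$line" ]; do
    # Skip blank lines and comments (an assumed convenience).
    case "$line" in ''|'#'*) continue ;; esac

    # Split on the first and last "|" so alternations inside the
    # pattern, such as (a|b), survive intact.
    scope=${line%%|*}
    rest=${line#*|}
    pattern=${rest%|*}
    message=${rest##*|}

    # "all" applies everywhere; "public" only to public-facing files.
    if [ "$scope" != "all" ] && [ "$scope" != "$file_scope" ]; then
      continue
    fi

    if printf '%s\n' "$text" | grep -Eq -- "$pattern"; then
      printf 'BLOCKED: %s\n' "$message"
      hits=$((hits + 1))
    fi
  done < "$blocklist"

  [ "$hits" -eq 0 ]
}

# Possible use inside a hook (the blocklist path is hypothetical):
#   git diff --cached -U0 -- "$file" | grep '^+' \
#     | scan_blocklist .githooks/blocklist.txt public || exit 1
```

One limitation of this split: a message cannot contain `|`, since the last separator marks where the message begins.
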
## Layer 2: DeepSeek V4 Flash review (~4s, $0.001)

If the commit touches public-facing files, the hook sends the staged diff to DeepSeek V4 Flash. The system prompt tells it exactly what to check:

- **Leaked secrets**: API keys, tokens, credentials the regex might have missed
- **Classified terms**: codenames not yet in the blocklist
- **Internal URLs**: references to services that shouldn't be public
- **Blog quality**: title length, meta description, slug consistency
- **Broken links**: malformed URLs, obvious typos
- **Announcements**: if it's a new blog post, draft a one-line summary

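The checklist above has to travel with the diff in a single request. The sketch below shows one way to assemble that request body, assuming an OpenAI-style chat payload (a `model` plus `system` and `user` messages) and assuming `jq` is installed. The function name `build_review_request`, the model id, and the prompt text are placeholders; the post does not show the real request format.

```bash
# build_review_request MODEL SYSTEM_PROMPT
# Reads the staged diff on stdin and prints a chat-style JSON request body.
# jq does the escaping, so quotes and newlines in the diff stay intact.
# Note: $(cat) drops trailing newlines from the diff.
build_review_request() {
  local model="$1" system_prompt="$2" diff
  diff=$(cat)
  jq -n \
    --arg model "$model" \
    --arg system "$system_prompt" \
    --arg diff "$diff" \
    '{
      model: $model,
      messages: [
        {role: "system", content: $system},
        {role: "user", content: $diff}
      ]
    }'
}

# Possible use (model id and prompt file are hypothetical):
#   git diff --cached \
#     | build_review_request "<model-id>" "$(cat .githooks/review-prompt.txt)"
```

Building the body with `jq --arg` rather than string concatenation avoids broken JSON when a diff contains quotes or backslashes. The result can be piped to `curl --data @-` with a short `--max-time`, so a slow API cannot stall the commit.
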
The model responds with structured JSON containing `errors`, which block the commit, and `warnings`, which are printed but let the commit through. If the API is unreachable or times out, the commit proceeds: the hook never blocks work for infrastructure reasons.

## The architecture

```
git commit
↓
Phase 0: Collect staged diff + classify files (public vs internal)
↓
Phase 1: Regex blocklist scan (instant, free)
  → Match → BLOCK
  → Clean → continue
↓
Phase 2: Public files changed?
  → No → exit 0 (skip AI review, zero cost)
  → Yes → send diff to DeepSeek V4 Flash
↓
Phase 3: Parse JSON response
  → Errors → BLOCK
  → Warnings → print, exit 0
  → Announcement → print draft
  → API failure → warn, exit 0 (never block on infra)
```

The hook lives in `.githooks/`: committed, version-controlled, shared by the team. A setup script sets `core.hooksPath` to that directory with `git config`.

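For readers wiring this up themselves, the core of such a setup script can be very small. This is a sketch under assumptions, not the contents of the actual script: the function name `install_hooks` is hypothetical, and the `chmod` step is an addition that guards against hooks losing their executable bit.

```bash
# install_hooks [REPO_DIR]
# Points the repo at the committed hooks directory and makes the hooks
# executable. REPO_DIR defaults to the current directory.
install_hooks() {
  local repo_dir="${1:-.}"
  git -C "$repo_dir" config core.hooksPath .githooks || return 1
  # git ignores hooks that are not executable.
  chmod +x "$repo_dir"/.githooks/* 2>/dev/null || true
}

# Example:
#   install_hooks /path/to/repo
#   git -C /path/to/repo config core.hooksPath   # prints .githooks
```

Because `core.hooksPath` is a local config value, each clone has to run the script once; the hooks themselves then update with every pull.
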
## What it costs

| Item | Tokens | Cost |
|------|--------|------|
| Input (prompt + diff) | ~4,000 | $0.00056 |
| Output (JSON response) | ~200 | $0.00006 |
| **Per commit** | | **$0.00062** |

Rounded up, that's a tenth of a cent. Twenty commits a day: $0.012/day. About **$0.40/month**. Commits that only touch internal files skip the AI review entirely, at zero cost.

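The arithmetic behind those figures can be reproduced in a few lines. The per-million-token rates below ($0.14 input, $0.28 output) are assumptions back-calculated from the table rows, not quoted from a price list, and the function names are illustrative.

```bash
# commit_cost INPUT_TOKENS OUTPUT_TOKENS
# Prints the cost of one review in dollars, using assumed rates.
commit_cost() {
  awk -v in_tok="$1" -v out_tok="$2" 'BEGIN {
    in_rate = 0.14; out_rate = 0.28   # assumed USD per 1M tokens
    printf "%.5f\n", (in_tok * in_rate + out_tok * out_rate) / 1000000
  }'
}

# monthly_cost COMMITS_PER_DAY DAYS COST_PER_COMMIT
monthly_cost() {
  awk -v n="$1" -v d="$2" -v c="$3" 'BEGIN { printf "%.2f\n", n * d * c }'
}

commit_cost 4000 200          # prints 0.00062
monthly_cost 20 30 0.00062    # prints 0.37
```

At 20 commits a day over a 30-day month this comes to about $0.37, in line with the rounded monthly figure above.
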
## What it caught (first week)

- **2 classified codename leaks** in draft blog posts (caught by the blocklist)
- **1 GitHub URL** from an old copy-paste (caught by the blocklist)
- **3 blog SEO warnings**: titles over 60 chars, missing `og_description` (caught by the AI)
- **1 announcement draft** auto-generated when a new post was committed

Zero false positives on the blocklist. Two false positives from the AI: it flagged an internal URL in a code example that was clearly illustrative. We added a note to the prompt: ignore URLs inside fenced code blocks.

## Setup

```bash
bash scripts/setup-hooks.sh              # or .\scripts\setup-hooks.ps1 on Windows
export TINQS_HOOK_TOKEN="<your-token>"   # same PAT used for git push
```

That's it. Every `git commit` runs the two-layer review. Bypass with `git commit --no-verify` for emergencies.

## The pattern: guard rails at the edge

This is the same principle we apply everywhere: put the guard rail where the action happens. Don't rely on a human checklist. Don't wait for code review. Don't hope someone remembers.

The pre-commit hook is $0.001 of prevention. A leaked API key in a public post is hours of rotation, revocation, and audit. A classified codename in a blog post is a confidentiality breach. A dead link is a broken experience nobody notices for weeks.

The tools exist. DeepSeek V4 Flash is cheap enough to call on every commit. The hook is 150 lines of bash. The blocklist is a text file. Total infrastructure cost: zero. It runs on the developer's machine, calls an API we already pay for, and adds about 4 seconds to the commit flow.

---

*The pre-commit hook is part of [Tinqs Studio](https://tinqs.com). The inference proxy, blocklist patterns, and review prompt are open and reusable. Every commit in [Ariki](https://arikigame.com) runs through the same guard.*