ci/DEVOPS.md
ozan 501953c636 tinqs/ci — composite actions + Lambda dispatcher for Spot CI runners
Actions: checkout, setup-go, setup-node, setup-aws
Dispatcher: Lambda → EC2 Spot (ephemeral, self-terminating)
Images: base, go, node, docker, deploy, godot

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-26 01:37:55 +01:00


DevOps Reference

AWS Resources (eu-west-1)

| Resource | Name/ID | Purpose |
|---|---|---|
| Lambda | tinqs-ci-dispatch | Webhook handler + Spot launcher |
| DynamoDB | tinqs-ci-runs | Run tracking (repo, run_id, instance_id, status) |
| AMI | tinqs-ci-runner-v2 (ami-00a129385002e4de9) | Pre-baked runner (Go, Node, Docker, act_runner) |
| Security Group | sg-030bf74b43d3faac7 | Runner SG (outbound HTTPS) |
| Subnet | subnet-04b5aeec9bfc4ec2c | Default VPC subnet |
| Instance Profile | tinqs-ci-runner | IAM role (S3, ECR, ECS, SSM) |
| CloudWatch | /aws/lambda/tinqs-ci-dispatch | Dispatcher logs |
| ECS Cluster | tinqs-git | Platform (Gitea), NOT for CI runners |
| EFS | tinqs-git-repos (fs-03f3fb4859ceb12a3) | Gitea repo storage, NOT for CI |
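
To show how the runner resources above fit together, here is a rough AWS CLI equivalent of a Spot launch that uses them. It is a sketch, not the dispatcher's actual RunInstances request. The function name `launch_spot_runner` and the user-data file argument are hypothetical, and the instance type is taken from the cost estimate further down.

```shell
#!/usr/bin/env bash
# Sketch only: CLI approximation of a Spot launch using the resources listed above.
# launch_spot_runner is a hypothetical helper, not part of the dispatcher.

launch_spot_runner() {
  local user_data_file="$1"   # path to a user-data script (assumed to exist)

  aws ec2 run-instances \
    --region eu-west-1 \
    --image-id ami-00a129385002e4de9 \
    --instance-type t3.small \
    --subnet-id subnet-04b5aeec9bfc4ec2c \
    --security-group-ids sg-030bf74b43d3faac7 \
    --iam-instance-profile Name=tinqs-ci-runner \
    --instance-market-options 'MarketType=spot' \
    --instance-initiated-shutdown-behavior terminate \
    --user-data "file://${user_data_file}" \
    --query 'Instances[0].InstanceId' --output text
}
```

Calling `launch_spot_runner ./user-data.sh` would print the new instance ID. Because the shutdown behavior is `terminate`, a `shutdown -h now` inside the instance ends the Spot instance rather than stopping it.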

Deleted resources (26 May 2026)

| Resource | Why deleted |
|---|---|
| Lambda tinqs-ci-exec | Never successfully ran a build. Deploy jobs go through Spot now. |
| CloudWatch /aws/lambda/tinqs-ci-exec | Log group for deleted Lambda |
| CloudWatch /ecs/tinqs-runner | From Fargate era, no longer used |

Webhook flow

Gitea (tinqs.com)
  └─ per-repo webhook on push
       └─ POST https://<api-gw>/dispatch
            └─ Lambda tinqs-ci-dispatch
                 ├─ Fetch .gitea/workflows/*.yml via Gitea API
                 ├─ Evaluate triggers (branch + path filters)
                 ├─ For each matched workflow:
                 │    ├─ Read runs-on label
                 │    └─ RunInstances (Spot, ephemeral)
                 └─ Track in DynamoDB
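
The "evaluate triggers" step can be illustrated with a small shell sketch. The dispatcher's real implementation is not shown in this document, so the function names and the single-pattern interface here are assumptions. It also uses shell `case` globbing, where `*` matches `/`, which is looser than workflow path-filter semantics.

```shell
#!/usr/bin/env bash
# Sketch only: branch + path filter check for a push event.
# matches_any and should_run are hypothetical names.

# matches_any VALUE PATTERN... : succeeds if VALUE matches at least one glob
matches_any() {
  local value="$1" pattern
  shift
  for pattern in "$@"; do
    case "$value" in
      $pattern) return 0 ;;   # unquoted on purpose so the glob is honoured
    esac
  done
  return 1
}

# should_run REF BRANCH_PATTERN PATH_PATTERN FILE...
# Succeeds when REF is a branch matching BRANCH_PATTERN and at least one
# changed FILE matches PATH_PATTERN.
should_run() {
  local ref="$1" branch_pattern="$2" path_pattern="$3" branch file
  shift 3
  case "$ref" in
    refs/heads/*) branch="${ref#refs/heads/}" ;;
    *) return 1 ;;            # tags and other refs never match a branch filter
  esac
  matches_any "$branch" "$branch_pattern" || return 1
  for file in "$@"; do
    if matches_any "$file" "$path_pattern"; then
      return 0
    fi
  done
  return 1
}
```

For example, `should_run refs/heads/main main 'src/*' src/app/main.go` succeeds, while the same call with `docs/readme.md` as the only changed file fails, so no runner would be launched for it.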

Spot instance lifecycle

1. Lambda calls RunInstances (Spot, InstanceInitiatedShutdownBehavior=terminate)
2. User-data runs:
   a. Configure git auth (url.insteadOf with GITEA_TOKEN)
   b. act_runner register --ephemeral --labels <label>:host
   c. act_runner daemon (blocks until job completes)
   d. EXIT trap fires → shutdown -h now → instance terminates
3. DynamoDB record: running → completed, or timeout if the cleanup job reaps the run after 30 min
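
Steps 2a to 2d could look roughly like the user-data sketch below. The function name, its arguments, and the way the tokens reach the instance are assumptions. `--no-interactive`, `--instance` and `--token` are standard `act_runner register` flags added here for completeness.

```shell
#!/usr/bin/env bash
# Sketch only: ephemeral runner bootstrap for steps 2a-2d.
# run_ephemeral_runner and its arguments are hypothetical.

run_ephemeral_runner() {
  local gitea_host="$1" gitea_token="$2" runner_token="$3" label="$4"

  # 2d. Registered first so the instance shuts down (and therefore
  #     terminates) however the script exits.
  trap 'shutdown -h now' EXIT

  # 2a. Rewrite plain HTTPS URLs for the Gitea host to carry the token.
  git config --global \
    "url.https://${gitea_token}@${gitea_host}/.insteadOf" "https://${gitea_host}/" \
    || return 1

  # 2b. Register a single-use runner with the requested label.
  act_runner register --no-interactive --ephemeral \
    --instance "https://${gitea_host}" \
    --token "${runner_token}" \
    --labels "${label}:host" \
    || return 1

  # 2c. Blocks until the one job has completed.
  act_runner daemon
}

# On a real instance the script would end with a call such as:
# run_ephemeral_runner "$GITEA_HOST" "$GITEA_TOKEN" "$RUNNER_TOKEN" "$RUNNER_LABEL"
```

Setting the trap before anything else means a failed registration also leads to shutdown, so a broken instance does not linger until the cleanup job finds it.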

Cleanup cron

The dispatcher Lambda also handles cleanup when invoked with an empty body or {"action":"cleanup"}. It should be triggered by an EventBridge schedule every 5 minutes.

  • Scans DynamoDB for runs older than 30 min with status=running
  • Terminates matching EC2 instances
  • Sweeps for orphan instances (tagged tinqs-ci, running > 30 min)
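
One way to wire up the 5-minute trigger is an EventBridge rule that targets the dispatcher with the cleanup payload. This is a sketch under assumptions: the rule name `tinqs-ci-cleanup`, the statement ID and both function names are hypothetical, and the Lambda ARN has to be supplied by the caller.

```shell
#!/usr/bin/env bash
# Sketch only: schedule the dispatcher's cleanup action every 5 minutes.
# Rule name, statement id and function names are hypothetical.

# Prints the put-targets JSON for the given Lambda ARN.
cleanup_targets_json() {
  printf '[{"Id":"cleanup","Arn":"%s","Input":"{\\"action\\":\\"cleanup\\"}"}]\n' "$1"
}

schedule_cleanup() {
  local lambda_arn="$1" rule_name="tinqs-ci-cleanup" rule_arn

  # Create (or update) the schedule rule and capture its ARN.
  rule_arn=$(aws events put-rule --region eu-west-1 --name "$rule_name" \
    --schedule-expression 'rate(5 minutes)' \
    --query 'RuleArn' --output text) || return 1

  # Allow EventBridge to invoke the dispatcher from this rule.
  aws lambda add-permission --region eu-west-1 \
    --function-name tinqs-ci-dispatch \
    --statement-id "${rule_name}-invoke" \
    --action lambda:InvokeFunction \
    --principal events.amazonaws.com \
    --source-arn "$rule_arn" || return 1

  # Attach the Lambda as target with the cleanup payload as constant input.
  aws events put-targets --region eu-west-1 --rule "$rule_name" \
    --targets "$(cleanup_targets_json "$lambda_arn")"
}
```

`add-permission` fails if the statement ID already exists, so re-running `schedule_cleanup` needs either a new statement ID or the old permission removed first.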

Cost

| Component | Estimated monthly cost |
|---|---|
| Spot instances (t3.small, ~10 min/build, ~5 builds/day) | ~$1-2 |
| Lambda (< 1000 invocations/month) | ~$0 (free tier) |
| DynamoDB (< 1 GB, low RCU/WCU) | ~$0 (free tier) |
| CloudWatch logs | ~$0.50 |
| Total CI | ~$2-3/month |
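
The Spot line can be sanity-checked with a little arithmetic: 10 min/build × 5 builds/day × 30 days is 1,500 instance-minutes, or 25 instance-hours per month. The helper below is a sketch with a hypothetical name. It takes the hourly Spot price as an input, since that price varies and is not stated in this document.

```shell
#!/usr/bin/env bash
# Sketch only: monthly Spot compute cost from build volume.
# spot_monthly_cost is a hypothetical helper; the price is an input, not a quote.

# spot_monthly_cost MINUTES_PER_BUILD BUILDS_PER_DAY USD_PER_HOUR
spot_monthly_cost() {
  awk -v minutes="$1" -v builds="$2" -v price="$3" \
    'BEGIN { hours = minutes * builds * 30 / 60; printf "%.2f\n", hours * price }'
}
```

With a purely illustrative price of 0.01 USD/hour, `spot_monthly_cost 10 5 0.01` prints `0.25`. This covers compute only. EBS volumes and data transfer are billed separately and are not modelled here.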

Common operations

Rotate GITEA_TOKEN

  1. Generate new token in Gitea: Settings → Applications → Generate Token
  2. Update Lambda env: aws lambda update-function-configuration --function-name tinqs-ci-dispatch --environment ...
  3. The old token is baked into running instances; they terminate within 30 min

Rotate RUNNER_TOKEN

  1. Gitea admin → Actions → Runners → Create new registration token
  2. Update Lambda env var
  3. Running instances keep their existing registration until they die
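
Both rotations come down to changing one Lambda environment variable. `update-function-configuration --environment` replaces the whole variable set, so a helper that reads the current variables and merges in the new value avoids dropping the others. The sketch below assumes `jq` is installed and that the variables are named `GITEA_TOKEN` and `RUNNER_TOKEN`. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: change one Lambda env var while keeping the rest.
# merge_env_var and set_lambda_env_var are hypothetical helpers; requires jq.

# stdin: JSON object of current variables (or null). Prints {"Variables":{...}}.
merge_env_var() {
  jq -c --arg k "$1" --arg v "$2" '{Variables: ((. // {}) + {($k): $v})}'
}

# set_lambda_env_var FUNCTION_NAME KEY VALUE
set_lambda_env_var() {
  local fn="$1" key="$2" value="$3" current merged

  # Read the variables currently configured on the function.
  current=$(aws lambda get-function-configuration --region eu-west-1 \
    --function-name "$fn" \
    --query 'Environment.Variables' --output json) || return 1

  # Overwrite or add the one key, leaving all others untouched.
  merged=$(printf '%s' "$current" | merge_env_var "$key" "$value") || return 1

  aws lambda update-function-configuration --region eu-west-1 \
    --function-name "$fn" --environment "$merged" >/dev/null
}
```

A rotation would then be `set_lambda_env_var tinqs-ci-dispatch GITEA_TOKEN "$NEW_TOKEN"`. Reading the token from a variable rather than typing it inline keeps it out of shell history.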

Build a new AMI

# Launch from current AMI
aws ec2 run-instances --image-id ami-00a129385002e4de9 \
  --instance-type t3.small --key-name <your-key> \
  --region eu-west-1 --query 'Instances[0].InstanceId'

# SSH in, update tools
ssh ec2-user@<ip>
sudo yum update -y
# Install/update Go, Node, Docker, act_runner as needed

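# Create new AMI (prints the new image ID), then wait until it is available
aws ec2 create-image --instance-id <id> --name tinqs-ci-runner-v3 \
  --region eu-west-1 --query 'ImageId' --output text
aws ec2 wait image-available --image-ids ami-NEW --region eu-west-1

# Update Lambda
aws lambda update-function-configuration --function-name tinqs-ci-dispatch \
  --region eu-west-1 \
  --environment "Variables={...,RUNNER_AMI=ami-NEW,...}"

# Terminate build instance
aws ec2 terminate-instances --instance-ids <id> --region eu-west-1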

Add CI to a new repo

  1. Create .gitea/workflows/<name>.yml in the repo
  2. Add per-repo webhook in Gitea: Settings → Webhooks → Add Webhook
    • URL: the dispatcher's API Gateway URL (https://<api-gw>/dispatch)
    • Events: Push
    • Content type: application/json
  3. Push a change that matches the workflow trigger
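
Step 2 can also be scripted against the Gitea API instead of clicking through the UI. The sketch below assumes the API is served at https://tinqs.com/api/v1 and that the caller has a token with permission to manage the repo's hooks. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: create the per-repo push webhook through the Gitea API.
# push_webhook_payload and create_push_webhook are hypothetical helpers.

# Prints the JSON body for a push-only webhook that posts JSON to the given URL.
push_webhook_payload() {
  printf '{"type":"gitea","active":true,"events":["push"],"config":{"url":"%s","content_type":"json"}}\n' "$1"
}

# create_push_webhook OWNER REPO DISPATCH_URL API_TOKEN
create_push_webhook() {
  local owner="$1" repo="$2" dispatch_url="$3" token="$4"

  curl --fail --silent --show-error \
    -X POST "https://tinqs.com/api/v1/repos/${owner}/${repo}/hooks" \
    -H "Authorization: token ${token}" \
    -H "Content-Type: application/json" \
    -d "$(push_webhook_payload "$dispatch_url")"
}
```

A call would look like `create_push_webhook <owner> <repo> https://<api-gw>/dispatch "$GITEA_API_TOKEN"`. With `--fail`, curl exits non-zero when Gitea rejects the request.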