ci/DEVOPS.md
ozan 501953c636 tinqs/ci — composite actions + Lambda dispatcher for Spot CI runners
Actions: checkout, setup-go, setup-node, setup-aws
Dispatcher: Lambda → EC2 Spot (ephemeral, self-terminating)
Images: base, go, node, docker, deploy, godot

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-26 01:37:55 +01:00


# DevOps Reference
## AWS Resources (eu-west-1)
| Resource | Name/ID | Purpose |
|----------|---------|---------|
| Lambda | `tinqs-ci-dispatch` | Webhook handler + Spot launcher |
| DynamoDB | `tinqs-ci-runs` | Run tracking (repo, run_id, instance_id, status) |
| AMI | `tinqs-ci-runner-v2` (ami-00a129385002e4de9) | Pre-baked runner (Go, Node, Docker, act_runner) |
| Security Group | sg-030bf74b43d3faac7 | Runner SG (outbound HTTPS) |
| Subnet | subnet-04b5aeec9bfc4ec2c | Default VPC subnet |
| Instance Profile | tinqs-ci-runner | IAM role (S3, ECR, ECS, SSM) |
| CloudWatch | /aws/lambda/tinqs-ci-dispatch | Dispatcher logs |
| ECS Cluster | tinqs-git | Platform (Gitea) — NOT for CI runners |
| EFS | tinqs-git-repos (fs-03f3fb4859ceb12a3) | Gitea repo storage — NOT for CI |
## Deleted resources (26 May 2026)
| Resource | Why deleted |
|----------|-------------|
| Lambda `tinqs-ci-exec` | Never successfully ran a build. Deploy jobs go through Spot now. |
| CloudWatch `/aws/lambda/tinqs-ci-exec` | Log group for deleted Lambda |
| CloudWatch `/ecs/tinqs-runner` | From Fargate era, no longer used |
## Webhook flow
```
Gitea (tinqs.com)
  └─ per-repo webhook on push
       └─ POST https://<api-gw>/dispatch
            └─ Lambda tinqs-ci-dispatch
                 ├─ Fetch .gitea/workflows/*.yml via Gitea API
                 ├─ Evaluate triggers (branch + path filters)
                 ├─ For each matched workflow:
                 │    ├─ Read runs-on label
                 │    └─ RunInstances (Spot, ephemeral)
                 └─ Track in DynamoDB
```
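The "Evaluate triggers" step can be illustrated with a small sketch. This is not the dispatcher's code: the function names, the argument layout, and the use of bash glob matching are all assumptions made for illustration. In particular, `*` here also matches `/`, which is looser than typical workflow path-filter semantics.

```bash
# Returns 0 when $1 matches at least one of the remaining glob patterns.
matches_any() {
  local value="$1" pattern
  shift
  for pattern in "$@"; do
    # An unquoted right-hand side of == inside [[ ]] is matched as a glob.
    [[ "$value" == $pattern ]] && return 0
  done
  return 1
}

# Hypothetical signature:
#   workflow_matches <branch> "<branch filters>" "<path filters>" <changed files...>
# Filters are space-separated glob lists; an empty list means "no filter".
workflow_matches() {
  local branch="$1" file
  local -a branch_filters path_filters
  read -r -a branch_filters <<<"$2"
  read -r -a path_filters <<<"$3"
  shift 3

  # The branch filter must match when one is given.
  if (( ${#branch_filters[@]} > 0 )); then
    matches_any "$branch" "${branch_filters[@]}" || return 1
  fi

  # Without a path filter, any change triggers the workflow.
  if (( ${#path_filters[@]} == 0 )); then
    return 0
  fi

  # Otherwise at least one changed file must match a path filter.
  for file in "$@"; do
    matches_any "$file" "${path_filters[@]}" && return 0
  done
  return 1
}

workflow_matches main "main release/*" "src/* go.mod" README.md src/main.go \
  && echo "dispatch"
# prints "dispatch"
```

Only workflows for which this kind of check succeeds would go on to the `RunInstances` step.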
## Spot instance lifecycle
```
1. Lambda calls RunInstances (Spot, InstanceInitiatedShutdownBehavior=terminate)
2. User-data runs:
   a. Configure git auth (url.insteadOf with GITEA_TOKEN)
   b. act_runner register --ephemeral --labels <label>:host
   c. act_runner daemon (blocks until job completes)
   d. EXIT trap fires → shutdown -h now → instance terminates
3. DynamoDB record: running → completed (or timeout after 30 min cleanup)
```
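A minimal sketch of what the user-data steps could look like follows. The real script is not reproduced in this document, so the variable names (`GITEA_HOST`, `GITEA_TOKEN`, `RUNNER_TOKEN`, `RUNNER_LABEL`), the token-in-URL form, and the `--run` guard are assumptions. Because instances are launched with `InstanceInitiatedShutdownBehavior=terminate`, the `shutdown` in the trap terminates the instance rather than stopping it.

```bash
# Sketch of the runner bootstrap. All four variables are assumed to be
# injected by the dispatcher; their names are hypothetical.
runner_main() {
  set -euo pipefail

  # (d) Shut down however the script exits: success, failure, or early error.
  trap 'shutdown -h now' EXIT

  # (a) Rewrite clone URLs so git authenticates with the token.
  git config --global \
    url."https://${GITEA_TOKEN}@${GITEA_HOST}/".insteadOf "https://${GITEA_HOST}/"

  # (b) Register as an ephemeral runner with a host-mode label.
  act_runner register --no-interactive --ephemeral \
    --instance "https://${GITEA_HOST}" \
    --token "${RUNNER_TOKEN}" \
    --labels "${RUNNER_LABEL}:host"

  # (c) Blocks until the job completes.
  act_runner daemon
}

# Run only when asked to, so the file can be sourced and inspected safely.
if [[ "${1:-}" == "--run" ]]; then
  runner_main
fi
```

Setting the trap before anything else means a failed registration still ends in a shutdown, so a broken boot does not leave an idle instance waiting for the cleanup sweep.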
## Cleanup cron
The dispatcher Lambda also handles cleanup when invoked with an empty body or `{"action":"cleanup"}`. It should be triggered by an EventBridge schedule every 5 minutes.
- Scans DynamoDB for runs older than 30 min with status=running
- Terminates matching EC2 instances
- Sweeps for orphan instances (tagged tinqs-ci, running > 30 min)
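The selection rule in the first two bullets can be sketched as a filter over run records. This assumes each record carries a start time in epoch seconds, which is not among the attributes listed for `tinqs-ci-runs` above. The function name and the line-based input format are hypothetical.

```bash
# Reads "instance_id started_epoch status" lines on stdin and prints the
# instance IDs of runs still marked running after max_age seconds.
# Usage: stale_instances <now_epoch> [max_age_seconds]
stale_instances() {
  local now="$1" max_age="${2:-1800}"   # 1800 s = 30 min
  local instance_id started status
  while read -r instance_id started status; do
    if [[ "$status" == "running" ]] && (( now - started > max_age )); then
      printf '%s\n' "$instance_id"
    fi
  done
}

printf '%s\n' \
  'i-aaa 1000 running' \
  'i-bbb 2500 running' \
  'i-ccc 1000 completed' | stale_instances 3000
# prints i-aaa
```

The resulting IDs are what would be passed to `aws ec2 terminate-instances --instance-ids`.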
## Cost
| Component | Estimated monthly cost |
|-----------|----------------------|
| Spot instances (t3.small, ~10 min/build, ~5 builds/day) | ~$1-2 |
| Lambda (< 1000 invocations/month) | ~$0 (free tier) |
| DynamoDB (< 1 GB, low RCU/WCU) | ~$0 (free tier) |
| CloudWatch logs | ~$0.50 |
| **Total CI** | **~$2-3/month** |
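To sanity-check the Spot line, the usage figures in the table convert to instance-hours as follows. The helper name is hypothetical and the 30-day month is an assumption; no Spot price is assumed here.

```bash
# Instance-hours per month from build duration and build frequency.
# Integer arithmetic, so the result is truncated to whole hours.
monthly_instance_hours() {
  local minutes_per_build="$1" builds_per_day="$2" days="${3:-30}"
  echo $(( minutes_per_build * builds_per_day * days / 60 ))
}

monthly_instance_hours 10 5
# prints 25
```

Multiplying the result by the current t3.small Spot price in eu-west-1, plus EBS storage for the runner volume, gives the compute portion of the estimate.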
## Common operations
### Rotate GITEA_TOKEN
1. Generate new token in Gitea: Settings → Applications → Generate Token
2. Update the Lambda env: `aws lambda update-function-configuration --function-name tinqs-ci-dispatch --environment ...` (`--environment` replaces the whole variable set, so include every existing variable)
3. The old token is baked into running instances; they terminate within 30 min
### Rotate RUNNER_TOKEN
1. Gitea admin → Actions → Runners → Create new registration token
2. Update Lambda env var
3. Running instances keep their existing registration until they die
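Both rotations change a single Lambda environment variable, but `update-function-configuration --environment` replaces the entire variable map. One way to avoid dropping the other variables is to read the current map, merge the new value, and write it back. The helper below is a sketch that assumes `jq` is installed; the function name is hypothetical.

```bash
# Sets one environment variable on a Lambda function, preserving the rest.
# Usage: set_lambda_env_var <function-name> <key> <value>
set_lambda_env_var() {
  local function_name="$1" key="$2" value="$3"
  local current merged

  # Fetch the current variables as a JSON object (null when none are set).
  current="$(aws lambda get-function-configuration \
    --function-name "$function_name" \
    --query 'Environment.Variables' --output json)"

  # Merge in the new key; in jq, null + object yields the object.
  merged="$(jq -c --arg k "$key" --arg v "$value" '. + {($k): $v}' <<<"$current")"

  aws lambda update-function-configuration \
    --function-name "$function_name" \
    --environment "{\"Variables\":${merged}}"
}
```

For example, `set_lambda_env_var tinqs-ci-dispatch GITEA_TOKEN "$NEW_TOKEN"` updates the token and leaves `RUNNER_AMI` and the other variables untouched.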
### Build a new AMI
```bash
# Launch a build instance from the current AMI
aws ec2 run-instances --image-id ami-00a129385002e4de9 \
  --instance-type t3.small --key-name <your-key> \
  --region eu-west-1 --query 'Instances[0].InstanceId'

# SSH in and update tools
ssh ec2-user@<ip>
sudo yum update -y
# Install/update Go, Node, Docker, act_runner as needed

# Create the new AMI and wait until it is available
aws ec2 create-image --instance-id <id> --name tinqs-ci-runner-v3
aws ec2 wait image-available --image-ids ami-NEW

# Point the Lambda at the new AMI
aws lambda update-function-configuration --function-name tinqs-ci-dispatch \
  --environment "Variables={...,RUNNER_AMI=ami-NEW,...}"

# Terminate the build instance (this command takes --instance-ids, plural)
aws ec2 terminate-instances --instance-ids <id>
```
### Add CI to a new repo
1. Create `.gitea/workflows/<name>.yml` in the repo
2. Add per-repo webhook in Gitea: Settings → Webhooks → Add Webhook
   - URL: Lambda API Gateway URL
   - Events: Push
   - Content type: application/json
3. Push a change that matches the workflow trigger
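As a starting point for step 1, the snippet below writes a minimal workflow file. The file name `ci.yml`, the `main` branch, the `**.go` path filter, and the `go` runner label are assumptions for illustration; the label must match one the dispatcher knows how to launch.

```bash
# Create a minimal workflow with branch and path filters (values are examples).
mkdir -p .gitea/workflows
cat > .gitea/workflows/ci.yml <<'EOF'
name: ci
on:
  push:
    branches: [main]
    paths:
      - "**.go"
jobs:
  test:
    runs-on: go
    steps:
      # Check out the repository first, for example with the checkout
      # composite action from tinqs/ci (exact reference not shown here).
      - run: go test ./...
EOF
```

With this file in place, a push to `main` that touches a `.go` file satisfies both filters and is the kind of change step 3 calls for.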