# DevOps Reference

[← Home](README.md) · [Architecture](Architecture.md) · [Operations](Operations.md) · [Roadmap](Roadmap.md)

## AWS resources (eu-west-1)

| Resource | Name/ID | Purpose |
|----------|---------|---------|
| Lambda | `tinqs-ci-dispatch` | Webhook handler + Spot launcher |
| DynamoDB | `tinqs-ci-runs` | Run tracking (repo, run_id, instance_id, status) |
| AMI | `tinqs-ci-runner-v2` (`ami-00a129385002e4de9`) | Pre-baked runner (Go, Node, Docker, act_runner) |
| Security Group | `sg-030bf74b43d3faac7` | Runner SG (outbound HTTPS) |
| Subnet | `subnet-04b5aeec9bfc4ec2c` | Default VPC subnet |
| Instance Profile | `tinqs-ci-runner` → role `tinqs-git-task` | Runner IAM role (S3, ECR, SSM) |
| CloudWatch | `/aws/lambda/tinqs-ci-dispatch` | Dispatcher logs |
| API Gateway | `q4ohxovfr8…/webhook` | Receives the per-repo Gitea push webhook |
### Platform host (NOT CI — context)
| Resource | Name/ID | Purpose |
|----------|---------|---------|
| EC2 | `tinqs-prod-gitea` (`i-0d085288f467083e0`, t3.medium) | Runs tinqs.com as a single `docker` Gitea container |
| ALB | `tinqs-git` | Fronts the platform |
| ECR | `tinqs-git:latest` | Platform image (built by `build.yml` → CodeBuild) |
| RDS | `tinqs-prod` (PostgreSQL) | Platform DB |

The platform container mounts the host's `/data` directory and sets `GITEA_CUSTOM=/data/gitea`, so **custom templates live at `/data/gitea/templates/`**. Template-only changes are deployed to this host via SSM; see [Operations](Operations.md).
### Retired resources

| Resource | When / why |
|----------|------------|
| ECS Cluster `tinqs-git` | Deleted **2026-06-05**; platform moved to the `tinqs-prod-gitea` EC2 box |
| EFS `tinqs-git-repos` | Retired in the 2026-06-05 EC2 migration (repos now on instance `/data`) |
| Lambda `tinqs-ci-exec` | Deleted **2026-05-26**; never ran a build, and deploy jobs now go through Spot |
| CloudWatch `/aws/lambda/tinqs-ci-exec`, `/ecs/tinqs-runner` | Log groups for the deleted `tinqs-ci-exec` Lambda and the Fargate-era runner |
| Fargate runner service | Scaled to 0, then removed |
## Webhook flow

```
Gitea (tinqs.com)
  └─ per-repo webhook on push
       └─ POST https://<api-gw>/webhook
            └─ Lambda tinqs-ci-dispatch
                 ├─ Fetch .gitea/workflows/*.yml via Gitea API
                 ├─ Evaluate triggers (branch + path filters)
                 ├─ For each matched workflow:
                 │    ├─ Read runs-on label
                 │    └─ RunInstances (Spot, ephemeral)  [host → skipped]
                 └─ Track in DynamoDB
```
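
The "Evaluate triggers" step can be pictured with a minimal Python sketch. It is not the dispatcher's actual code: the function names `workflow_matches` and `ref_to_branch` are hypothetical, the `on.push` mapping is assumed to be already parsed from the workflow YAML, and `fnmatchcase` is a simplification of the glob rules Gitea Actions applies (for example, `*` here also matches `/`).

```python
from fnmatch import fnmatchcase


def ref_to_branch(ref):
    """Strip the refs/heads/ prefix from a push webhook's `ref` field."""
    prefix = "refs/heads/"
    return ref[len(prefix):] if ref.startswith(prefix) else ref


def workflow_matches(push_trigger, branch, changed_files):
    """Return True if a push to `branch` touching `changed_files` should run the workflow.

    `push_trigger` is the parsed `on.push` mapping of one workflow file, e.g.
    {"branches": ["main"], "paths": ["docs/**"]}. A missing filter matches everything.
    """
    push_trigger = push_trigger or {}
    branches = push_trigger.get("branches")
    paths = push_trigger.get("paths")

    # Branch filter: at least one pattern must match the pushed branch.
    if branches and not any(fnmatchcase(branch, pat) for pat in branches):
        return False

    # Path filter: at least one changed file must match at least one pattern.
    if paths and not any(fnmatchcase(f, pat) for f in changed_files for pat in paths):
        return False

    return True
```

Under these assumptions, `workflow_matches({"branches": ["main"], "paths": ["docs/**"]}, ref_to_branch("refs/heads/main"), ["docs/index.md"])` selects the workflow, while the same push touching only `src/app.go` does not.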
## Spot instance lifecycle

```
1. Lambda calls RunInstances (Spot, InstanceInitiatedShutdownBehavior=terminate)
2. User-data runs:
   a. Configure git auth (url.insteadOf with GITEA_TOKEN)
   b. act_runner register --ephemeral --labels <label>:host
   c. act_runner daemon (blocks until job completes)
   d. EXIT trap fires → shutdown -h now → instance terminates
3. DynamoDB record: running → completed (or timeout after 30 min cleanup)
```
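
Steps 1 and 2 might look roughly like the Python sketch below. Everything here is an assumption rather than the dispatcher's real code: the helper names `render_user_data` and `build_run_instances_params`, the `Name` tag key, the `export HOME=/root` line, and the `--no-interactive`, `--instance`, and `--token` registration flags are illustrative. The `t3.small` default comes from the cost table. The returned dict is shaped for boto3's `ec2.run_instances(**params)`.

```python
import shlex


def render_user_data(gitea_url, gitea_token, runner_token, label):
    """Render a user-data script covering steps 2a to 2d (hypothetical helper)."""
    host = gitea_url.split("://", 1)[1].rstrip("/")
    authed = f"https://{gitea_token}@{host}/"
    plain = f"https://{host}/"
    lines = [
        "#!/bin/bash",
        "export HOME=/root  # assumption: HOME may be unset when user-data runs",
        "# Step d: power off on any exit; with shutdown behavior 'terminate' this ends the instance.",
        "trap 'shutdown -h now' EXIT",
        "# Step a: rewrite Gitea URLs so git authenticates with the API token.",
        f"git config --global url.{shlex.quote(authed)}.insteadOf {shlex.quote(plain)}",
        "# Step b: one-shot (ephemeral) runner registration.",
        "act_runner register --no-interactive --ephemeral"
        f" --instance {shlex.quote(gitea_url)}"
        f" --token {shlex.quote(runner_token)}"
        f" --labels {shlex.quote(label + ':host')}",
        "# Step c: blocks until the single job completes.",
        "act_runner daemon",
    ]
    return "\n".join(lines) + "\n"


def build_run_instances_params(ami, subnet, security_group, instance_profile,
                               user_data, run_id, instance_type="t3.small"):
    """Build keyword arguments for a one-instance Spot launch (step 1)."""
    return {
        "ImageId": ami,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "SubnetId": subnet,
        "SecurityGroupIds": [security_group],
        "IamInstanceProfile": {"Name": instance_profile},
        "UserData": user_data,
        # One-time Spot request; an interrupted instance is terminated, not stopped.
        "InstanceMarketOptions": {
            "MarketType": "spot",
            "SpotOptions": {
                "SpotInstanceType": "one-time",
                "InstanceInterruptionBehavior": "terminate",
            },
        },
        # Makes `shutdown -h now` terminate the instance instead of stopping it.
        "InstanceInitiatedShutdownBehavior": "terminate",
        "TagSpecifications": [{
            "ResourceType": "instance",
            # Tag key and value are assumptions; the cleanup sweep looks for `tinqs-ci`.
            "Tags": [{"Key": "Name", "Value": "tinqs-ci"},
                     {"Key": "run_id", "Value": run_id}],
        }],
    }
```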
Offline runners listed in Gitea admin → Actions → Runners are **normal** — they're spent ephemeral registrations, not a fault.
## Cleanup cron

The dispatcher Lambda also handles cleanup when invoked with an empty body or `{"action":"cleanup"}`. EventBridge triggers it every 5 minutes. Each cleanup pass:

- Scans DynamoDB for runs older than 30 min with `status=running`
- Terminates the matching EC2 instances
- Sweeps for orphaned instances (tagged `tinqs-ci`, running for more than 30 min)
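
The cleanup routing and the stale-run scan could be sketched as follows. This is an illustration under assumptions, not the Lambda's real code: `is_cleanup_event` and `stale_instance_ids` are hypothetical names, and `started_at` is an assumed timestamp field that the table's documented attributes (repo, run_id, instance_id, status) do not list.

```python
import json
from datetime import datetime, timedelta, timezone

MAX_RUN_AGE = timedelta(minutes=30)


def is_cleanup_event(event):
    """True when the invocation should run cleanup instead of webhook dispatch."""
    if event.get("action") == "cleanup":  # payload passed directly to the Lambda
        return True
    body = event.get("body")
    if not body:  # empty body, e.g. a scheduled invocation
        return True
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        return False
    return isinstance(payload, dict) and payload.get("action") == "cleanup"


def stale_instance_ids(runs, now):
    """Return instance IDs of runs still marked running after MAX_RUN_AGE."""
    cutoff = now - MAX_RUN_AGE
    return [
        run["instance_id"]
        for run in runs
        if run["status"] == "running" and run["started_at"] < cutoff
    ]
```

The IDs returned by `stale_instance_ids` would then be passed to an EC2 terminate call; the orphan sweep applies the same 30-minute cutoff to instance launch times rather than DynamoDB records.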
## Lambda env vars

Configured in the AWS console, not in code:

| Var | Purpose |
|-----|---------|
| `GITEA_URL` | `https://tinqs.com` |
| `GITEA_TOKEN` | API token; used to fetch workflows and to provide runner git auth |
| `RUNNER_TOKEN` | act_runner registration token (Gitea admin → Runners) |
| `RUNNER_AMI` | Pre-baked AMI ID |
| `SUBNET` | VPC subnet for Spot instances |
| `SECURITY_GROUP` | SG allowing outbound HTTPS |
| `DDB_TABLE` | DynamoDB run-tracking table (`tinqs-ci-runs`) |
| `INSTANCE_PROFILE` | IAM instance profile for runners |
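
Because these values live only in the console, a handler could validate them once at cold start and fail loudly if one is missing. The sketch below is one way to do that; `DispatcherConfig` and `load_config` are hypothetical names, and only the variable names come from the table above.

```python
import os
from dataclasses import dataclass

REQUIRED_VARS = (
    "GITEA_URL", "GITEA_TOKEN", "RUNNER_TOKEN", "RUNNER_AMI",
    "SUBNET", "SECURITY_GROUP", "DDB_TABLE", "INSTANCE_PROFILE",
)


@dataclass(frozen=True)
class DispatcherConfig:
    gitea_url: str
    gitea_token: str
    runner_token: str
    runner_ami: str
    subnet: str
    security_group: str
    ddb_table: str
    instance_profile: str


def load_config(env=None):
    """Read all required variables, raising if any is unset or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing Lambda env vars: " + ", ".join(missing))
    # Field names are the lowercased variable names.
    return DispatcherConfig(**{name.lower(): env[name] for name in REQUIRED_VARS})
```

Passing a plain dict as `env` keeps the loader testable without touching the real Lambda environment.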
## Runner IAM role (`tinqs-git-task`)

Inline policies of note:

- `tinqs-ci-s3`: R/W on `tinqs-cli-releases`, `arikigame-com-website`, `docs.tinqs.com` *(corrected 2026-06-07: was the non-existent `arikigame.com`, which broke the arikigame deploy)*
- `tinqs-git-s3`: R/W on `tinqs-git-lfs`, `tinqs-git-preview`
- `tinqs-ci-deploy`: ECR push, CloudFront `CreateInvalidation`, (legacy ECS update)
- `tinqs-ci-ssm-deploy`: `ec2:DescribeInstances` + `ssm:SendCommand` **scoped to the `tinqs-prod-gitea` instance** (added 2026-06-07 for template deploys)
- `ssm-exec`: Session Manager channels
- `ec2-self-terminate`: terminate own `tinqs-ci`-tagged instance
## Cost

| Component | Estimated monthly cost |
|-----------|----------------------|
| Spot instances (t3.small, ~10 min/build, ~5 builds/day) | ~$1–2 |
| Lambda (< 1000 invocations/month) | ~$0 (free tier) |
| DynamoDB (< 1 GB, low RCU/WCU) | ~$0 (free tier) |
| CloudWatch logs | ~$0.50 |
| **Total CI** | **~$2–3/month** |