ozan 33f967e42e docs: convert ci docs to the in-repo wiki/ standard + fix stale ECS facts
Adopt the team wiki convention (in-repo wiki/ folder, plain markdown) used in
tinqs/studio. Convert DEVOPS.md + PLAN.md and the heavy parts of README.md
into cross-linked wiki pages: Home, Architecture, DevOps-Reference,
Operations, Roadmap. Root README slimmed to a repo intro pointing at wiki/.

Corrects stale topology while converting:
- ECS cluster tinqs-git / EFS tinqs-git-repos retired 2026-06-05; platform now
  the standalone EC2 box tinqs-prod-gitea (ALB tinqs-git, ECR image, RDS).
- Records this session's fixes: deploy-label dry-run route, runner-name
  collisions, arikigame IAM bucket, and template deploy repointed ECS→EC2/SSM.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-07 20:43:05 +01:00

# Roadmap
[← Home](README.md) · [Architecture](Architecture.md) · [DevOps Reference](DevOps-Reference.md) · [Operations](Operations.md)
## Done
- [x] Composite actions: `checkout`, `setup-go`, `setup-node`, `setup-aws`
- [x] Lambda dispatcher with Spot instance routing by `runs-on` label
- [x] Ephemeral runners (one job, self-terminate)
- [x] Git auth for private repos (`url.insteadOf`)
- [x] DynamoDB run tracking + cleanup cron
- [x] Runner image Dockerfiles: base, go, node, docker, deploy, godot
- [x] Zombie runner incident resolved (25 May 2026)
- [x] `deploy`-label jobs routed through Spot (previously a dead Lambda dry-run route) (07 Jun 2026)
- [x] Unique Spot runner names per dispatch (07 Jun 2026)
- [x] Template deploy repointed off deleted ECS → EC2 via SSM (07 Jun 2026)
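
One way the "unique Spot runner names per dispatch" item could be approached is sketched below. This is a minimal illustration, not the dispatcher's actual code: the function name `runner_name`, the `spot-` prefix, and the name layout are all assumptions made for the example.

```python
import re
import secrets


def runner_name(label: str, run_id: int, suffix_bytes: int = 4) -> str:
    """Build a runner name that differs on every dispatch (hypothetical layout)."""
    # Lower-case the label and replace anything outside [a-z0-9-] with a hyphen.
    safe_label = re.sub(r"[^a-z0-9-]", "-", label.lower()).strip("-")
    # A random hex suffix keeps names distinct even if the same run is dispatched twice.
    suffix = secrets.token_hex(suffix_bytes)
    return f"spot-{safe_label}-{run_id}-{suffix}"
```

Because the suffix is random rather than derived from the run ID alone, a retried or re-dispatched job would get a fresh name instead of colliding with a runner that is still registered.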
## Next
| Priority | Task | Impact |
|----------|------|--------|
| P1 | Pre-warm Go module + build cache in the AMI | 30s build time |
| P1 | Automate AMI build (Packer or script) | Repeatable, no manual SSH |
| P2 | Internal DNS for git clones | Faster than public HTTPS |
| P2 | CloudWatch agent on the runner AMI | Persistent logs after instance death |
| P3 | `tinqs/ci/deploy-s3` action | S3 sync + CloudFront invalidation wrapper |
| P3 | `tinqs/ci/deploy-ssm` action | Reusable SSM-to-prod deploy (generalise the template-deploy step) |
| P3 | `tinqs/ci/notify` action | Post build status to GChat |
## Watch / cleanup
- **Repo size**: `tinqs/studio` now commits the arikigame site assets (~75 MB) as regular files because the CI `checkout` action does not run `git lfs pull`. If the assets grow, first add `git lfs pull` to the checkout action, then LFS-track `web/arikigame/public/img/**`.
- **DEVOPS doc drift**: keep this wiki current whenever the AWS topology changes. The ECS→EC2 move went unrecorded in the docs for two days and broke deploys.
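
The repo-size item could be watched with a small check like the one below. It is a sketch under stated assumptions: the helper names and the idea of a size budget are hypothetical and not part of the existing CI. Only the asset path `web/arikigame/public/img` comes from the note above.

```python
from pathlib import Path


def tree_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular files under root (0 if root is missing)."""
    if not root.is_dir():
        return 0
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


def over_budget(root: Path, budget_mb: float) -> bool:
    """Return True when the tree is larger than budget_mb mebibytes."""
    return tree_size_bytes(root) > budget_mb * 1024 * 1024
```

A CI step could call `over_budget(Path("web/arikigame/public/img"), 100)` and fail or warn when it returns `True`. The 100 MB figure is only a placeholder for whatever threshold the team decides should trigger the move to LFS.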