feat(cloud): harden nginx-proxy + sql foundation; HTTP-01 interim cert plan

Wire up the three foundation stacks (nginx-proxy, sql, portainer) in
cloud/compose.yml and add real configs for the first two.

nginx-proxy
- Base nginx.conf with http + stream contexts, modern TLS profile,
  client_max_body_size baseline for gitea LFS / mlflow artifacts.
- Vhosts under conf.d/: grafana, gitea, keycloak, nodered, mlflow,
  jupyter, portainer (HTTPS upstream), rabbitmq, jenkins. WebSocket
  upgrade headers where needed (grafana live, node-red editor,
  jupyterhub kernels, jenkins agents).
- conf.d/00-default.conf serves /.well-known/acme-challenge/ on :80
  and 301-redirects everything else.
- stream.d/mqtt.conf terminates MQTT-TLS at 8883, proxies to
  rabbitmq:1883 internally.
- All vhosts reference /etc/letsencrypt/live/infra/* — a stable path
  via certbot --cert-name infra, so the wildcard migration changes
  nothing in the vhost files.
- README documents: HTTP-01 SAN interim during Versio period →
  DNS-01 wildcard via certbot-dns-transip after migration; bootstrap
  procedure (self-signed fallback → real cert issuance → reload).

sql
- config/init.d/01-databases.sh provisions gitea/keycloak/mlflow
  databases + roles on first start. The script runs only when the
  data volume is empty, so changing it after the first run requires
  manual psql or a volume wipe.
- compose env extended with GITEA_DB_PASSWORD, KEYCLOAK_DB_PASSWORD,
  MLFLOW_DB_PASSWORD.

cloud
- include: now wires nginx-proxy + sql + portainer. Other stacks
  stay commented for future rounds.
- .env.example adds KEYCLOAK_DB_PASSWORD and sensible defaults
  (LETSENCRYPT_EMAIL, GRAFANA_ROOT_URL, KEYCLOAK_HOSTNAME,
  GITEA_ROOT_URL, POSTFIX_FROM_DOMAIN all pointing at wbd-rd.nl).
- Operator note inline: bring portainer's standalone instance down
  before deploying via cloud compose; comment out its ports: block.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
znetsixe
2026-05-21 13:43:35 +02:00
parent 67b37b9b2a
commit 5d95f8bfcc
19 changed files with 426 additions and 41 deletions


@@ -1,47 +1,100 @@
# nginx-proxy
The single web ingress for cloud + edge. Reverse-proxies HTTPS UIs and stream-proxies MQTT-TLS to RabbitMQ. TLS certificates managed by a certbot sidecar (Let's Encrypt, HTTP-01 webroot challenge).
The single web ingress for cloud + edge. Reverse-proxies HTTPS UIs and stream-proxies MQTT-TLS to RabbitMQ. TLS certs managed by a certbot sidecar (Let's Encrypt).
- **Image**: stock `nginx:1.27-alpine` (we don't use `nginxproxy/nginx-proxy` because we need the `stream {}` context for MQTT-TLS, which that image doesn't expose cleanly)
- **Sidecar**: `certbot/certbot:latest` — renews every 12h, shared `nginx-certs` + `nginx-acme-challenge` volumes
- **Image**: stock `nginx:1.27-alpine` (we don't use `nginxproxy/nginx-proxy` because we need the `stream {}` context for MQTT-TLS)
- **Sidecar**: `certbot/certbot:latest` — renews every 12h via HTTP-01 webroot challenges
- **Networks**: `edge` (the only port-publisher) + `app` (talks to upstream services)
- **Host ports**: `tcp/80`, `tcp/443`, `tcp/8883`
## Cert strategy
**Interim (Versio DNS)**: HTTP-01 SAN cert covering all subdomains, issued via `--webroot`. Requires:
- Public DNS A records for each subdomain pointing at the cloud host
- `tcp/80` reachable from the internet
**After TransIP migration**: switch to DNS-01 wildcard (`*.wbd-rd.nl`). Swap the `certbot/certbot` image for a build that includes `certbot-dns-transip` and reissue with `--cert-name infra` so the cert path stays stable — **no vhost config changes needed**.
## Config layout
```
config/
├── nginx.conf # base config — must include `stream {}` directive
├── conf.d/ # HTTP vhosts (one per upstream UI)
│ ├── grafana.conf
│ ├── node-red.conf
│ ├── gitea.conf
│ └── ...
├── nginx.conf # base — http + stream contexts
├── conf.d/
│ ├── 00-default.conf # port 80: ACME challenge + HTTPS redirect
│ ├── grafana.conf # grafana.wbd-rd.nl
│ ├── gitea.conf # gitea.wbd-rd.nl
│ ├── keycloak.conf # keycloak.wbd-rd.nl
│ ├── nodered.conf # nodered.wbd-rd.nl
│ ├── mlflow.conf # mlflow.wbd-rd.nl
│ ├── jupyter.conf # jupyter.wbd-rd.nl
│ ├── portainer.conf # portainer.wbd-rd.nl (HTTPS upstream)
│ ├── rabbitmq.conf # rabbitmq.wbd-rd.nl (mgmt UI)
│ └── jenkins.conf # jenkins.wbd-rd.nl
└── stream.d/
    └── mqtt.conf # MQTT-TLS stream block, SNI route to rabbitmq:1883
    └── mqtt.conf # mqtt.wbd-rd.nl:8883 → rabbitmq:1883
```
Volumes:
- `nginx-certs` — Let's Encrypt cert chains (`/etc/letsencrypt`), read-only mounted into nginx, writable from certbot
- `nginx-acme-challenge` — webroot for HTTP-01 challenges (`/var/www/certbot`)
- `nginx-certs` — Let's Encrypt cert chains at `/etc/letsencrypt/`; read-only into nginx, writable from certbot
- `nginx-acme-challenge` — webroot for HTTP-01 challenges at `/var/www/certbot/`
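
For orientation, `conf.d/00-default.conf` is described as serving the ACME challenge on port 80 and redirecting everything else to HTTPS. The sketch below writes one plausible version of such a file from a shell heredoc. It is an assumption about the shape of the file, not a copy of the committed config, and the `./conf.d-sketch` output directory is a hypothetical name chosen so nothing real is overwritten.

```shell
#!/bin/sh
# Sketch only: the committed 00-default.conf may differ from this.
out_dir="./conf.d-sketch"   # hypothetical scratch directory
mkdir -p "$out_dir"

# Quoted 'EOF' keeps nginx variables such as $host from being expanded by the shell.
cat > "$out_dir/00-default.conf" <<'EOF'
server {
    listen 80 default_server;
    server_name _;

    # certbot --webroot places challenge files under this root
    location /.well-known/acme-challenge/ {
        root /var/www/certbot;
    }

    # everything else is redirected to HTTPS
    location / {
        return 301 https://$host$request_uri;
    }
}
EOF

echo "wrote $out_dir/00-default.conf"
```

The `root` path matches the `nginx-acme-challenge` mount point listed above, which is what lets the certbot sidecar and nginx agree on where challenge files live.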
## Initial cert issuance
All vhosts reference `/etc/letsencrypt/live/infra/fullchain.pem` and `privkey.pem` — a stable path independent of the issuance method.
1. Start with HTTP-only nginx config (serving `/.well-known/acme-challenge/`).
2. Issue:
```bash
docker compose run --rm certbot certonly \
--webroot -w /var/www/certbot \
--email "$LETSENCRYPT_EMAIL" --agree-tos --no-eff-email \
-d gitea.example.com -d grafana.example.com -d nodered.example.com
```
3. Drop HTTPS vhost configs into `config/conf.d/` and reload nginx.
## First-run bootstrap
The sidecar then renews automatically.
The HTTPS server blocks won't load without a cert at `/etc/letsencrypt/live/infra/`. Bootstrap procedure (one-time):
```bash
cd stacks/nginx-proxy
# 1. Self-signed fallback so nginx starts and serves /.well-known/acme-challenge/
docker compose run --rm --entrypoint=/bin/sh nginx -c \
"mkdir -p /etc/letsencrypt/live/infra && \
openssl req -x509 -nodes -days 1 -newkey rsa:2048 \
-keyout /etc/letsencrypt/live/infra/privkey.pem \
-out /etc/letsencrypt/live/infra/fullchain.pem \
-subj '/CN=bootstrap-infra'"
# 2. Start nginx (HTTPS blocks load with the dummy cert)
docker compose up -d nginx
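# 3. Remove the dummy cert first (certbot refuses to create the `infra` lineage
#    while live/infra already holds files), then issue the real cert via HTTP-01
docker compose run --rm --entrypoint=/bin/sh certbot -c \
"rm -rf /etc/letsencrypt/live/infra"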
docker compose run --rm certbot certonly \
--webroot -w /var/www/certbot \
--email "$LETSENCRYPT_EMAIL" --agree-tos --no-eff-email \
--cert-name infra \
-d grafana.wbd-rd.nl -d gitea.wbd-rd.nl -d keycloak.wbd-rd.nl \
-d nodered.wbd-rd.nl -d mlflow.wbd-rd.nl -d jupyter.wbd-rd.nl \
-d portainer.wbd-rd.nl -d rabbitmq.wbd-rd.nl -d jenkins.wbd-rd.nl \
-d mqtt.wbd-rd.nl
# 4. Reload nginx to pick up the real cert
docker compose exec nginx nginx -s reload
```
The certbot sidecar then renews every 12h automatically.
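
The `-d` list in step 3 has to stay in sync with the vhosts and DNS records. One way to reduce drift is to generate the arguments from a single list. The sketch below does that; the `domain_args` helper and the variable names are hypothetical and not part of this repository.

```shell
#!/bin/sh
# Sketch: build certbot's -d arguments from one list of subdomains.
BASE_DOMAIN="wbd-rd.nl"
SUBDOMAINS="grafana gitea keycloak nodered mlflow jupyter portainer rabbitmq jenkins mqtt"

# Prints "-d <sub>.<base> " for every entry in SUBDOMAINS.
domain_args() {
  for sub in $SUBDOMAINS; do
    printf '%s %s.%s ' "-d" "$sub" "$BASE_DOMAIN"
  done
}

# Show the arguments so they can be reviewed before running certbot.
domain_args
echo
```

The output can then be spliced into the issuance command, for example `docker compose run --rm certbot certonly --webroot -w /var/www/certbot --cert-name infra $(domain_args)` plus the email and terms flags shown in step 3.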
## DNS prereqs (HTTP-01)
Before bootstrap, ensure A records exist in Versio for:
```
grafana.wbd-rd.nl A <cloud-public-ip>
gitea.wbd-rd.nl A <cloud-public-ip>
keycloak.wbd-rd.nl A <cloud-public-ip>
nodered.wbd-rd.nl A <cloud-public-ip>
mlflow.wbd-rd.nl A <cloud-public-ip>
jupyter.wbd-rd.nl A <cloud-public-ip>
portainer.wbd-rd.nl A <cloud-public-ip>
rabbitmq.wbd-rd.nl A <cloud-public-ip>
jenkins.wbd-rd.nl A <cloud-public-ip>
mqtt.wbd-rd.nl A <cloud-public-ip>
```
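
Because HTTP-01 validation fails for the whole SAN cert if any one name does not resolve to the cloud host, it can help to check the records before bootstrap. The following is a minimal sketch that assumes `dig` is installed; `has_ip` and `check_all` are hypothetical helper names, and the default `EXPECTED_IP` is a documentation placeholder that must be replaced with the real public IP.

```shell
#!/bin/sh
# Sketch: confirm every subdomain has an A record pointing at the cloud host.
EXPECTED_IP="${EXPECTED_IP:-203.0.113.10}"   # placeholder address, not the real host
HOSTS="grafana gitea keycloak nodered mlflow jupyter portainer rabbitmq jenkins mqtt"

# Succeeds when $1 appears as a whole line in the newline-separated answers in $2.
has_ip() {
  printf '%s\n' "$2" | grep -Fxq "$1"
}

# Resolves each name with dig and reports mismatches; returns non-zero if any fail.
check_all() {
  failed=0
  for h in $HOSTS; do
    answers=$(dig +short A "$h.wbd-rd.nl")
    if has_ip "$EXPECTED_IP" "$answers"; then
      echo "ok   $h.wbd-rd.nl"
    else
      echo "FAIL $h.wbd-rd.nl -> ${answers:-no answer}"
      failed=1
    fi
  done
  return "$failed"
}
```

After sourcing the snippet, run `EXPECTED_IP=<cloud-public-ip> check_all` and proceed with the bootstrap only once every line reports `ok`.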
## TODO
- Write base `config/nginx.conf` (`http` + `stream` contexts)
- Per-upstream vhost templates with OIDC `auth_request` to Keycloak
- Decide internal PKI vs Let's Encrypt for cloud-internal hostnames not reachable from the public internet
- Edge-side variant: bind to plant-LAN IP only, internal CA for plant.local hostnames
- Wildcard cert via `certbot-dns-transip` (post TransIP migration)
- OIDC `auth_request` to Keycloak in front of services without native OIDC (mlflow, portainer-CE)
- Edge-side variant: bind to plant-LAN IP, internal CA for `*.local` hostnames
- HSTS + security headers (`add_header Strict-Transport-Security ...`)