Files
infra/docs/architecture.md

174 lines
8.5 KiB
Markdown
Raw Normal View History

# Architecture
R&D infrastructure for Waterschap Brabantse Delta. Hub-and-spoke topology:
- **Cloud** layer — central services, one deployment, internet-facing.
- **Edge** layer — per-plant, plant-LAN-facing, tunneled to cloud via WireGuard.
- **OT** layer — per-plant, behind edge. Managed **outside** this repo.
```
Internet
┌───────────┴───────────┐
│ tcp/80, 443, 8883 │
│ udp/51820 │
▼ │
feat: SQL=postgres, nginx+certbot, MQTT split, ML stacks, gitea HTTPS-only, gemaal1 site Round-2 changes locking in scaffold-phase decisions and adding ML/notebook stacks. Locked decisions - sql: postgres 16-alpine (was TBD); init.d/ mount for per-app DB provisioning - nginx-proxy: stock nginx + certbot sidecar (was nginx:alpine TODO). Chose stock over nginxproxy/nginx-proxy because stream{} is required for MQTT-TLS reverse-proxy on tcp/8883 to rabbitmq:1883. - gitea: HTTPS-only (DISABLE_SSH=true). No SSH port published. MQTT split - Remove stacks/mqtt placeholder. - Add stacks/rabbitmq — general-purpose broker (AMQP + MQTT plugin), used at both cloud and edge. External MQTT clients reach cloud broker via nginx stream-proxy on 8883. - Add stacks/mosquitto — reserved for the FROST (SensorThings) stack only. Cloud-only. Internal to its own stack; no external ingress. ML / notebooks (cloud-only) - stacks/mlflow — experiment tracking + model registry. Postgres backend on sql stack; local volume for artifacts (S3/MinIO is a TODO). - stacks/jupyterhub — multi-user notebook server. DockerSpawner via mounted docker.sock; users spawn into cloud-app network so they can reach mlflow, influxdb (via grafana), rabbitmq. Sites - sites/gemaal1 — first edge deployment scaffold. Site-local override template for binding nginx to PLANT_LAN_IP. Docs - README + docs/architecture.md updated: stacks table now lists 15 stacks, ingress + attachment tables reflect mlflow/jupyterhub, TLS strategy section locked, MQTT-split section added, Gitea HTTPS-only noted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 13:22:46 +02:00
┌────────────────────────────────────┐
│ Cloud (central, one) │
│ nginx + certbot ◀── 80/443/8883 │
│ wireguard-server ◀── 51820/udp │
│ gitea, jenkins, keycloak, ... │
│ influxdb, grafana, node-red │
│ rabbitmq, postfix, portainer │
│ sql (postgres, single config) │
│ mlflow, jupyterhub │
feat(cloud): single-shot deploy.sh + FROST stack + healthchecks Stage 5 — make the cloud composition spin up in one command and add the SensorThings (FROST) stack as a fully segregated tenant. cloud/deploy.sh — idempotent, 7-step bring-up: preflight → validate → up + wait → cert state → issue/renew → service status → endpoint smoke test. Reissues LE cert only when current issuer no longer matches ACME_CA_URI. Move-aside-then- restore-on-failure so the bootstrap cert survives a failed certbot. stacks/frost — new stack, segregated from shared sql/rabbitmq: - dedicated postgis container (frost-db) - dedicated internal mosquitto bus (frost-mosquitto) - frost-http + frost-mqtt on a private frost-internal network, joined to cloud-app only for nginx ingress at frost.wbd-rd.nl - shared mosquitto stack deleted; rabbitmq remains the only public MQTT broker (mqtt.wbd-rd.nl:8883 via stream proxy) stacks/sql — pg_isready healthcheck so keycloak/gitea/mlflow can gate on service_healthy via cloud-level depends_on overrides. stacks/nginx-proxy: - nginx-init service generates a self-signed bootstrap cert on fresh deploy so nginx starts before certbot has issued a real one - frost.wbd-rd.nl vhost (/FROST-Server → frost-http:8080, /mqtt → frost-mqtt:9876 WebSocket) stacks/mlflow — custom Dockerfile (upstream + psycopg2-binary) so the official image can speak to the shared sql backend. stacks/jupyterhub — DummyAuthenticator stub gated by JUPYTERHUB_ADMIN_PASSWORD; TODO comments point at OIDC + DockerSpawner. stacks/rabbitmq — config/{enabled_plugins,rabbitmq.conf} stubs (management + mqtt plugins, MQTT auth required). stacks/portainer — ports unpublished; nginx now the only ingress. stacks/node-red — pin to 4.1 (the floating "4" tag does not exist). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 16:37:58 +02:00
│ frost (SensorThings API) │
feat: SQL=postgres, nginx+certbot, MQTT split, ML stacks, gitea HTTPS-only, gemaal1 site Round-2 changes locking in scaffold-phase decisions and adding ML/notebook stacks. Locked decisions - sql: postgres 16-alpine (was TBD); init.d/ mount for per-app DB provisioning - nginx-proxy: stock nginx + certbot sidecar (was nginx:alpine TODO). Chose stock over nginxproxy/nginx-proxy because stream{} is required for MQTT-TLS reverse-proxy on tcp/8883 to rabbitmq:1883. - gitea: HTTPS-only (DISABLE_SSH=true). No SSH port published. MQTT split - Remove stacks/mqtt placeholder. - Add stacks/rabbitmq — general-purpose broker (AMQP + MQTT plugin), used at both cloud and edge. External MQTT clients reach cloud broker via nginx stream-proxy on 8883. - Add stacks/mosquitto — reserved for the FROST (SensorThings) stack only. Cloud-only. Internal to its own stack; no external ingress. ML / notebooks (cloud-only) - stacks/mlflow — experiment tracking + model registry. Postgres backend on sql stack; local volume for artifacts (S3/MinIO is a TODO). - stacks/jupyterhub — multi-user notebook server. DockerSpawner via mounted docker.sock; users spawn into cloud-app network so they can reach mlflow, influxdb (via grafana), rabbitmq. Sites - sites/gemaal1 — first edge deployment scaffold. Site-local override template for binding nginx to PLANT_LAN_IP. Docs - README + docs/architecture.md updated: stacks table now lists 15 stacks, ingress + attachment tables reflect mlflow/jupyterhub, TLS strategy section locked, MQTT-split section added, Gitea HTTPS-only noted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 13:22:46 +02:00
└───────────────┬────────────────────┘
│ WireGuard tunnels
┌───────┼────────┬───────────┐
▼ ▼ ▼ ▼
┌───────┐ ┌───────┐ ┌───────┐ ┌───────┐
│ Edge: │ │ Edge: │ │ Edge: │ │ ... │
feat: SQL=postgres, nginx+certbot, MQTT split, ML stacks, gitea HTTPS-only, gemaal1 site Round-2 changes locking in scaffold-phase decisions and adding ML/notebook stacks. Locked decisions - sql: postgres 16-alpine (was TBD); init.d/ mount for per-app DB provisioning - nginx-proxy: stock nginx + certbot sidecar (was nginx:alpine TODO). Chose stock over nginxproxy/nginx-proxy because stream{} is required for MQTT-TLS reverse-proxy on tcp/8883 to rabbitmq:1883. - gitea: HTTPS-only (DISABLE_SSH=true). No SSH port published. MQTT split - Remove stacks/mqtt placeholder. - Add stacks/rabbitmq — general-purpose broker (AMQP + MQTT plugin), used at both cloud and edge. External MQTT clients reach cloud broker via nginx stream-proxy on 8883. - Add stacks/mosquitto — reserved for the FROST (SensorThings) stack only. Cloud-only. Internal to its own stack; no external ingress. ML / notebooks (cloud-only) - stacks/mlflow — experiment tracking + model registry. Postgres backend on sql stack; local volume for artifacts (S3/MinIO is a TODO). - stacks/jupyterhub — multi-user notebook server. DockerSpawner via mounted docker.sock; users spawn into cloud-app network so they can reach mlflow, influxdb (via grafana), rabbitmq. Sites - sites/gemaal1 — first edge deployment scaffold. Site-local override template for binding nginx to PLANT_LAN_IP. Docs - README + docs/architecture.md updated: stacks table now lists 15 stacks, ingress + attachment tables reflect mlflow/jupyterhub, TLS strategy section locked, MQTT-split section added, Gitea HTTPS-only noted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 13:22:46 +02:00
│gemaal1│ │ ... │ │ ... │ │ │
└───┬───┘ └───────┘ └───────┘ └───────┘
│ TLS
┌──────────┐
│ OT │ ← out of scope of this repo
│ OPCUA │
│ PLC │
└──────────┘
```
## Network topology (per layer)
Each layer uses **four internal Docker networks**:
| Network | Purpose | Notes |
|---|---|---|
| `edge` | Outermost. Cloud: internet-facing. Edge: plant-LAN-facing. | Only port-publishers join. |
feat: SQL=postgres, nginx+certbot, MQTT split, ML stacks, gitea HTTPS-only, gemaal1 site Round-2 changes locking in scaffold-phase decisions and adding ML/notebook stacks. Locked decisions - sql: postgres 16-alpine (was TBD); init.d/ mount for per-app DB provisioning - nginx-proxy: stock nginx + certbot sidecar (was nginx:alpine TODO). Chose stock over nginxproxy/nginx-proxy because stream{} is required for MQTT-TLS reverse-proxy on tcp/8883 to rabbitmq:1883. - gitea: HTTPS-only (DISABLE_SSH=true). No SSH port published. MQTT split - Remove stacks/mqtt placeholder. - Add stacks/rabbitmq — general-purpose broker (AMQP + MQTT plugin), used at both cloud and edge. External MQTT clients reach cloud broker via nginx stream-proxy on 8883. - Add stacks/mosquitto — reserved for the FROST (SensorThings) stack only. Cloud-only. Internal to its own stack; no external ingress. ML / notebooks (cloud-only) - stacks/mlflow — experiment tracking + model registry. Postgres backend on sql stack; local volume for artifacts (S3/MinIO is a TODO). - stacks/jupyterhub — multi-user notebook server. DockerSpawner via mounted docker.sock; users spawn into cloud-app network so they can reach mlflow, influxdb (via grafana), rabbitmq. Sites - sites/gemaal1 — first edge deployment scaffold. Site-local override template for binding nginx to PLANT_LAN_IP. Docs - README + docs/architecture.md updated: stacks table now lists 15 stacks, ingress + attachment tables reflect mlflow/jupyterhub, TLS strategy section locked, MQTT-split section added, Gitea HTTPS-only noted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 13:22:46 +02:00
| `app` | Application / automation tier. | Default landing for app services. |
| `data` | Databases (influxdb, sql). | `internal: true` — no internet egress. |
| `mgmt` | Identity, control plane (portainer, keycloak admin, wireguard mgmt). | Restricted. |
### Cloud attachments
```
feat: SQL=postgres, nginx+certbot, MQTT split, ML stacks, gitea HTTPS-only, gemaal1 site Round-2 changes locking in scaffold-phase decisions and adding ML/notebook stacks. Locked decisions - sql: postgres 16-alpine (was TBD); init.d/ mount for per-app DB provisioning - nginx-proxy: stock nginx + certbot sidecar (was nginx:alpine TODO). Chose stock over nginxproxy/nginx-proxy because stream{} is required for MQTT-TLS reverse-proxy on tcp/8883 to rabbitmq:1883. - gitea: HTTPS-only (DISABLE_SSH=true). No SSH port published. MQTT split - Remove stacks/mqtt placeholder. - Add stacks/rabbitmq — general-purpose broker (AMQP + MQTT plugin), used at both cloud and edge. External MQTT clients reach cloud broker via nginx stream-proxy on 8883. - Add stacks/mosquitto — reserved for the FROST (SensorThings) stack only. Cloud-only. Internal to its own stack; no external ingress. ML / notebooks (cloud-only) - stacks/mlflow — experiment tracking + model registry. Postgres backend on sql stack; local volume for artifacts (S3/MinIO is a TODO). - stacks/jupyterhub — multi-user notebook server. DockerSpawner via mounted docker.sock; users spawn into cloud-app network so they can reach mlflow, influxdb (via grafana), rabbitmq. Sites - sites/gemaal1 — first edge deployment scaffold. Site-local override template for binding nginx to PLANT_LAN_IP. Docs - README + docs/architecture.md updated: stacks table now lists 15 stacks, ingress + attachment tables reflect mlflow/jupyterhub, TLS strategy section locked, MQTT-split section added, Gitea HTTPS-only noted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 13:22:46 +02:00
edge : nginx, wireguard-server
app : nginx, rabbitmq, postfix, node-red, grafana,
feat(cloud): single-shot deploy.sh + FROST stack + healthchecks Stage 5 — make the cloud composition spin up in one command and add the SensorThings (FROST) stack as a fully segregated tenant. cloud/deploy.sh — idempotent, 7-step bring-up: preflight → validate → up + wait → cert state → issue/renew → service status → endpoint smoke test. Reissues LE cert only when current issuer no longer matches ACME_CA_URI. Move-aside-then- restore-on-failure so the bootstrap cert survives a failed certbot. stacks/frost — new stack, segregated from shared sql/rabbitmq: - dedicated postgis container (frost-db) - dedicated internal mosquitto bus (frost-mosquitto) - frost-http + frost-mqtt on a private frost-internal network, joined to cloud-app only for nginx ingress at frost.wbd-rd.nl - shared mosquitto stack deleted; rabbitmq remains the only public MQTT broker (mqtt.wbd-rd.nl:8883 via stream proxy) stacks/sql — pg_isready healthcheck so keycloak/gitea/mlflow can gate on service_healthy via cloud-level depends_on overrides. stacks/nginx-proxy: - nginx-init service generates a self-signed bootstrap cert on fresh deploy so nginx starts before certbot has issued a real one - frost.wbd-rd.nl vhost (/FROST-Server → frost-http:8080, /mqtt → frost-mqtt:9876 WebSocket) stacks/mlflow — custom Dockerfile (upstream + psycopg2-binary) so the official image can speak to the shared sql backend. stacks/jupyterhub — DummyAuthenticator stub gated by JUPYTERHUB_ADMIN_PASSWORD; TODO comments point at OIDC + DockerSpawner. stacks/rabbitmq — config/{enabled_plugins,rabbitmq.conf} stubs (management + mqtt plugins, MQTT auth required). stacks/portainer — ports unpublished; nginx now the only ingress. stacks/node-red — pin to 4.1 (the floating "4" tag does not exist). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 16:37:58 +02:00
jenkins, gitea, keycloak, mlflow, jupyterhub,
portainer, frost-http, frost-mqtt
feat: SQL=postgres, nginx+certbot, MQTT split, ML stacks, gitea HTTPS-only, gemaal1 site Round-2 changes locking in scaffold-phase decisions and adding ML/notebook stacks. Locked decisions - sql: postgres 16-alpine (was TBD); init.d/ mount for per-app DB provisioning - nginx-proxy: stock nginx + certbot sidecar (was nginx:alpine TODO). Chose stock over nginxproxy/nginx-proxy because stream{} is required for MQTT-TLS reverse-proxy on tcp/8883 to rabbitmq:1883. - gitea: HTTPS-only (DISABLE_SSH=true). No SSH port published. MQTT split - Remove stacks/mqtt placeholder. - Add stacks/rabbitmq — general-purpose broker (AMQP + MQTT plugin), used at both cloud and edge. External MQTT clients reach cloud broker via nginx stream-proxy on 8883. - Add stacks/mosquitto — reserved for the FROST (SensorThings) stack only. Cloud-only. Internal to its own stack; no external ingress. ML / notebooks (cloud-only) - stacks/mlflow — experiment tracking + model registry. Postgres backend on sql stack; local volume for artifacts (S3/MinIO is a TODO). - stacks/jupyterhub — multi-user notebook server. DockerSpawner via mounted docker.sock; users spawn into cloud-app network so they can reach mlflow, influxdb (via grafana), rabbitmq. Sites - sites/gemaal1 — first edge deployment scaffold. Site-local override template for binding nginx to PLANT_LAN_IP. Docs - README + docs/architecture.md updated: stacks table now lists 15 stacks, ingress + attachment tables reflect mlflow/jupyterhub, TLS strategy section locked, MQTT-split section added, Gitea HTTPS-only noted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 13:22:46 +02:00
data : influxdb, sql, grafana, mlflow
mgmt : portainer, keycloak, wireguard-server, jupyterhub
feat(cloud): single-shot deploy.sh + FROST stack + healthchecks Stage 5 — make the cloud composition spin up in one command and add the SensorThings (FROST) stack as a fully segregated tenant. cloud/deploy.sh — idempotent, 7-step bring-up: preflight → validate → up + wait → cert state → issue/renew → service status → endpoint smoke test. Reissues LE cert only when current issuer no longer matches ACME_CA_URI. Move-aside-then- restore-on-failure so the bootstrap cert survives a failed certbot. stacks/frost — new stack, segregated from shared sql/rabbitmq: - dedicated postgis container (frost-db) - dedicated internal mosquitto bus (frost-mosquitto) - frost-http + frost-mqtt on a private frost-internal network, joined to cloud-app only for nginx ingress at frost.wbd-rd.nl - shared mosquitto stack deleted; rabbitmq remains the only public MQTT broker (mqtt.wbd-rd.nl:8883 via stream proxy) stacks/sql — pg_isready healthcheck so keycloak/gitea/mlflow can gate on service_healthy via cloud-level depends_on overrides. stacks/nginx-proxy: - nginx-init service generates a self-signed bootstrap cert on fresh deploy so nginx starts before certbot has issued a real one - frost.wbd-rd.nl vhost (/FROST-Server → frost-http:8080, /mqtt → frost-mqtt:9876 WebSocket) stacks/mlflow — custom Dockerfile (upstream + psycopg2-binary) so the official image can speak to the shared sql backend. stacks/jupyterhub — DummyAuthenticator stub gated by JUPYTERHUB_ADMIN_PASSWORD; TODO comments point at OIDC + DockerSpawner. stacks/rabbitmq — config/{enabled_plugins,rabbitmq.conf} stubs (management + mqtt plugins, MQTT auth required). stacks/portainer — ports unpublished; nginx now the only ingress. stacks/node-red — pin to 4.1 (the floating "4" tag does not exist). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 16:37:58 +02:00
frost-internal (private to frost stack) :
frost-db (postgis), frost-mosquitto, frost-http, frost-mqtt
```
### Edge attachments
```
feat: SQL=postgres, nginx+certbot, MQTT split, ML stacks, gitea HTTPS-only, gemaal1 site Round-2 changes locking in scaffold-phase decisions and adding ML/notebook stacks. Locked decisions - sql: postgres 16-alpine (was TBD); init.d/ mount for per-app DB provisioning - nginx-proxy: stock nginx + certbot sidecar (was nginx:alpine TODO). Chose stock over nginxproxy/nginx-proxy because stream{} is required for MQTT-TLS reverse-proxy on tcp/8883 to rabbitmq:1883. - gitea: HTTPS-only (DISABLE_SSH=true). No SSH port published. MQTT split - Remove stacks/mqtt placeholder. - Add stacks/rabbitmq — general-purpose broker (AMQP + MQTT plugin), used at both cloud and edge. External MQTT clients reach cloud broker via nginx stream-proxy on 8883. - Add stacks/mosquitto — reserved for the FROST (SensorThings) stack only. Cloud-only. Internal to its own stack; no external ingress. ML / notebooks (cloud-only) - stacks/mlflow — experiment tracking + model registry. Postgres backend on sql stack; local volume for artifacts (S3/MinIO is a TODO). - stacks/jupyterhub — multi-user notebook server. DockerSpawner via mounted docker.sock; users spawn into cloud-app network so they can reach mlflow, influxdb (via grafana), rabbitmq. Sites - sites/gemaal1 — first edge deployment scaffold. Site-local override template for binding nginx to PLANT_LAN_IP. Docs - README + docs/architecture.md updated: stacks table now lists 15 stacks, ingress + attachment tables reflect mlflow/jupyterhub, TLS strategy section locked, MQTT-split section added, Gitea HTTPS-only noted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 13:22:46 +02:00
edge : nginx ← plant-LAN-facing
app : nginx, rabbitmq, postfix, node-red,
grafana, keycloak, wireguard-client
data : influxdb, grafana
mgmt : portainer, keycloak, wireguard-client
```
## Ingress (the only ports facing outside)
### Cloud (the central host)
| Port | Container | Notes |
|---|---|---|
feat: SQL=postgres, nginx+certbot, MQTT split, ML stacks, gitea HTTPS-only, gemaal1 site Round-2 changes locking in scaffold-phase decisions and adding ML/notebook stacks. Locked decisions - sql: postgres 16-alpine (was TBD); init.d/ mount for per-app DB provisioning - nginx-proxy: stock nginx + certbot sidecar (was nginx:alpine TODO). Chose stock over nginxproxy/nginx-proxy because stream{} is required for MQTT-TLS reverse-proxy on tcp/8883 to rabbitmq:1883. - gitea: HTTPS-only (DISABLE_SSH=true). No SSH port published. MQTT split - Remove stacks/mqtt placeholder. - Add stacks/rabbitmq — general-purpose broker (AMQP + MQTT plugin), used at both cloud and edge. External MQTT clients reach cloud broker via nginx stream-proxy on 8883. - Add stacks/mosquitto — reserved for the FROST (SensorThings) stack only. Cloud-only. Internal to its own stack; no external ingress. ML / notebooks (cloud-only) - stacks/mlflow — experiment tracking + model registry. Postgres backend on sql stack; local volume for artifacts (S3/MinIO is a TODO). - stacks/jupyterhub — multi-user notebook server. DockerSpawner via mounted docker.sock; users spawn into cloud-app network so they can reach mlflow, influxdb (via grafana), rabbitmq. Sites - sites/gemaal1 — first edge deployment scaffold. Site-local override template for binding nginx to PLANT_LAN_IP. Docs - README + docs/architecture.md updated: stacks table now lists 15 stacks, ingress + attachment tables reflect mlflow/jupyterhub, TLS strategy section locked, MQTT-split section added, Gitea HTTPS-only noted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 13:22:46 +02:00
| `tcp/80` | nginx | HTTP → 301 to 443; also serves `/.well-known/acme-challenge/` for certbot |
| `tcp/443` | nginx | All HTTPS UIs; TLS termination |
| `tcp/8883` | nginx | MQTT-TLS via `stream {}` block; SNI route to `rabbitmq:1883` |
| `udp/51820` | wireguard-server | VPN tunnel ingress |
Two containers publish a total of four ports. **Everything else is invisible** from outside the host.
### Edge (per-plant gateway)
| Port | Container | Bound to |
|---|---|---|
feat: SQL=postgres, nginx+certbot, MQTT split, ML stacks, gitea HTTPS-only, gemaal1 site Round-2 changes locking in scaffold-phase decisions and adding ML/notebook stacks. Locked decisions - sql: postgres 16-alpine (was TBD); init.d/ mount for per-app DB provisioning - nginx-proxy: stock nginx + certbot sidecar (was nginx:alpine TODO). Chose stock over nginxproxy/nginx-proxy because stream{} is required for MQTT-TLS reverse-proxy on tcp/8883 to rabbitmq:1883. - gitea: HTTPS-only (DISABLE_SSH=true). No SSH port published. MQTT split - Remove stacks/mqtt placeholder. - Add stacks/rabbitmq — general-purpose broker (AMQP + MQTT plugin), used at both cloud and edge. External MQTT clients reach cloud broker via nginx stream-proxy on 8883. - Add stacks/mosquitto — reserved for the FROST (SensorThings) stack only. Cloud-only. Internal to its own stack; no external ingress. ML / notebooks (cloud-only) - stacks/mlflow — experiment tracking + model registry. Postgres backend on sql stack; local volume for artifacts (S3/MinIO is a TODO). - stacks/jupyterhub — multi-user notebook server. DockerSpawner via mounted docker.sock; users spawn into cloud-app network so they can reach mlflow, influxdb (via grafana), rabbitmq. Sites - sites/gemaal1 — first edge deployment scaffold. Site-local override template for binding nginx to PLANT_LAN_IP. Docs - README + docs/architecture.md updated: stacks table now lists 15 stacks, ingress + attachment tables reflect mlflow/jupyterhub, TLS strategy section locked, MQTT-split section added, Gitea HTTPS-only noted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 13:22:46 +02:00
| `tcp/80` | nginx | Plant-LAN interface only |
| `tcp/443` | nginx | Plant-LAN interface only |
The edge `wireguard-client` initiates outbound to the cloud — it publishes **no port**.
## TLS strategy
feat: SQL=postgres, nginx+certbot, MQTT split, ML stacks, gitea HTTPS-only, gemaal1 site Round-2 changes locking in scaffold-phase decisions and adding ML/notebook stacks. Locked decisions - sql: postgres 16-alpine (was TBD); init.d/ mount for per-app DB provisioning - nginx-proxy: stock nginx + certbot sidecar (was nginx:alpine TODO). Chose stock over nginxproxy/nginx-proxy because stream{} is required for MQTT-TLS reverse-proxy on tcp/8883 to rabbitmq:1883. - gitea: HTTPS-only (DISABLE_SSH=true). No SSH port published. MQTT split - Remove stacks/mqtt placeholder. - Add stacks/rabbitmq — general-purpose broker (AMQP + MQTT plugin), used at both cloud and edge. External MQTT clients reach cloud broker via nginx stream-proxy on 8883. - Add stacks/mosquitto — reserved for the FROST (SensorThings) stack only. Cloud-only. Internal to its own stack; no external ingress. ML / notebooks (cloud-only) - stacks/mlflow — experiment tracking + model registry. Postgres backend on sql stack; local volume for artifacts (S3/MinIO is a TODO). - stacks/jupyterhub — multi-user notebook server. DockerSpawner via mounted docker.sock; users spawn into cloud-app network so they can reach mlflow, influxdb (via grafana), rabbitmq. Sites - sites/gemaal1 — first edge deployment scaffold. Site-local override template for binding nginx to PLANT_LAN_IP. Docs - README + docs/architecture.md updated: stacks table now lists 15 stacks, ingress + attachment tables reflect mlflow/jupyterhub, TLS strategy section locked, MQTT-split section added, Gitea HTTPS-only noted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 13:22:46 +02:00
**Stock nginx + certbot sidecar** (Let's Encrypt, HTTP-01 webroot).
- Stock `nginx:1.27-alpine` — required because we use the `stream {}` context for MQTT-TLS. `nginxproxy/nginx-proxy` (the jwilder image) is HTTP/HTTPS-only and can't expose stream cleanly.
- `certbot/certbot` sidecar runs `certbot renew` every 12h. Shared `nginx-certs` + `nginx-acme-challenge` volumes coordinate cert + challenge state between the two containers.
- Initial issuance is **manual** (one-time `docker compose run --rm certbot certonly --webroot …`). Renewal is automatic.
For cloud-internal hostnames not reachable via Let's Encrypt HTTP-01, the longer-term plan is a small internal PKI (step-ca or similar) backed by `sql`. Out of scope for first deploy.
## Why segment
- **Blast radius**: a compromised node-red on `app` cannot reach influxdb on `data` unless an explicit attachment is declared. Each service's reachability is auditable from `networks:` alone.
feat: SQL=postgres, nginx+certbot, MQTT split, ML stacks, gitea HTTPS-only, gemaal1 site Round-2 changes locking in scaffold-phase decisions and adding ML/notebook stacks. Locked decisions - sql: postgres 16-alpine (was TBD); init.d/ mount for per-app DB provisioning - nginx-proxy: stock nginx + certbot sidecar (was nginx:alpine TODO). Chose stock over nginxproxy/nginx-proxy because stream{} is required for MQTT-TLS reverse-proxy on tcp/8883 to rabbitmq:1883. - gitea: HTTPS-only (DISABLE_SSH=true). No SSH port published. MQTT split - Remove stacks/mqtt placeholder. - Add stacks/rabbitmq — general-purpose broker (AMQP + MQTT plugin), used at both cloud and edge. External MQTT clients reach cloud broker via nginx stream-proxy on 8883. - Add stacks/mosquitto — reserved for the FROST (SensorThings) stack only. Cloud-only. Internal to its own stack; no external ingress. ML / notebooks (cloud-only) - stacks/mlflow — experiment tracking + model registry. Postgres backend on sql stack; local volume for artifacts (S3/MinIO is a TODO). - stacks/jupyterhub — multi-user notebook server. DockerSpawner via mounted docker.sock; users spawn into cloud-app network so they can reach mlflow, influxdb (via grafana), rabbitmq. Sites - sites/gemaal1 — first edge deployment scaffold. Site-local override template for binding nginx to PLANT_LAN_IP. Docs - README + docs/architecture.md updated: stacks table now lists 15 stacks, ingress + attachment tables reflect mlflow/jupyterhub, TLS strategy section locked, MQTT-split section added, Gitea HTTPS-only noted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 13:22:46 +02:00
- **Defense in depth**: only nginx and wireguard-server bind host ports. No accidental `0.0.0.0` exposures.
- **NIS2 / utility audit**: WBD is in scope as water-sector. Compose networks evidence segmentation at runtime and on paper.
## Special cases
### Postfix (cloud + edge)
feat: SQL=postgres, nginx+certbot, MQTT split, ML stacks, gitea HTTPS-only, gemaal1 site Round-2 changes locking in scaffold-phase decisions and adding ML/notebook stacks. Locked decisions - sql: postgres 16-alpine (was TBD); init.d/ mount for per-app DB provisioning - nginx-proxy: stock nginx + certbot sidecar (was nginx:alpine TODO). Chose stock over nginxproxy/nginx-proxy because stream{} is required for MQTT-TLS reverse-proxy on tcp/8883 to rabbitmq:1883. - gitea: HTTPS-only (DISABLE_SSH=true). No SSH port published. MQTT split - Remove stacks/mqtt placeholder. - Add stacks/rabbitmq — general-purpose broker (AMQP + MQTT plugin), used at both cloud and edge. External MQTT clients reach cloud broker via nginx stream-proxy on 8883. - Add stacks/mosquitto — reserved for the FROST (SensorThings) stack only. Cloud-only. Internal to its own stack; no external ingress. ML / notebooks (cloud-only) - stacks/mlflow — experiment tracking + model registry. Postgres backend on sql stack; local volume for artifacts (S3/MinIO is a TODO). - stacks/jupyterhub — multi-user notebook server. DockerSpawner via mounted docker.sock; users spawn into cloud-app network so they can reach mlflow, influxdb (via grafana), rabbitmq. Sites - sites/gemaal1 — first edge deployment scaffold. Site-local override template for binding nginx to PLANT_LAN_IP. Docs - README + docs/architecture.md updated: stacks table now lists 15 stacks, ingress + attachment tables reflect mlflow/jupyterhub, TLS strategy section locked, MQTT-split section added, Gitea HTTPS-only noted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 13:22:46 +02:00
Postfix is **outbound-only**. It initiates SMTP to internet MX servers but accepts no inbound. Zero ingress, no published port, no listener facing internet. Just needs egress (every container has it via host NAT).
feat(cloud): single-shot deploy.sh + FROST stack + healthchecks Stage 5 — make the cloud composition spin up in one command and add the SensorThings (FROST) stack as a fully segregated tenant. cloud/deploy.sh — idempotent, 7-step bring-up: preflight → validate → up + wait → cert state → issue/renew → service status → endpoint smoke test. Reissues LE cert only when current issuer no longer matches ACME_CA_URI. Move-aside-then- restore-on-failure so the bootstrap cert survives a failed certbot. stacks/frost — new stack, segregated from shared sql/rabbitmq: - dedicated postgis container (frost-db) - dedicated internal mosquitto bus (frost-mosquitto) - frost-http + frost-mqtt on a private frost-internal network, joined to cloud-app only for nginx ingress at frost.wbd-rd.nl - shared mosquitto stack deleted; rabbitmq remains the only public MQTT broker (mqtt.wbd-rd.nl:8883 via stream proxy) stacks/sql — pg_isready healthcheck so keycloak/gitea/mlflow can gate on service_healthy via cloud-level depends_on overrides. stacks/nginx-proxy: - nginx-init service generates a self-signed bootstrap cert on fresh deploy so nginx starts before certbot has issued a real one - frost.wbd-rd.nl vhost (/FROST-Server → frost-http:8080, /mqtt → frost-mqtt:9876 WebSocket) stacks/mlflow — custom Dockerfile (upstream + psycopg2-binary) so the official image can speak to the shared sql backend. stacks/jupyterhub — DummyAuthenticator stub gated by JUPYTERHUB_ADMIN_PASSWORD; TODO comments point at OIDC + DockerSpawner. stacks/rabbitmq — config/{enabled_plugins,rabbitmq.conf} stubs (management + mqtt plugins, MQTT auth required). stacks/portainer — ports unpublished; nginx now the only ingress. stacks/node-red — pin to 4.1 (the floating "4" tag does not exist). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 16:37:58 +02:00
### MQTT — RabbitMQ for public traffic, dedicated mosquitto inside FROST
feat(cloud): single-shot deploy.sh + FROST stack + healthchecks Stage 5 — make the cloud composition spin up in one command and add the SensorThings (FROST) stack as a fully segregated tenant. cloud/deploy.sh — idempotent, 7-step bring-up: preflight → validate → up + wait → cert state → issue/renew → service status → endpoint smoke test. Reissues LE cert only when current issuer no longer matches ACME_CA_URI. Move-aside-then- restore-on-failure so the bootstrap cert survives a failed certbot. stacks/frost — new stack, segregated from shared sql/rabbitmq: - dedicated postgis container (frost-db) - dedicated internal mosquitto bus (frost-mosquitto) - frost-http + frost-mqtt on a private frost-internal network, joined to cloud-app only for nginx ingress at frost.wbd-rd.nl - shared mosquitto stack deleted; rabbitmq remains the only public MQTT broker (mqtt.wbd-rd.nl:8883 via stream proxy) stacks/sql — pg_isready healthcheck so keycloak/gitea/mlflow can gate on service_healthy via cloud-level depends_on overrides. stacks/nginx-proxy: - nginx-init service generates a self-signed bootstrap cert on fresh deploy so nginx starts before certbot has issued a real one - frost.wbd-rd.nl vhost (/FROST-Server → frost-http:8080, /mqtt → frost-mqtt:9876 WebSocket) stacks/mlflow — custom Dockerfile (upstream + psycopg2-binary) so the official image can speak to the shared sql backend. stacks/jupyterhub — DummyAuthenticator stub gated by JUPYTERHUB_ADMIN_PASSWORD; TODO comments point at OIDC + DockerSpawner. stacks/rabbitmq — config/{enabled_plugins,rabbitmq.conf} stubs (management + mqtt plugins, MQTT auth required). stacks/portainer — ports unpublished; nginx now the only ingress. stacks/node-red — pin to 4.1 (the floating "4" tag does not exist). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 16:37:58 +02:00
- **RabbitMQ** is the **only public MQTT broker**. SCADA / IoT / edge clients connect to `mqtt.wbd-rd.nl:8883` (TLS, via nginx `stream {}` block proxying to `rabbitmq:1883`). Authentication uses the standard RABBITMQ_USER/PASS.
- **frost-mosquitto** lives **inside the frost stack** on the private `frost-internal` docker network — it is purely the message bus between `frost-http` and `frost-mqtt`. It is not reachable from anywhere outside the frost stack.
- SensorThings-protocol MQTT (the FROST native MQTT API) is exposed to clients via `frost-mqtt`'s WebSocket port, proxied as `https://sta.wbd-rd.nl/mqtt`.
feat(cloud): single-shot deploy.sh + FROST stack + healthchecks Stage 5 — make the cloud composition spin up in one command and add the SensorThings (FROST) stack as a fully segregated tenant. cloud/deploy.sh — idempotent, 7-step bring-up: preflight → validate → up + wait → cert state → issue/renew → service status → endpoint smoke test. Reissues LE cert only when current issuer no longer matches ACME_CA_URI. Move-aside-then- restore-on-failure so the bootstrap cert survives a failed certbot. stacks/frost — new stack, segregated from shared sql/rabbitmq: - dedicated postgis container (frost-db) - dedicated internal mosquitto bus (frost-mosquitto) - frost-http + frost-mqtt on a private frost-internal network, joined to cloud-app only for nginx ingress at frost.wbd-rd.nl - shared mosquitto stack deleted; rabbitmq remains the only public MQTT broker (mqtt.wbd-rd.nl:8883 via stream proxy) stacks/sql — pg_isready healthcheck so keycloak/gitea/mlflow can gate on service_healthy via cloud-level depends_on overrides. stacks/nginx-proxy: - nginx-init service generates a self-signed bootstrap cert on fresh deploy so nginx starts before certbot has issued a real one - frost.wbd-rd.nl vhost (/FROST-Server → frost-http:8080, /mqtt → frost-mqtt:9876 WebSocket) stacks/mlflow — custom Dockerfile (upstream + psycopg2-binary) so the official image can speak to the shared sql backend. stacks/jupyterhub — DummyAuthenticator stub gated by JUPYTERHUB_ADMIN_PASSWORD; TODO comments point at OIDC + DockerSpawner. stacks/rabbitmq — config/{enabled_plugins,rabbitmq.conf} stubs (management + mqtt plugins, MQTT auth required). stacks/portainer — ports unpublished; nginx now the only ingress. stacks/node-red — pin to 4.1 (the floating "4" tag does not exist). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 16:37:58 +02:00
If FROST consumers also need to see SCADA traffic on RabbitMQ, add a RabbitMQ `shovel` plugin pointing into the frost stack. Not wired up by default.
feat: SQL=postgres, nginx+certbot, MQTT split, ML stacks, gitea HTTPS-only, gemaal1 site Round-2 changes locking in scaffold-phase decisions and adding ML/notebook stacks. Locked decisions - sql: postgres 16-alpine (was TBD); init.d/ mount for per-app DB provisioning - nginx-proxy: stock nginx + certbot sidecar (was nginx:alpine TODO). Chose stock over nginxproxy/nginx-proxy because stream{} is required for MQTT-TLS reverse-proxy on tcp/8883 to rabbitmq:1883. - gitea: HTTPS-only (DISABLE_SSH=true). No SSH port published. MQTT split - Remove stacks/mqtt placeholder. - Add stacks/rabbitmq — general-purpose broker (AMQP + MQTT plugin), used at both cloud and edge. External MQTT clients reach cloud broker via nginx stream-proxy on 8883. - Add stacks/mosquitto — reserved for the FROST (SensorThings) stack only. Cloud-only. Internal to its own stack; no external ingress. ML / notebooks (cloud-only) - stacks/mlflow — experiment tracking + model registry. Postgres backend on sql stack; local volume for artifacts (S3/MinIO is a TODO). - stacks/jupyterhub — multi-user notebook server. DockerSpawner via mounted docker.sock; users spawn into cloud-app network so they can reach mlflow, influxdb (via grafana), rabbitmq. Sites - sites/gemaal1 — first edge deployment scaffold. Site-local override template for binding nginx to PLANT_LAN_IP. Docs - README + docs/architecture.md updated: stacks table now lists 15 stacks, ingress + attachment tables reflect mlflow/jupyterhub, TLS strategy section locked, MQTT-split section added, Gitea HTTPS-only noted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 13:22:46 +02:00
### Gitea — HTTPS only
feat: SQL=postgres, nginx+certbot, MQTT split, ML stacks, gitea HTTPS-only, gemaal1 site Round-2 changes locking in scaffold-phase decisions and adding ML/notebook stacks. Locked decisions - sql: postgres 16-alpine (was TBD); init.d/ mount for per-app DB provisioning - nginx-proxy: stock nginx + certbot sidecar (was nginx:alpine TODO). Chose stock over nginxproxy/nginx-proxy because stream{} is required for MQTT-TLS reverse-proxy on tcp/8883 to rabbitmq:1883. - gitea: HTTPS-only (DISABLE_SSH=true). No SSH port published. MQTT split - Remove stacks/mqtt placeholder. - Add stacks/rabbitmq — general-purpose broker (AMQP + MQTT plugin), used at both cloud and edge. External MQTT clients reach cloud broker via nginx stream-proxy on 8883. - Add stacks/mosquitto — reserved for the FROST (SensorThings) stack only. Cloud-only. Internal to its own stack; no external ingress. ML / notebooks (cloud-only) - stacks/mlflow — experiment tracking + model registry. Postgres backend on sql stack; local volume for artifacts (S3/MinIO is a TODO). - stacks/jupyterhub — multi-user notebook server. DockerSpawner via mounted docker.sock; users spawn into cloud-app network so they can reach mlflow, influxdb (via grafana), rabbitmq. Sites - sites/gemaal1 — first edge deployment scaffold. Site-local override template for binding nginx to PLANT_LAN_IP. Docs - README + docs/architecture.md updated: stacks table now lists 15 stacks, ingress + attachment tables reflect mlflow/jupyterhub, TLS strategy section locked, MQTT-split section added, Gitea HTTPS-only noted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 13:22:46 +02:00
No SSH ingress. `GITEA__server__DISABLE_SSH=true`. All clones over HTTPS via nginx-proxy. Re-evaluate only if Gitea Actions runners require SSH push.
### WireGuard server (cloud)
feat: SQL=postgres, nginx+certbot, MQTT split, ML stacks, gitea HTTPS-only, gemaal1 site Round-2 changes locking in scaffold-phase decisions and adding ML/notebook stacks. Locked decisions - sql: postgres 16-alpine (was TBD); init.d/ mount for per-app DB provisioning - nginx-proxy: stock nginx + certbot sidecar (was nginx:alpine TODO). Chose stock over nginxproxy/nginx-proxy because stream{} is required for MQTT-TLS reverse-proxy on tcp/8883 to rabbitmq:1883. - gitea: HTTPS-only (DISABLE_SSH=true). No SSH port published. MQTT split - Remove stacks/mqtt placeholder. - Add stacks/rabbitmq — general-purpose broker (AMQP + MQTT plugin), used at both cloud and edge. External MQTT clients reach cloud broker via nginx stream-proxy on 8883. - Add stacks/mosquitto — reserved for the FROST (SensorThings) stack only. Cloud-only. Internal to its own stack; no external ingress. ML / notebooks (cloud-only) - stacks/mlflow — experiment tracking + model registry. Postgres backend on sql stack; local volume for artifacts (S3/MinIO is a TODO). - stacks/jupyterhub — multi-user notebook server. DockerSpawner via mounted docker.sock; users spawn into cloud-app network so they can reach mlflow, influxdb (via grafana), rabbitmq. Sites - sites/gemaal1 — first edge deployment scaffold. Site-local override template for binding nginx to PLANT_LAN_IP. Docs - README + docs/architecture.md updated: stacks table now lists 15 stacks, ingress + attachment tables reflect mlflow/jupyterhub, TLS strategy section locked, MQTT-split section added, Gitea HTTPS-only noted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 13:22:46 +02:00
WireGuard is connectionless UDP with crypto-routed packets. Proxying through nginx-stream breaks NAT/MTU and adds no security benefit. The server publishes `udp/51820` directly — the **only** non-nginx public ingress on cloud.
## Stacks
The repo defines **15 stacks** under `stacks/`:
- **Cloud + edge**: `nginx-proxy`, `node-red`, `influxdb`, `grafana`, `keycloak`, `portainer`, `rabbitmq`, `postfix`
feat(cloud): single-shot deploy.sh + FROST stack + healthchecks Stage 5 — make the cloud composition spin up in one command and add the SensorThings (FROST) stack as a fully segregated tenant. cloud/deploy.sh — idempotent, 7-step bring-up: preflight → validate → up + wait → cert state → issue/renew → service status → endpoint smoke test. Reissues LE cert only when current issuer no longer matches ACME_CA_URI. Move-aside-then- restore-on-failure so the bootstrap cert survives a failed certbot. stacks/frost — new stack, segregated from shared sql/rabbitmq: - dedicated postgis container (frost-db) - dedicated internal mosquitto bus (frost-mosquitto) - frost-http + frost-mqtt on a private frost-internal network, joined to cloud-app only for nginx ingress at frost.wbd-rd.nl - shared mosquitto stack deleted; rabbitmq remains the only public MQTT broker (mqtt.wbd-rd.nl:8883 via stream proxy) stacks/sql — pg_isready healthcheck so keycloak/gitea/mlflow can gate on service_healthy via cloud-level depends_on overrides. stacks/nginx-proxy: - nginx-init service generates a self-signed bootstrap cert on fresh deploy so nginx starts before certbot has issued a real one - frost.wbd-rd.nl vhost (/FROST-Server → frost-http:8080, /mqtt → frost-mqtt:9876 WebSocket) stacks/mlflow — custom Dockerfile (upstream + psycopg2-binary) so the official image can speak to the shared sql backend. stacks/jupyterhub — DummyAuthenticator stub gated by JUPYTERHUB_ADMIN_PASSWORD; TODO comments point at OIDC + DockerSpawner. stacks/rabbitmq — config/{enabled_plugins,rabbitmq.conf} stubs (management + mqtt plugins, MQTT auth required). stacks/portainer — ports unpublished; nginx now the only ingress. stacks/node-red — pin to 4.1 (the floating "4" tag does not exist). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 16:37:58 +02:00
- **Cloud-only**: `wireguard-server`, `gitea` (HTTPS), `jenkins`, `sql` (postgres), `mlflow`, `jupyterhub`, `frost` (SensorThings, dedicated postgis + internal bus)
feat: SQL=postgres, nginx+certbot, MQTT split, ML stacks, gitea HTTPS-only, gemaal1 site Round-2 changes locking in scaffold-phase decisions and adding ML/notebook stacks. Locked decisions - sql: postgres 16-alpine (was TBD); init.d/ mount for per-app DB provisioning - nginx-proxy: stock nginx + certbot sidecar (was nginx:alpine TODO). Chose stock over nginxproxy/nginx-proxy because stream{} is required for MQTT-TLS reverse-proxy on tcp/8883 to rabbitmq:1883. - gitea: HTTPS-only (DISABLE_SSH=true). No SSH port published. MQTT split - Remove stacks/mqtt placeholder. - Add stacks/rabbitmq — general-purpose broker (AMQP + MQTT plugin), used at both cloud and edge. External MQTT clients reach cloud broker via nginx stream-proxy on 8883. - Add stacks/mosquitto — reserved for the FROST (SensorThings) stack only. Cloud-only. Internal to its own stack; no external ingress. ML / notebooks (cloud-only) - stacks/mlflow — experiment tracking + model registry. Postgres backend on sql stack; local volume for artifacts (S3/MinIO is a TODO). - stacks/jupyterhub — multi-user notebook server. DockerSpawner via mounted docker.sock; users spawn into cloud-app network so they can reach mlflow, influxdb (via grafana), rabbitmq. Sites - sites/gemaal1 — first edge deployment scaffold. Site-local override template for binding nginx to PLANT_LAN_IP. Docs - README + docs/architecture.md updated: stacks table now lists 15 stacks, ingress + attachment tables reflect mlflow/jupyterhub, TLS strategy section locked, MQTT-split section added, Gitea HTTPS-only noted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 13:22:46 +02:00
- **Edge-only**: `wireguard-client`
## Sites
| Site | Status |
|---|---|
| `gemaal1` | Scaffolded; awaiting hardware provisioning (PLANT_LAN_IP, WG peer key, OPCUA endpoint) |
Additional plants follow the same pattern under `sites/<plant>/`.
## Conventions
- **Folder names**: kebab-case (`node-red`, `nginx-proxy`, `wireguard-server`).
- **Compose filename**: `compose.yml` (official Compose Spec).
- **Composition**: cloud/site composes pull stacks via `include:`. Stacks remain runnable standalone for testing.
- **Secrets**: `.env` (gitignored) + `.env.example` (committed with placeholders).
- **Per-stack contents**: `compose.yml`, `.env.example`, `README.md`, optional `config/`.
- **OT layer**: out of scope; PLC + OPCUA managed in a separate process.
## Open decisions
feat: SQL=postgres, nginx+certbot, MQTT split, ML stacks, gitea HTTPS-only, gemaal1 site Round-2 changes locking in scaffold-phase decisions and adding ML/notebook stacks. Locked decisions - sql: postgres 16-alpine (was TBD); init.d/ mount for per-app DB provisioning - nginx-proxy: stock nginx + certbot sidecar (was nginx:alpine TODO). Chose stock over nginxproxy/nginx-proxy because stream{} is required for MQTT-TLS reverse-proxy on tcp/8883 to rabbitmq:1883. - gitea: HTTPS-only (DISABLE_SSH=true). No SSH port published. MQTT split - Remove stacks/mqtt placeholder. - Add stacks/rabbitmq — general-purpose broker (AMQP + MQTT plugin), used at both cloud and edge. External MQTT clients reach cloud broker via nginx stream-proxy on 8883. - Add stacks/mosquitto — reserved for the FROST (SensorThings) stack only. Cloud-only. Internal to its own stack; no external ingress. ML / notebooks (cloud-only) - stacks/mlflow — experiment tracking + model registry. Postgres backend on sql stack; local volume for artifacts (S3/MinIO is a TODO). - stacks/jupyterhub — multi-user notebook server. DockerSpawner via mounted docker.sock; users spawn into cloud-app network so they can reach mlflow, influxdb (via grafana), rabbitmq. Sites - sites/gemaal1 — first edge deployment scaffold. Site-local override template for binding nginx to PLANT_LAN_IP. Docs - README + docs/architecture.md updated: stacks table now lists 15 stacks, ingress + attachment tables reflect mlflow/jupyterhub, TLS strategy section locked, MQTT-split section added, Gitea HTTPS-only noted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 13:22:46 +02:00
Tracked here so we don't forget. Each lands when we harden the relevant stack.
feat: SQL=postgres, nginx+certbot, MQTT split, ML stacks, gitea HTTPS-only, gemaal1 site Round-2 changes locking in scaffold-phase decisions and adding ML/notebook stacks. Locked decisions - sql: postgres 16-alpine (was TBD); init.d/ mount for per-app DB provisioning - nginx-proxy: stock nginx + certbot sidecar (was nginx:alpine TODO). Chose stock over nginxproxy/nginx-proxy because stream{} is required for MQTT-TLS reverse-proxy on tcp/8883 to rabbitmq:1883. - gitea: HTTPS-only (DISABLE_SSH=true). No SSH port published. MQTT split - Remove stacks/mqtt placeholder. - Add stacks/rabbitmq — general-purpose broker (AMQP + MQTT plugin), used at both cloud and edge. External MQTT clients reach cloud broker via nginx stream-proxy on 8883. - Add stacks/mosquitto — reserved for the FROST (SensorThings) stack only. Cloud-only. Internal to its own stack; no external ingress. ML / notebooks (cloud-only) - stacks/mlflow — experiment tracking + model registry. Postgres backend on sql stack; local volume for artifacts (S3/MinIO is a TODO). - stacks/jupyterhub — multi-user notebook server. DockerSpawner via mounted docker.sock; users spawn into cloud-app network so they can reach mlflow, influxdb (via grafana), rabbitmq. Sites - sites/gemaal1 — first edge deployment scaffold. Site-local override template for binding nginx to PLANT_LAN_IP. Docs - README + docs/architecture.md updated: stacks table now lists 15 stacks, ingress + attachment tables reflect mlflow/jupyterhub, TLS strategy section locked, MQTT-split section added, Gitea HTTPS-only noted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 13:22:46 +02:00
- **MinIO / artifact store** — MLflow uses local volume for now; switch to S3-compatible MinIO sidecar when artifacts grow.
- **JupyterHub auth** — target Keycloak OIDC via `oauthenticator.generic.GenericOAuthenticator`.
- **WG client routing** — split-tunnel vs full; per-peer `AllowedIPs` policy.
feat(cloud): single-shot deploy.sh + FROST stack + healthchecks Stage 5 — make the cloud composition spin up in one command and add the SensorThings (FROST) stack as a fully segregated tenant. cloud/deploy.sh — idempotent, 7-step bring-up: preflight → validate → up + wait → cert state → issue/renew → service status → endpoint smoke test. Reissues LE cert only when current issuer no longer matches ACME_CA_URI. Move-aside-then- restore-on-failure so the bootstrap cert survives a failed certbot. stacks/frost — new stack, segregated from shared sql/rabbitmq: - dedicated postgis container (frost-db) - dedicated internal mosquitto bus (frost-mosquitto) - frost-http + frost-mqtt on a private frost-internal network, joined to cloud-app only for nginx ingress at frost.wbd-rd.nl - shared mosquitto stack deleted; rabbitmq remains the only public MQTT broker (mqtt.wbd-rd.nl:8883 via stream proxy) stacks/sql — pg_isready healthcheck so keycloak/gitea/mlflow can gate on service_healthy via cloud-level depends_on overrides. stacks/nginx-proxy: - nginx-init service generates a self-signed bootstrap cert on fresh deploy so nginx starts before certbot has issued a real one - frost.wbd-rd.nl vhost (/FROST-Server → frost-http:8080, /mqtt → frost-mqtt:9876 WebSocket) stacks/mlflow — custom Dockerfile (upstream + psycopg2-binary) so the official image can speak to the shared sql backend. stacks/jupyterhub — DummyAuthenticator stub gated by JUPYTERHUB_ADMIN_PASSWORD; TODO comments point at OIDC + DockerSpawner. stacks/rabbitmq — config/{enabled_plugins,rabbitmq.conf} stubs (management + mqtt plugins, MQTT auth required). stacks/portainer — ports unpublished; nginx now the only ingress. stacks/node-red — pin to 4.1 (the floating "4" tag does not exist). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 16:37:58 +02:00
- **FROST auth** — currently `BasicAuthProvider` against the USERS table in `frost-db`; swap to Keycloak OIDC via the FROST OIDC plugin when SSO is rolled out.
feat: SQL=postgres, nginx+certbot, MQTT split, ML stacks, gitea HTTPS-only, gemaal1 site Round-2 changes locking in scaffold-phase decisions and adding ML/notebook stacks. Locked decisions - sql: postgres 16-alpine (was TBD); init.d/ mount for per-app DB provisioning - nginx-proxy: stock nginx + certbot sidecar (was nginx:alpine TODO). Chose stock over nginxproxy/nginx-proxy because stream{} is required for MQTT-TLS reverse-proxy on tcp/8883 to rabbitmq:1883. - gitea: HTTPS-only (DISABLE_SSH=true). No SSH port published. MQTT split - Remove stacks/mqtt placeholder. - Add stacks/rabbitmq — general-purpose broker (AMQP + MQTT plugin), used at both cloud and edge. External MQTT clients reach cloud broker via nginx stream-proxy on 8883. - Add stacks/mosquitto — reserved for the FROST (SensorThings) stack only. Cloud-only. Internal to its own stack; no external ingress. ML / notebooks (cloud-only) - stacks/mlflow — experiment tracking + model registry. Postgres backend on sql stack; local volume for artifacts (S3/MinIO is a TODO). - stacks/jupyterhub — multi-user notebook server. DockerSpawner via mounted docker.sock; users spawn into cloud-app network so they can reach mlflow, influxdb (via grafana), rabbitmq. Sites - sites/gemaal1 — first edge deployment scaffold. Site-local override template for binding nginx to PLANT_LAN_IP. Docs - README + docs/architecture.md updated: stacks table now lists 15 stacks, ingress + attachment tables reflect mlflow/jupyterhub, TLS strategy section locked, MQTT-split section added, Gitea HTTPS-only noted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 13:22:46 +02:00
- **MQTT cross-broker shovel** — only if FROST consumers must see RabbitMQ traffic or vice versa.
- **Internal PKI** — for cloud-internal hostnames not eligible for Let's Encrypt HTTP-01.
- **Backup strategy** — for `sql` (postgres), `influxdb`, `gitea-data`, `jenkins-home`, `mlflow-artifacts`.
- **Provision Gemaal1** — fill in `PLANT_LAN_IP`, WG peer key, OPCUA endpoint, deploy first stacks.