# Architecture

R&D infrastructure for Waterschap Brabantse Delta. Hub-and-spoke topology:

- **Cloud** layer: central services, one deployment, internet-facing.
- **Edge** layer: per-plant, plant-LAN-facing, tunneled to cloud via WireGuard.
- **OT** layer: per-plant, behind edge. Managed **outside** this repo.

```
                  Internet
                      │
          ┌───────────┴───────────┐
          │  tcp/80, 443, 8883    │
          │  udp/51820            │
          ▼                       │
   ┌────────────────────────────────────┐
   │  Cloud (central, one)              │
   │   nginx + certbot  ◀── 80/443/8883 │
   │   wireguard-server ◀── 51820/udp   │
   │   gitea, jenkins, keycloak, ...    │
   │   influxdb, grafana, node-red      │
   │   rabbitmq, postfix, portainer     │
   │   sql (postgres, single config)    │
   │   mlflow, jupyterhub               │
   │   frost (SensorThings API)         │
   └───────────────┬────────────────────┘
                   │ WireGuard tunnels
           ┌───────┼────────┬───────────┐
           ▼       ▼        ▼           ▼
       ┌───────┐ ┌───────┐ ┌───────┐ ┌───────┐
       │ Edge: │ │ Edge: │ │ Edge: │ │  ...  │
       │gemaal1│ │  ...  │ │  ...  │ │       │
       └───┬───┘ └───────┘ └───────┘ └───────┘
           │ TLS
           ▼
      ┌──────────┐
      │   OT     │  ← out of scope of this repo
      │  OPCUA   │
      │   PLC    │
      └──────────┘
```

## Network topology (per layer)

Each layer uses **four Docker networks**:

| Network | Purpose | Notes |
|---|---|---|
| `edge` | Outermost. Cloud: internet-facing. Edge: plant-LAN-facing. | Only port-publishers join. |
| `app` | Application / automation tier. | Default landing for app services. |
| `data` | Databases (influxdb, sql). | `internal: true`, so no internet egress. |
| `mgmt` | Identity, control plane (portainer, keycloak admin, wireguard mgmt). | Restricted. |

### Cloud attachments

```
edge : nginx, wireguard-server
app  : nginx, rabbitmq, postfix, node-red, grafana, jenkins, gitea,
       keycloak, mlflow, jupyterhub, portainer, frost-http, frost-mqtt
data : influxdb, sql, grafana, mlflow
mgmt : portainer, keycloak, wireguard-server, jupyterhub

frost-internal (private to frost stack)
     : frost-db (postgis), frost-mosquitto, frost-http, frost-mqtt
```

### Edge attachments

```
edge : nginx                ← plant-LAN-facing
app  : nginx, rabbitmq, postfix, node-red, grafana, keycloak,
       wireguard-client
data : influxdb, grafana
mgmt : portainer, keycloak, wireguard-client
```

## Ingress (the only ports facing outside)

### Cloud (the central host)

| Port | Container | Notes |
|---|---|---|
| `tcp/80` | nginx | HTTP → 301 to 443; also serves `/.well-known/acme-challenge/` for certbot |
| `tcp/443` | nginx | All HTTPS UIs; TLS termination |
| `tcp/8883` | nginx | MQTT-TLS via `stream {}` block; SNI route to `rabbitmq:1883` |
| `udp/51820` | wireguard-server | VPN tunnel ingress |

Two containers publish a total of four ports. **Everything else is invisible** from outside the host.

### Edge (per-plant gateway)

| Port | Container | Bound to |
|---|---|---|
| `tcp/80` | nginx | Plant-LAN interface only |
| `tcp/443` | nginx | Plant-LAN interface only |

The edge `wireguard-client` initiates outbound connections to the cloud, so it publishes **no port**.

## TLS strategy

**Stock nginx + certbot sidecar** (Let's Encrypt, HTTP-01 webroot).

- Stock `nginx:1.27-alpine`: required because we use the `stream {}` context for MQTT-TLS. `nginxproxy/nginx-proxy` (the jwilder image) is HTTP/HTTPS-only and can't expose stream cleanly.
- `certbot/certbot` sidecar runs `certbot renew` every 12h. Shared `nginx-certs` + `nginx-acme-challenge` volumes coordinate cert + challenge state between the two containers.
- Initial issuance is **manual** (one-time `docker compose run --rm certbot certonly --webroot …`). Renewal is automatic.
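As an illustration of the sidecar pattern in the list above, here is a minimal Compose sketch. It is not the repo's actual `compose.yml`: the container mount paths (`/etc/letsencrypt` is certbot's default config directory, `/var/www/certbot` is a made-up webroot), the renew loop in `entrypoint`, and the omission of nginx's own config mounts are all assumptions for the sake of the example. Only the image names, volume names, published ports, and the 12h interval come from this document.

```yaml
# Hypothetical sketch of the nginx + certbot sidecar pairing; paths are assumptions.
services:
  nginx:
    image: nginx:1.27-alpine
    ports:
      - "80:80"      # HTTP redirect + ACME HTTP-01 challenge
      - "443:443"    # HTTPS UIs
      - "8883:8883"  # MQTT-TLS via the stream {} context
    volumes:
      - nginx-certs:/etc/letsencrypt:ro            # certificates written by certbot
      - nginx-acme-challenge:/var/www/certbot:ro   # challenge files (assumed webroot path)
    networks: [edge, app]

  certbot:
    image: certbot/certbot
    volumes:
      - nginx-certs:/etc/letsencrypt
      - nginx-acme-challenge:/var/www/certbot
    # Replace the image's default entrypoint with a loop: renew, then wait 12h (43200 s).
    # "$$" is Compose escaping for a literal "$".
    entrypoint:
      - /bin/sh
      - -c
      - "trap exit TERM; while :; do certbot renew --quiet; sleep 43200 & wait $${!}; done"

volumes:
  nginx-certs:
  nginx-acme-challenge:

networks:
  edge:
  app:
```

With a layout like this, the webroot used at initial issuance has to match the path mounted into both containers. nginx also only picks up a renewed certificate after a reload, which this sketch does not cover.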
For cloud-internal hostnames not reachable via Let's Encrypt HTTP-01, the longer-term plan is a small internal PKI (step-ca or similar) backed by `sql`. Out of scope for first deploy.

## Why segment

- **Blast radius**: a compromised node-red on `app` cannot reach influxdb on `data` unless an explicit attachment is declared. Each service's reachability is auditable from `networks:` alone.
- **Defense in depth**: only nginx and wireguard-server bind host ports. No accidental `0.0.0.0` exposures.
- **NIS2 / utility audit**: WBD is in scope as a water-sector entity. Compose networks provide evidence of segmentation at runtime and on paper.

## Special cases

### Postfix (cloud + edge)

Postfix is **outbound-only**. It initiates SMTP to internet MX servers but accepts no inbound mail: zero ingress, no published port, no internet-facing listener. It only needs egress, which containers on non-internal networks get via host NAT.

### MQTT: RabbitMQ for public traffic, dedicated mosquitto inside FROST

- **RabbitMQ** is the **only public MQTT broker**. SCADA / IoT / edge clients connect to `mqtt.wbd-rd.nl:8883` (TLS, via nginx `stream {}` block proxying to `rabbitmq:1883`). Authentication uses the standard `RABBITMQ_USER`/`PASS`.
- **frost-mosquitto** lives **inside the frost stack** on the private `frost-internal` Docker network. It is purely the message bus between `frost-http` and `frost-mqtt`, and it is not reachable from anywhere outside the frost stack.
- SensorThings-protocol MQTT (the FROST native MQTT API) is exposed to clients via `frost-mqtt`'s WebSocket port, proxied as `https://sta.wbd-rd.nl/mqtt`.

If FROST consumers also need to see SCADA traffic on RabbitMQ, add a RabbitMQ `shovel` plugin pointing into the frost stack. Not wired up by default.

### Gitea: HTTPS only

No SSH ingress. `GITEA__server__DISABLE_SSH=true`. All clones go over HTTPS via the `nginx-proxy` stack. Re-evaluate only if Gitea Actions runners require SSH push.

### WireGuard server (cloud)

WireGuard is connectionless UDP with crypto-routed packets.
Proxying through nginx-stream breaks NAT/MTU and adds no security benefit. The server publishes `udp/51820` directly; it is the **only** non-nginx public ingress on cloud.

## Stacks

The repo defines **16 stacks** under `stacks/`:

- **Cloud + edge**: `nginx-proxy`, `node-red`, `influxdb`, `grafana`, `keycloak`, `portainer`, `rabbitmq`, `postfix`
- **Cloud-only**: `wireguard-server`, `gitea` (HTTPS), `jenkins`, `sql` (postgres), `mlflow`, `jupyterhub`, `frost` (SensorThings, dedicated postgis + internal bus)
- **Edge-only**: `wireguard-client`

## Sites

| Site | Status |
|---|---|
| `gemaal1` | Scaffolded; awaiting hardware provisioning (PLANT_LAN_IP, WG peer key, OPCUA endpoint) |

Additional plants follow the same pattern under `sites//`.

## Conventions

- **Folder names**: kebab-case (`node-red`, `nginx-proxy`, `wireguard-server`).
- **Compose filename**: `compose.yml` (official Compose Spec).
- **Composition**: cloud/site composes pull stacks via `include:`. Stacks remain runnable standalone for testing.
- **Secrets**: `.env` (gitignored) + `.env.example` (committed with placeholders).
- **Per-stack contents**: `compose.yml`, `.env.example`, `README.md`, optional `config/`.
- **OT layer**: out of scope; PLC + OPCUA managed in a separate process.

## Open decisions

Tracked here so we don't forget. Each lands when we harden the relevant stack.

- **MinIO / artifact store**: MLflow uses a local volume for now; switch to an S3-compatible MinIO sidecar when artifacts grow.
- **JupyterHub auth**: target Keycloak OIDC via `oauthenticator.generic.GenericOAuthenticator`.
- **WG client routing**: split-tunnel vs full; per-peer `AllowedIPs` policy.
- **FROST auth**: currently `BasicAuthProvider` against the USERS table in `frost-db`; swap to Keycloak OIDC via the FROST OIDC plugin when SSO is rolled out.
- **MQTT cross-broker shovel**: only if FROST consumers must see RabbitMQ traffic or vice versa.
- **Internal PKI**: for cloud-internal hostnames not eligible for Let's Encrypt HTTP-01.
- **Backup strategy**: for `sql` (postgres), `influxdb`, `gitea-data`, `jenkins-home`, `mlflow-artifacts`.
- **Provision Gemaal1**: fill in `PLANT_LAN_IP`, WG peer key, OPCUA endpoint, deploy first stacks.