infra/docs/architecture.md
znetsixe 8ab9061983 scaffold: hub-and-spoke layout, 4-network topology, 13 stack stubs
Initial structure for R&D infrastructure:

- stacks/ — 13 reusable, runnable stack stubs (kebab-case)
  cloud-and-edge: node-red, influxdb, grafana, keycloak, portainer,
                  nginx-proxy, mqtt, postfix
  cloud-only:     wireguard-server, gitea, jenkins, sql (postgres stub)
  edge-only:      wireguard-client

- cloud/ — single central hub composition with 4 networks
           (edge, app, data internal, mgmt) and include: stubs
- sites/ — per-plant edge folders (template README only for now)
- docs/architecture.md — hub-and-spoke + ingress + segmentation rationale

Network model: only nginx-proxy (80/443/8883) and wireguard-server
(51820/udp) publish ports on the cloud host. Edge nginx publishes
80/443 on plant-LAN interface only. MQTT cloud-side via nginx stream
proxy; MQTT edge-side internal-only; Postfix outbound-only.

OT layer (OPCUA, PLCs) is out of scope for this repo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 12:37:59 +02:00

Architecture

R&D infrastructure for Waterschap Brabantse Delta. Hub-and-spoke topology:

  • Cloud layer — central services, one deployment, internet-facing.
  • Edge layer — per-plant, plant-LAN-facing, tunneled to cloud via WireGuard.
  • OT layer — per-plant, behind edge. Managed outside this repo.
                Internet
                    │
        ┌───────────┴───────────┐
        │ tcp/80, 443, 8883     │
        │ udp/51820             │
        ▼                       │
┌───────────────────────────────────┐
│ Cloud (central, one)              │
│   nginx-proxy ◀── 80/443/8883     │
│   wireguard-server ◀── 51820/udp  │
│   gitea, jenkins, keycloak, ...   │
│   influxdb, grafana, node-red     │
│   mqtt, postfix, portainer        │
│   sql (single point of config)    │
└───────────────┬───────────────────┘
                │ WireGuard tunnels
        ┌───────┼────────┬───────────┐
        ▼       ▼        ▼           ▼
    ┌───────┐ ┌───────┐ ┌───────┐ ┌───────┐
    │ Edge: │ │ Edge: │ │ Edge: │ │  ...  │
    │plant1 │ │plant2 │ │plant3 │ │       │
    └───┬───┘ └───────┘ └───────┘ └───────┘
        │ TLS
        ▼
    ┌──────────┐
    │ OT       │  ← out of scope of this repo
    │ OPCUA    │
    │ PLC      │
    └──────────┘

Network topology (per layer)

The cloud and edge layers each use the same four Docker networks. Only data is declared internal: true.

| Network | Purpose | Notes |
|---------|---------|-------|
| edge | Outermost. Cloud: internet-facing. Edge: plant-LAN-facing. | Only port-publishers join. |
| app | Application / automation tier (node-red, grafana, jenkins, gitea, …). | Default landing for app services. |
| data | Databases (influxdb, sql). | internal: true; no internet egress. |
| mgmt | Identity, control plane (portainer, keycloak admin, wireguard mgmt). | Restricted. |

Cloud attachments

edge   : nginx-proxy, wireguard-server
app    : nginx-proxy, mqtt, postfix, node-red, grafana,
         jenkins, gitea, keycloak
data   : influxdb, sql, grafana
mgmt   : portainer, keycloak, wireguard-server
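
A minimal sketch of how the cloud attachments above might be declared in Compose. This is a fragment, not the repository's actual compose.yml: each service body is reduced to its networks: key (image, volumes and environment are omitted), and only a few services are shown.

```yaml
# Sketch only: merge into real service definitions that also set image: or build:.
networks:
  edge: {}
  app: {}
  data:
    internal: true   # containers attached only to data have no route off the host
  mgmt: {}

services:
  nginx-proxy:
    networks: [edge, app]
  wireguard-server:
    networks: [edge, mgmt]
  node-red:
    networks: [app]          # not on data, so it cannot open a connection to influxdb
  grafana:
    networks: [app, data]    # the one service listed on both app and data
  influxdb:
    networks: [data]
  sql:
    networks: [data]
```

Read this way, the attachment list and the networks: keys carry the same information, so a review of the Compose files doubles as a review of the segmentation.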

Edge attachments

edge   : nginx-proxy                              ← plant-LAN-facing
app    : nginx-proxy, mqtt, postfix, node-red,
         grafana, keycloak, wireguard-client
data   : influxdb, grafana
mgmt   : portainer, keycloak, wireguard-client

Ingress (the only ports facing outside)

Cloud (the central host)

| Port | Container | Notes |
|------|-----------|-------|
| tcp/80 | nginx-proxy | HTTP → 301 to 443 |
| tcp/443 | nginx-proxy | All HTTPS UIs; TLS termination |
| tcp/8883 | nginx-proxy | MQTT-TLS via stream {} block, SNI route to broker |
| udp/51820 | wireguard-server | VPN tunnel ingress |

Two containers publish a total of four ports. Everything else is invisible from outside the host.
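
As a hedged illustration, the four published ports could look like this in the cloud composition. It assumes each container listens on the same port number it publishes, which the source does not state. Every other service simply has no ports: key.

```yaml
# Sketch only: the container-side port numbers are assumptions.
services:
  nginx-proxy:
    ports:
      - "80:80"             # HTTP, answered with a 301 to 443
      - "443:443"           # HTTPS UIs, TLS termination
      - "8883:8883"         # MQTT-TLS, handled by the stream {} block
  wireguard-server:
    ports:
      - "51820:51820/udp"   # VPN tunnel ingress
  mqtt:
    networks: [app]         # no ports: key, reachable only from containers on app
```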

Edge (per-plant gateway)

| Port | Container | Bound to |
|------|-----------|----------|
| tcp/80 | nginx-proxy | Plant-LAN interface only |
| tcp/443 | nginx-proxy | Plant-LAN interface only |

The edge wireguard-client initiates outbound to the cloud — it publishes no port. On-site operators reach SCADA on the plant LAN; remote ops reach the same nginx via the WG tunnel.
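
One way to bind the edge ports to the plant-LAN interface only is to prefix each mapping with that interface's IP address. The sketch below assumes a hypothetical PLANT_LAN_IP variable supplied through the site's .env file; that name does not come from the repository.

```yaml
# Sketch only: PLANT_LAN_IP is a hypothetical variable name.
services:
  nginx-proxy:
    ports:
      # host-ip:host-port:container-port; Compose aborts with this message if the variable is unset
      - "${PLANT_LAN_IP:?set PLANT_LAN_IP in .env}:80:80"
      - "${PLANT_LAN_IP:?set PLANT_LAN_IP in .env}:443:443"
  wireguard-client:
    networks: [app, mgmt]   # no ports: key, the client dials out to the cloud on udp/51820
```

Without the address prefix, Docker publishes on all host interfaces by default, which is the accidental 0.0.0.0 exposure this layout is meant to rule out.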

Why segment

  • Blast radius: a compromised node-red on app cannot reach influxdb on data unless an explicit attachment is declared. Each service's reachability is auditable from networks: alone.
  • Defense in depth: only nginx-proxy and wireguard-server bind host ports. No accidental 0.0.0.0 exposures.
  • NIS2 / utility audit: WBD is in scope as water-sector. Compose networks are a cheap way to evidence segmentation at runtime and on paper.

Special cases

Postfix (cloud + edge)

Postfix is outbound-only ("OUT ONLY"). It initiates outbound SMTP to internet MX servers and accepts no inbound mail, so it has no ingress: no published port and no internet-facing listener. It only needs egress, which it gets through host NAT because app is not an internal network.

MQTT (cloud)

The nginx-proxy stream {} block reverse-proxies tcp/8883 to the internal broker, routing by SNI. The broker itself publishes no port, which keeps the "everything through nginx" model intact.

MQTT (edge)

Broker is fully internal to the edge stack — no plant-LAN ingress. Node-RED on edge bridges OPCUA → broker, then broker → cloud broker over the WG tunnel. Field devices that need MQTT publish to the cloud broker via WG, not to the edge broker directly.

WireGuard server (cloud)

WireGuard is connectionless UDP with cryptokey-routed packets. It cannot be sensibly reverse-proxied: a proxy hop causes NAT and MTU problems and adds no security benefit. The server therefore publishes udp/51820 directly, which makes it the only non-nginx public ingress on cloud.

Conventions

  • Folder names: kebab-case (node-red, nginx-proxy, wireguard-server).
  • Compose filename: compose.yml (a default name in the Compose Specification, which prefers compose.yaml but also accepts compose.yml).
  • Composition: cloud/site composes pull stacks via include:. Stacks remain runnable standalone for testing.
  • Secrets: .env (gitignored) + .env.example (committed with placeholders).
  • Per-stack contents: compose.yml, .env.example, README.md, optional config/.
  • OT layer: out of scope; PLC + OPCUA managed in a separate process.
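
A rough sketch of the include: convention, assuming cloud/ and stacks/ are sibling folders as in the scaffold commit and that each stack keeps its compose.yml at the top of its folder. Only three of the stacks are listed here.

```yaml
# cloud/compose.yml (sketch only; paths are resolved relative to this file)
include:
  - ../stacks/nginx-proxy/compose.yml
  - ../stacks/wireguard-server/compose.yml
  - ../stacks/grafana/compose.yml
```

Because each included file is a complete Compose file in its own right, a stack can still be started alone for testing, for example with `docker compose -f stacks/grafana/compose.yml up`. The include: key needs a reasonably recent Docker Compose (2.20 or later, as far as I know).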

Open decisions

These are deferred until we build the respective stack. Tracked here so we don't forget.

  • SQL flavor: postgres / mariadb / mysql? Leaning postgres for the "single point of config" use case.
  • SSL strategy: certbot inside nginx-proxy, acme-companion sidecar, or step-ca-driven internal PKI? Probably acme-companion against Let's Encrypt for external endpoints + internal PKI for service-to-service.
  • Keycloak storage: bundled H2 (dev only) vs external SQL backend (probably the same sql stack).
  • Backup strategy for data (influxdb, sql) and for gitea and jenkins workspaces (both attach to app, not mgmt, in the lists above).
  • First site: which plant gets sites/<plant>/ scaffolded first?