znetsixe 8ab9061983 scaffold: hub-and-spoke layout, 4-network topology, 13 stack stubs

Initial structure for R&D infrastructure:

- stacks/ — 13 reusable, runnable stack stubs (kebab-case)
  cloud-and-edge: node-red, influxdb, grafana, keycloak, portainer,
                  nginx-proxy, mqtt, postfix
  cloud-only:     wireguard-server, gitea, jenkins, sql (postgres stub)
  edge-only:      wireguard-client

- cloud/ — single central hub composition with 4 networks
           (edge, app, data internal, mgmt) and include: stubs
- sites/ — per-plant edge folders (template README only for now)
- docs/architecture.md — hub-and-spoke + ingress + segmentation rationale

Network model: only nginx-proxy (80/443/8883) and wireguard-server
(51820/udp) publish ports on the cloud host. Edge nginx publishes
80/443 on plant-LAN interface only. MQTT cloud-side via nginx stream
proxy; MQTT edge-side internal-only; Postfix outbound-only.

OT layer (OPCUA, PLCs) is out of scope for this repo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-21 12:37:59 +02:00

Architecture

R&D infrastructure for Waterschap Brabantse Delta. Hub-and-spoke topology:

Cloud layer — central services, one deployment, internet-facing.
Edge layer — per-plant, plant-LAN-facing, tunneled to cloud via WireGuard.
OT layer — per-plant, behind edge. Managed outside this repo.

                Internet
                    │
        ┌───────────┴───────────┐
        │ tcp/80, 443, 8883     │
        │ udp/51820             │
        ▼                       │
┌───────────────────────────────────┐
│ Cloud (central, one)              │
│   nginx-proxy ◀── 80/443/8883     │
│   wireguard-server ◀── 51820/udp  │
│   gitea, jenkins, keycloak, ...   │
│   influxdb, grafana, node-red     │
│   mqtt, postfix, portainer        │
│   sql (single point of config)    │
└───────────────┬───────────────────┘
                │ WireGuard tunnels
        ┌───────┼────────┬───────────┐
        ▼       ▼        ▼           ▼
    ┌───────┐ ┌───────┐ ┌───────┐ ┌───────┐
    │ Edge: │ │ Edge: │ │ Edge: │ │  ...  │
    │plant1 │ │plant2 │ │plant3 │ │       │
    └───┬───┘ └───────┘ └───────┘ └───────┘
        │ TLS
        ▼
    ┌──────────┐
    │ OT       │  ← out of scope of this repo
    │ OPCUA    │
    │ PLC      │
    └──────────┘

Network topology (per layer)

The cloud layer and each edge layer use the same four Docker networks:

| Network | Purpose | Notes |
|---|---|---|
| `edge` | Outermost. Cloud: internet-facing. Edge: plant-LAN-facing. | Only port-publishers join. |
| `app` | Application / automation tier (node-red, grafana, jenkins, gitea, …). | Default landing for app services. |
| `data` | Databases (influxdb, sql). | `internal: true`: no internet egress. |
| `mgmt` | Identity, control plane (portainer, keycloak admin, wireguard mgmt). | Restricted. |

Cloud attachments

edge   : nginx-proxy, wireguard-server
app    : nginx-proxy, mqtt, postfix, node-red, grafana,
         jenkins, gitea, keycloak
data   : influxdb, sql, grafana
mgmt   : portainer, keycloak, wireguard-server
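
As a rough illustration of how these attachments could be declared, here is a minimal Compose sketch covering a subset of the cloud services. The image names and tags are assumptions for illustration only; the repo's stack stubs may use different ones.

```yaml
# cloud/compose.yml (sketch); image names are assumptions, not taken from the repo
networks:
  edge: {}          # internet-facing; only port publishers join
  app: {}           # application / automation tier
  data:
    internal: true  # no external connectivity for this network
  mgmt: {}          # identity and control plane

services:
  nginx-proxy:
    image: nginx:stable                    # assumption
    networks: [edge, app]
  grafana:
    image: grafana/grafana:latest          # assumption
    networks: [app, data]                  # bridges the app and data tiers
  influxdb:
    image: influxdb:2                      # assumption
    networks: [data]                       # unreachable from app-only services
  portainer:
    image: portainer/portainer-ce:latest   # assumption
    networks: [mgmt]
```

A service joins only the networks it lists, so a container attached to app alone has no route to influxdb.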

Edge attachments

edge   : nginx-proxy                              ← plant-LAN-facing
app    : nginx-proxy, mqtt, postfix, node-red,
         grafana, keycloak, wireguard-client
data   : influxdb, grafana
mgmt   : portainer, keycloak, wireguard-client

Ingress (the only ports facing outside)

Cloud (the central host)

| Port | Container | Notes |
|---|---|---|
| `tcp/80` | nginx-proxy | HTTP → 301 to 443 |
| `tcp/443` | nginx-proxy | All HTTPS UIs; TLS termination |
| `tcp/8883` | nginx-proxy | MQTT-TLS via `stream {}` block, SNI route to broker |
| `udp/51820` | wireguard-server | VPN tunnel ingress |

Two containers publish a total of four ports. Everything else is invisible from outside the host.
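
A minimal sketch of what that looks like in Compose terms, assuming the image names shown (they are placeholders, not confirmed by the repo). Only these two services carry a `ports:` key; every other service omits it.

```yaml
# Sketch: the only two services with published ports on the cloud host
services:
  nginx-proxy:
    image: nginx:stable              # assumption
    ports:
      - "80:80"                      # HTTP, redirected to 443
      - "443:443"                    # HTTPS UIs
      - "8883:8883"                  # MQTT over TLS, handled by the stream block
  wireguard-server:
    image: linuxserver/wireguard     # assumption
    ports:
      - "51820:51820/udp"            # VPN tunnel ingress
```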

Edge (per-plant gateway)

| Port | Container | Bound to |
|---|---|---|
| `tcp/80` | nginx-proxy | Plant-LAN interface only |
| `tcp/443` | nginx-proxy | Plant-LAN interface only |

The edge wireguard-client initiates outbound to the cloud — it publishes no port. On-site operators reach SCADA on the plant LAN; remote ops reach the same nginx via the WG tunnel.
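
One way to bind the edge ports to the plant-LAN interface only is to prefix the port mapping with a host IP. The sketch below assumes a hypothetical `PLANT_LAN_IP` variable supplied through the site's `.env`; the variable name and the image names are illustrative, not taken from the repo.

```yaml
# Sketch: edge gateway publishing on one interface only
services:
  nginx-proxy:
    image: nginx:stable              # assumption
    ports:
      # host_ip:host_port:container_port; Compose aborts if PLANT_LAN_IP is unset
      - "${PLANT_LAN_IP:?set PLANT_LAN_IP in .env}:80:80"
      - "${PLANT_LAN_IP:?set PLANT_LAN_IP in .env}:443:443"
  wireguard-client:
    image: linuxserver/wireguard     # assumption
    # no ports: the tunnel is initiated outbound to the cloud
```

Without the host IP prefix, Docker would publish on all interfaces (0.0.0.0), which is the exposure this layout is meant to avoid.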

Why segment

Blast radius: a compromised node-red on app cannot reach influxdb on data unless an explicit attachment is declared. Each service's reachability is auditable from its `networks:` entries alone.
Defense in depth: only nginx-proxy and wireguard-server bind host ports, so there are no accidental 0.0.0.0 exposures.
NIS2 / utility audit: WBD (Waterschap Brabantse Delta) is in scope as a water-sector entity. Compose networks are a cheap way to evidence segmentation both at runtime and on paper.

Special cases

Postfix (cloud + edge)

The diagram labels it "OUT ONLY". Postfix initiates outbound SMTP to internet MX servers but accepts no inbound mail from outside the host. It therefore has zero ingress: no published port and no listener facing the internet. It only needs egress, which containers on a non-internal network such as app get through host NAT.

MQTT (cloud)

The nginx-proxy stream {} block reverse-proxies tcp/8883 to the internal broker, routing by SNI. The broker itself publishes no port, which keeps the "everything through nginx" model intact.

MQTT (edge)

The broker is fully internal to the edge stack, with no plant-LAN ingress. Node-RED on edge bridges OPCUA → broker, then broker → cloud broker over the WG tunnel. Field devices that need to publish over MQTT send to the cloud broker via the WG tunnel, not to the edge broker directly.

WireGuard server (cloud)

WireGuard runs over connectionless UDP and routes packets by peer public key (cryptokey routing). It cannot be sensibly reverse-proxied: an extra hop risks NAT and MTU breakage and adds no security benefit. The server therefore publishes udp/51820 directly, the only non-nginx public ingress on cloud.

Conventions

Folder names: kebab-case (node-red, nginx-proxy, wireguard-server).
Compose filename: compose.yml (recognized by the Compose Specification, which lists compose.yaml as the preferred spelling).
Composition: the cloud and per-site compose files pull stacks in via include:. Stacks remain runnable standalone for testing.
Secrets: .env (gitignored) + .env.example (committed with placeholders).
Per-stack contents: compose.yml, .env.example, README.md, optional config/.
OT layer: out of scope; PLC + OPCUA managed in a separate process.
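
The composition convention above could look roughly like this. It is a sketch only: the `cloud/compose.yml` location and the exact set of included stacks are assumptions, and top-level `include:` needs Docker Compose 2.20 or later.

```yaml
# cloud/compose.yml (sketch); relative paths resolve from this file's directory
include:
  - ../stacks/nginx-proxy/compose.yml
  - ../stacks/wireguard-server/compose.yml
  - ../stacks/influxdb/compose.yml
  - ../stacks/grafana/compose.yml
```

Running `docker compose -f cloud/compose.yml config` prints the merged model, which is a quick way to check that every included stack still resolves. Each stack stays runnable on its own with `docker compose up` from inside its folder.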

Open decisions

These are deferred until we build the respective stack. Tracked here so we don't forget.

SQL flavor: postgres / mariadb / mysql? Leaning postgres for the "single point of config" use case.
SSL strategy: certbot inside nginx-proxy, acme-companion sidecar, or step-ca-driven internal PKI? Probably acme-companion against Let's Encrypt for external endpoints + internal PKI for service-to-service.
Keycloak storage: bundled H2 (dev only) vs external SQL backend (probably the same sql stack).
Backup strategy for databases (influxdb, sql) and for tooling state (gitea, jenkins workspaces).
First site: which plant gets sites/<plant>/ scaffolded first?
