scaffold: hub-and-spoke layout, 4-network topology, 13 stack stubs

Initial structure for R&D infrastructure:

- stacks/ — 13 reusable, runnable stack stubs (kebab-case)
  cloud-and-edge: node-red, influxdb, grafana, keycloak, portainer,
                  nginx-proxy, mqtt, postfix
  cloud-only:     wireguard-server, gitea, jenkins, sql (postgres stub)
  edge-only:      wireguard-client

- cloud/ — single central hub composition with 4 networks
           (edge, app, data internal, mgmt) and include: stubs
- sites/ — per-plant edge folders (template README only for now)
- docs/architecture.md — hub-and-spoke + ingress + segmentation rationale

Network model: only nginx-proxy (80/443/8883) and wireguard-server
(51820/udp) publish ports on the cloud host. Edge nginx publishes
80/443 on plant-LAN interface only. MQTT cloud-side via nginx stream
proxy; MQTT edge-side internal-only; Postfix outbound-only.

OT layer (OPCUA, PLCs) is out of scope for this repo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
znetsixe
2026-05-21 12:37:59 +02:00
commit 8ab9061983
46 changed files with 823 additions and 0 deletions

docs/architecture.md Normal file

@@ -0,0 +1,135 @@
# Architecture
R&D infrastructure for Waterschap Brabantse Delta. Hub-and-spoke topology:
- **Cloud** layer — central services, one deployment, internet-facing.
- **Edge** layer — per-plant, plant-LAN-facing, tunneled to cloud via WireGuard.
- **OT** layer — per-plant, behind edge. Managed **outside** this repo.
```
              Internet
      ┌───────────┴───────────┐
      │ tcp/80, 443, 8883     │
      │ udp/51820             │
      ▼                       │
┌───────────────────────────────────┐
│ Cloud (central, one)              │
│ nginx-proxy      ◀── 80/443/8883  │
│ wireguard-server ◀── 51820/udp    │
│ gitea, jenkins, keycloak, ...     │
│ influxdb, grafana, node-red       │
│ mqtt, postfix, portainer          │
│ sql (single point of config)      │
└───────────────┬───────────────────┘
                │ WireGuard tunnels
      ┌─────────┼─────────┬─────────┐
      ▼         ▼         ▼         ▼
  ┌───────┐ ┌───────┐ ┌───────┐ ┌───────┐
  │ Edge: │ │ Edge: │ │ Edge: │ │  ...  │
  │plant1 │ │plant2 │ │plant3 │ │       │
  └───┬───┘ └───────┘ └───────┘ └───────┘
      │ TLS
 ┌────┴─────┐
 │    OT    │ ← out of scope of this repo
 │  OPCUA   │
 │   PLC    │
 └──────────┘
```
## Network topology (per layer)
Each layer (cloud and edge) uses **four Docker networks**. Only `data` is declared `internal: true`:
| Network | Purpose | Notes |
|---|---|---|
| `edge` | Outermost. Cloud: internet-facing. Edge: plant-LAN-facing. | Only port-publishers join. |
| `app` | Application / automation tier (node-red, grafana, jenkins, gitea, …). | Default landing for app services. |
| `data` | Databases (influxdb, sql). | `internal: true` — no internet egress. |
| `mgmt` | Identity, control plane (portainer, keycloak admin, wireguard mgmt). | Restricted. |
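
As a minimal sketch of the table above, the four networks could be declared at the top level of a layer's `compose.yml` roughly as follows. The network names come from the table; relying on default drivers and subnets is an assumption, not the repository's actual configuration.

```yaml
# Sketch only: top-level network declarations for one layer (cloud or edge).
networks:
  edge: {}          # outermost; only port-publishing services attach here
  app: {}           # application / automation tier
  data:
    internal: true  # externally isolated: no internet egress
  mgmt: {}          # identity and control plane; keep membership small
```

Because `internal: true` removes external connectivity, a service attached only to `data` can neither be reached from outside nor reach out itself.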
### Cloud attachments
```
edge : nginx-proxy, wireguard-server
app : nginx-proxy, mqtt, postfix, node-red, grafana,
jenkins, gitea, keycloak
data : influxdb, sql, grafana
mgmt : portainer, keycloak, wireguard-server
```
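
The attachment list above maps directly onto per-service `networks:` keys. The fragment below is a hedged illustration for a handful of cloud services. It is not a complete Compose file: `image:` and every other service setting are deliberately omitted, because the stack stubs define them.

```yaml
# Sketch only: network attachments for some cloud services.
# Fragment, not runnable on its own (no image: or build: keys).
services:
  nginx-proxy:
    networks: [edge, app]   # the only service bridging edge and app
  node-red:
    networks: [app]         # no attachment to data
  grafana:
    networks: [app, data]   # needs to query the databases
  influxdb:
    networks: [data]        # reachable only from services on data
  keycloak:
    networks: [app, mgmt]
  portainer:
    networks: [mgmt]
```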
### Edge attachments
```
edge : nginx-proxy ← plant-LAN-facing
app : nginx-proxy, mqtt, postfix, node-red,
grafana, keycloak, wireguard-client
data : influxdb, grafana
mgmt : portainer, keycloak, wireguard-client
```
## Ingress (the only ports facing outside)
### Cloud (the central host)
| Port | Container | Notes |
|---|---|---|
| `tcp/80` | nginx-proxy | HTTP → 301 to 443 |
| `tcp/443` | nginx-proxy | All HTTPS UIs; TLS termination |
| `tcp/8883` | nginx-proxy | MQTT-TLS via `stream {}` block, SNI route to broker |
| `udp/51820` | wireguard-server | VPN tunnel ingress |
Two containers publish a total of four ports. **Everything else is invisible** from outside the host.
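
In Compose terms, the table corresponds to exactly two `ports:` sections. The sketch below assumes the container-side ports match the host-side ports, which the source does not state.

```yaml
# Sketch only: the two cloud services that publish host ports.
# Container-side port numbers are assumed equal to the host-side ones.
services:
  nginx-proxy:
    ports:
      - "80:80"        # HTTP, redirected to 443
      - "443:443"      # HTTPS UIs, TLS termination
      - "8883:8883"    # MQTT-TLS, handled by the stream {} block
  wireguard-server:
    ports:
      - "51820:51820/udp"   # VPN tunnel ingress
# No other service declares ports:, so nothing else is bound on the host.
```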
### Edge (per-plant gateway)
| Port | Container | Bound to |
|---|---|---|
| `tcp/80` | nginx-proxy | Plant-LAN interface only |
| `tcp/443` | nginx-proxy | Plant-LAN interface only |
The edge `wireguard-client` initiates outbound to the cloud — it publishes **no port**. On-site operators reach SCADA on the plant LAN; remote ops reach the same nginx via the WG tunnel.
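
Binding to the plant-LAN interface only can be expressed by prefixing the published port with a host IP. The sketch below uses `PLANT_LAN_IP`, a hypothetical variable name that would live in the site's `.env`. The `:?` form makes Compose fail with the given message if the variable is unset or empty, rather than silently binding elsewhere.

```yaml
# Sketch only: edge nginx-proxy bound to the plant-LAN address.
# PLANT_LAN_IP is a hypothetical name, expected in the site's .env file.
services:
  nginx-proxy:
    ports:
      - "${PLANT_LAN_IP:?set PLANT_LAN_IP in .env}:80:80"
      - "${PLANT_LAN_IP:?set PLANT_LAN_IP in .env}:443:443"
# wireguard-client declares no ports: because it only dials out to the cloud.
```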
## Why segment
- **Blast radius**: a compromised node-red on `app` cannot reach influxdb on `data` unless an explicit attachment is declared. Each service's reachability is auditable from `networks:` alone.
- **Defense in depth**: only nginx-proxy and wireguard-server bind host ports. No accidental `0.0.0.0` exposures.
- **NIS2 / utility audit**: WBD is in scope as water-sector. Compose networks are a cheap way to evidence segmentation at runtime and on paper.
## Special cases
### Postfix (cloud + edge)
The diagram labels it "OUT ONLY". Postfix initiates outbound SMTP to internet MX servers and accepts **no inbound** mail, so it has no published port and no internet-facing listener. It needs only egress, which it gets via host NAT on the non-internal `app` network (containers attached solely to the internal `data` network have none).
### MQTT (cloud)
The nginx-proxy `stream {}` block reverse-proxies `tcp/8883` to the internal broker, routing by SNI. The broker itself has **no published port**, which keeps the "everything through nginx" model intact.
### MQTT (edge)
Broker is **fully internal** to the edge stack — no plant-LAN ingress. Node-RED on edge bridges OPCUA → broker, then broker → cloud broker over the WG tunnel. Field devices that need MQTT publish to the cloud broker via WG, not to the edge broker directly.
### WireGuard server (cloud)
WireGuard is a connectionless UDP protocol that routes packets by peer public key (cryptokey routing). It cannot be sensibly reverse-proxied: a proxy breaks NAT and MTU handling and adds no security benefit. The server therefore publishes `udp/51820` directly, the **only** non-nginx public ingress on cloud.
## Conventions
- **Folder names**: kebab-case (`node-red`, `nginx-proxy`, `wireguard-server`).
- **Compose filename**: `compose.yml` (a default filename under the Compose Specification, alongside the preferred `compose.yaml`).
- **Composition**: cloud/site composes pull stacks via `include:`. Stacks remain runnable standalone for testing.
- **Secrets**: `.env` (gitignored) + `.env.example` (committed with placeholders).
- **Per-stack contents**: `compose.yml`, `.env.example`, `README.md`, optional `config/`.
- **OT layer**: out of scope; PLC + OPCUA managed in a separate process.
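
The composition convention can be sketched as below. The relative paths assume the layout described in the commit summary (`stacks/<name>/compose.yml` next to `cloud/`), and the three stacks shown are an arbitrary selection.

```yaml
# cloud/compose.yml (sketch only; paths assume stacks/ sits next to cloud/)
include:
  - ../stacks/nginx-proxy/compose.yml
  - ../stacks/wireguard-server/compose.yml
  - ../stacks/influxdb/compose.yml
```

Each included file remains a normal Compose file, so a stack can still be started on its own from its folder for testing. `include:` is a relatively recent addition to Docker Compose (the 2.20 series), so older installations will reject it.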
## Open decisions
These are deferred until we build the respective stack. Tracked here so we don't forget.
- **SQL flavor**: postgres / mariadb / mysql? Leaning postgres for the "single point of config" use case.
- **SSL strategy**: certbot inside nginx-proxy, acme-companion sidecar, or step-ca-driven internal PKI? Probably acme-companion against Let's Encrypt for external endpoints + internal PKI for service-to-service.
- **Keycloak storage**: bundled H2 (dev only) vs external SQL backend (probably the same `sql` stack).
- **Backup strategy** for `data` (influxdb, sql) and `app` (gitea, jenkins workspaces).
- **First site**: which plant gets `sites/<plant>/` scaffolded first?