infra/README.md
znetsixe 4117ec6063 feat(cloud): single-shot deploy.sh + FROST stack + healthchecks
Stage 5 — make the cloud composition spin up in one command and add
the SensorThings (FROST) stack as a fully segregated tenant.

cloud/deploy.sh — idempotent, 7-step bring-up:
  preflight → validate → up + wait → cert state → issue/renew →
  service status → endpoint smoke test. Reissues LE cert only when
  current issuer no longer matches ACME_CA_URI. Move-aside-then-
  restore-on-failure so the bootstrap cert survives a failed certbot.

stacks/frost — new stack, segregated from shared sql/rabbitmq:
  - dedicated postgis container (frost-db)
  - dedicated internal mosquitto bus (frost-mosquitto)
  - frost-http + frost-mqtt on a private frost-internal network,
    joined to cloud-app only for nginx ingress at frost.wbd-rd.nl
  - shared mosquitto stack deleted; rabbitmq remains the only public
    MQTT broker (mqtt.wbd-rd.nl:8883 via stream proxy)

stacks/sql — pg_isready healthcheck so keycloak/gitea/mlflow can gate
on service_healthy via cloud-level depends_on overrides.

stacks/nginx-proxy:
  - nginx-init service generates a self-signed bootstrap cert on
    fresh deploy so nginx starts before certbot has issued a real one
  - frost.wbd-rd.nl vhost (/FROST-Server → frost-http:8080,
    /mqtt → frost-mqtt:9876 WebSocket)

stacks/mlflow — custom Dockerfile (upstream + psycopg2-binary) so the
official image can speak to the shared sql backend.

stacks/jupyterhub — DummyAuthenticator stub gated by
JUPYTERHUB_ADMIN_PASSWORD; TODO comments point at OIDC + DockerSpawner.

stacks/rabbitmq — config/{enabled_plugins,rabbitmq.conf} stubs
(management + mqtt plugins, MQTT auth required).

stacks/portainer — ports unpublished; nginx now the only ingress.

stacks/node-red — pin to 4.1 (the floating "4" tag does not exist).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 16:37:58 +02:00

# infra
R&D infrastructure stacks for Waterschap Brabantse Delta. Hub-and-spoke deployment: one central **cloud** hub plus one **edge** site per plant.
## Layout
```
infra/
├── stacks/ # reusable, runnable stack defs (kebab-case)
├── cloud/ # the single central hub
├── sites/ # per-plant edge deployments
└── docs/ # architecture + conventions
```
Stacks are pulled into the cloud and site compose files via the Compose Spec `include:` directive. Each stack can also be run standalone for testing.
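As a rough sketch of the two modes described above, the helpers below start one stack on its own and render the merged cloud configuration with every `include:` resolved. The function names are hypothetical, and the paths `stacks/<name>/compose.yml` and `cloud/compose.yml` are assumptions based on the layout and naming conventions in this README; run them from the repository root with a Docker Compose version that supports `include:`.

```bash
# Hypothetical helper: bring up a single stack on its own for testing.
# Assumes each stack lives at stacks/<name>/compose.yml.
run_stack_standalone() {
  stack="$1"
  docker compose -f "stacks/${stack}/compose.yml" up -d
}

# Hypothetical helper: print the fully merged cloud configuration
# (includes resolved, variables interpolated) without starting anything.
# Assumes the hub compose file is cloud/compose.yml.
render_cloud_config() {
  docker compose -f cloud/compose.yml config
}
```

For example, `run_stack_standalone grafana` would start only the grafana stack, while `render_cloud_config` is a cheap way to confirm that every included stack file parses before anything is started.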
## Quick start
```bash
# Cloud hub (run on the central server)
cd cloud
cp .env.example .env # fill in real secrets
docker compose up -d
# A plant edge (run on the edge gateway at the plant)
cd sites/<plant>
cp .env.example .env
docker compose up -d
```
## Stacks
| Stack | Purpose | Cloud | Edge |
|---|---|:---:|:---:|
| node-red | Flow-based automation | ✓ | ✓ |
| influxdb | Time-series database | ✓ | ✓ |
| grafana | Dashboards / SCADA | ✓ | ✓ |
| keycloak | Identity / SSO | ✓ | ✓ |
| portainer | Container management UI | ✓ | ✓ |
| nginx-proxy | Stock nginx + certbot sidecar | ✓ | ✓ |
| rabbitmq | General-purpose broker (AMQP + MQTT plugin) | ✓ | ✓ |
| postfix | Outbound mail relay | ✓ | ✓ |
| wireguard-server | VPN server | ✓ | — |
| wireguard-client | VPN client | — | ✓ |
| gitea | Git server (HTTPS-only) | ✓ | — |
| jenkins | CI/CD | ✓ | — |
| sql | Config DB (postgres 16) | ✓ | — |
| mlflow | ML experiment tracking + registry | ✓ | — |
| jupyterhub | Multi-user notebook server | ✓ | — |
| frost | OGC SensorThings API (postgis + dedicated bus) | ✓ | — |
## Sites
| Site | Status |
|---|---|
| gemaal1 | Scaffolded — awaiting hardware provisioning |
## Design
See [`docs/architecture.md`](docs/architecture.md) for the hub-and-spoke topology, 4-network model, ingress table, and the reasoning behind each choice.
## Conventions
- kebab-case folder names
- `compose.yml` (Compose Spec), not `docker-compose.yml`
- Stack composes pulled into cloud/site via `include:`
- Secrets in `.env` files (gitignored); `.env.example` committed with placeholders
- OT layer (OPC UA, PLCs) is **out of scope** for this repo
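
The `.env` / `.env.example` convention above lends itself to a quick pre-deploy check. The following is a minimal sketch, not part of this repo: `env_keys` and `missing_env_keys` are hypothetical helper names, and it assumes both files use plain `KEY=value` lines.

```bash
# Hypothetical helper: list the variable names defined in an env file,
# ignoring comments and blank lines.
env_keys() {
  grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$1" | cut -d= -f1 | sort -u
}

# Hypothetical helper: print every key that appears in the example file
# but has no KEY= line in the real env file.
missing_env_keys() {
  for key in $(env_keys "$1"); do
    grep -q "^${key}=" "$2" || printf '%s\n' "$key"
  done
}
```

A possible use is `missing_env_keys cloud/.env.example cloud/.env` before bringing the hub up. Empty output means every placeholder key has a counterpart in `.env`; it does not check that the values are real secrets.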