feat(sso): wire Keycloak SSO end-to-end across all apps

New stack:
- stacks/oauth2-proxy/ — per-app sidecars (mlflow, portainer, rabbitmq)
  that gate vhosts via nginx auth_request against Keycloak's wbd realm.

Native OIDC wired into:
- grafana       (generic_oauth, role-attribute-path → Admin/Editor/Viewer)
- jupyterhub    (oauthenticator.GenericOAuthenticator)
- node-red      (passport-openidconnect; in-memory state store + users()
                 resolver because adminAuth doesn't expose req.session)
- jenkins       (oic-auth plugin via JCasC; matrix-auth for authz; setup
                 wizard suppressed; custom image with plugins.txt)

Infra fixes uncovered while bringing the above online:
- nginx-proxy: bump proxy_buffer_size to 16k so oauth2-proxy callbacks
  don't 502 on the JWT-bearing Set-Cookie header.
- nginx-proxy: add `resolver 127.0.0.11 valid=30s` so service names
  re-resolve after sidecar recreates (was cross-wiring oauth2-proxy
  upstreams after restart).
- jupyterhub: pass --allow-root to the singleuser spawner (hub runs as
  root inside its container; jupyter-server refused root without flag).
- jupyterhub Dockerfile: install jupyterlab + notebook so
  SimpleLocalProcessSpawner has something to launch.
- node-red Dockerfile: install passport-openidconnect into the image
  so settings.js can require() it.
- portainer: pre-seed local admin via --admin-password=<bcrypt-hash>
  so the 5-minute "no admin → lockout" timer can never trigger.
- deploy.sh: restore executable bit (was 644 in repo).

Admin/viewer policy:
- Created realm role `app-admin` in keycloak wbd realm.
- Grafana maps app-admin → Admin (default Viewer).
- Jenkins matrix-auth grants r.de.ren Overall/Administer, authenticated
  users get Overall/Read + Job/Read + View/Read.
- Node-RED: NODERED_ADMIN_USERS env list → permissions "*", others
  ["read"]. (TODO: switch to app-admin realm role.)
- JupyterHub: JUPYTERHUB_ADMIN_USERS env list. (Same TODO.)
- Gitea: r.de.ren pre-created as local admin; OIDC auto-links via email.

Docs:
- README, cloud/README, stacks/oauth2-proxy/README, and per-stack
  READMEs updated to reflect the new state and remove resolved TODOs.
- cloud/.env.example gains all the new OIDC client + cookie-secret keys.
- cloud/README documents the full kcadm realm bootstrap, including the
  hardcoded-audience mapper and post-logout redirect URIs that are
  non-obvious gotchas.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 18:34:37 +00:00
parent c91b475cd1
commit 33a794e35d
31 changed files with 833 additions and 98 deletions

View File

@@ -20,7 +20,7 @@ Stacks are pulled into the cloud and site composes via the Compose Spec `include
 # Cloud hub (run on the central server)
 cd cloud
 cp .env.example .env # fill in real secrets
-docker compose up -d
+./deploy.sh # one-shot bring-up + Let's Encrypt + smoke test
 # A plant edge (run on the edge gateway at the plant)
 cd sites/<plant>
@@ -28,6 +28,8 @@ cp .env.example .env
 docker compose up -d
 ```
+After `deploy.sh` finishes, see [`cloud/README.md`](cloud/README.md) for the one-time Keycloak realm bootstrap that wires every app to Keycloak SSO.
 ## Stacks
 | Stack | Purpose | Cloud | Edge |
@@ -48,6 +50,7 @@ docker compose up -d
 | mlflow | ML experiment tracking + registry | ✓ | — |
 | jupyterhub | Multi-user notebook server | ✓ | — |
 | frost | OGC SensorThings API (postgis + dedicated bus) | ✓ | — |
+| oauth2-proxy | Keycloak SSO gate (auth_request sidecar) for apps without native OIDC | ✓ | — |
 ## Sites

View File

@@ -14,7 +14,9 @@ ACME_CA_URI=https://acme-v02.api.letsencrypt.org/directory
 WG_SERVER_PORT=51820
 WG_SERVER_PUBLIC_HOST=
-# Keycloak (admin bootstrap + DB)
+# Keycloak (admin bootstrap + DB). KEYCLOAK_ADMIN/_PASSWORD are the master-realm
+# emergency admin — only used the first time the container starts. The wbd
+# realm + per-app OIDC clients are bootstrapped via kcadm in stacks/keycloak/README.md.
 KEYCLOAK_ADMIN=admin
 KEYCLOAK_ADMIN_PASSWORD=
 KEYCLOAK_HOSTNAME=auth.wbd-rd.nl
@@ -58,21 +60,65 @@ GITEA_OAUTH_CLIENT_SECRET=
 GITEA_OAUTH_DISCOVERY_URL=https://auth.wbd-rd.nl/realms/wbd/.well-known/openid-configuration
 GITEA_MAIL_FROM=gitea@wbd-rd.nl
-# Jenkins
+# Jenkins (authn via Keycloak OIDC + JCasC matrix-auth; see stacks/jenkins/config/jenkins.yaml)
 JENKINS_ADMIN_USER=admin
 JENKINS_ADMIN_PASSWORD=
JENKINS_OAUTH_CLIENT_ID=jenkins
JENKINS_OAUTH_CLIENT_SECRET=
JENKINS_OAUTH_DISCOVERY_URL=https://auth.wbd-rd.nl/realms/wbd/.well-known/openid-configuration
-# MLflow (uses sql backend)
+# MLflow (uses sql backend; gated by oauth2-proxy at the nginx layer)
 MLFLOW_DB_NAME=mlflow
 MLFLOW_DB_USER=mlflow
 MLFLOW_DB_PASSWORD=
MLFLOW_OAUTH_CLIENT_ID=mlflow
MLFLOW_OAUTH_CLIENT_SECRET=
MLFLOW_OAUTH_DISCOVERY_URL=https://auth.wbd-rd.nl/realms/wbd/.well-known/openid-configuration
-# JupyterHub
-# STUB AUTH: DummyAuthenticator. Set a strong shared password — any username + this password logs in.
-# Replace with Keycloak OIDC (GenericOAuthenticator) before exposing to users beyond the cloud operator.
+# JupyterHub (Keycloak OIDC via oauthenticator.GenericOAuthenticator)
+# JUPYTERHUB_ADMIN_USERS: comma-separated emails of users who get the Hub admin UI.
 JUPYTER_NOTEBOOK_IMAGE=jupyter/datascience-notebook:latest
 JUPYTERHUB_ADMIN_USERS=
-JUPYTERHUB_ADMIN_PASSWORD=
+JUPYTERHUB_OAUTH_CLIENT_ID=jupyterhub
JUPYTERHUB_OAUTH_CLIENT_SECRET=
JUPYTERHUB_OAUTH_DISCOVERY_URL=https://auth.wbd-rd.nl/realms/wbd/.well-known/openid-configuration
# Node-RED (passport-openidconnect; only the editor is gated)
# NODERED_ADMIN_USERS: comma-separated usernames/emails — get permissions "*".
# Everyone else in the realm gets ["read"] (can view flows, can't deploy).
NODERED_OAUTH_CLIENT_ID=node-red
NODERED_OAUTH_CLIENT_SECRET=
NODERED_OAUTH_DISCOVERY_URL=https://auth.wbd-rd.nl/realms/wbd/.well-known/openid-configuration
NODERED_ADMIN_USERS=
# Grafana OIDC (generic_oauth → Keycloak wbd realm)
GRAFANA_OAUTH_CLIENT_ID=grafana
GRAFANA_OAUTH_CLIENT_SECRET=
GRAFANA_OAUTH_DISCOVERY_URL=https://auth.wbd-rd.nl/realms/wbd/.well-known/openid-configuration
# Portainer-CE — gated by oauth2-proxy at nginx; local admin pre-seeded via bcrypt hash.
# Generate hash with: docker run --rm python:3.13-alpine sh -c "pip install -q bcrypt;
# python -c \"import bcrypt; print(bcrypt.hashpw(b'<password>', bcrypt.gensalt()).decode())\""
# Then double every $ in the hash before pasting (compose interpolates single $).
PORTAINER_ADMIN_PASSWORD=
PORTAINER_ADMIN_PASSWORD_HASH=
PORTAINER_OAUTH_CLIENT_ID=portainer-ce
PORTAINER_OAUTH_CLIENT_SECRET=
PORTAINER_OAUTH_DISCOVERY_URL=https://auth.wbd-rd.nl/realms/wbd/.well-known/openid-configuration
# RabbitMQ management UI — gated by oauth2-proxy at nginx. Users still log into
# the RabbitMQ UI with the local broker credentials above; SSO only gates access
# to the page itself.
RABBITMQ_OAUTH_CLIENT_ID=rabbitmq
RABBITMQ_OAUTH_CLIENT_SECRET=
RABBITMQ_OAUTH_DISCOVERY_URL=https://auth.wbd-rd.nl/realms/wbd/.well-known/openid-configuration
# oauth2-proxy cookie secrets (per-app). 16, 24, or 32 raw bytes — easiest is
# 32 hex chars (i.e. `openssl rand -hex 16`). Different per app to keep cookies
# scoped and to limit blast radius if one ever leaks.
OAUTH2_PROXY_COOKIE_SECRET_MLFLOW=
OAUTH2_PROXY_COOKIE_SECRET_PORTAINER=
OAUTH2_PROXY_COOKIE_SECRET_RABBITMQ=
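The two manual steps described in the comments above (one cookie secret per app, and doubling every `$` in the Portainer bcrypt hash) can be sketched in Python. This is a minimal illustration, not part of the commit: the helper names `new_cookie_secret` and `escape_for_compose` are hypothetical, and the hash shown is a placeholder rather than a real credential.

```python
import secrets


def new_cookie_secret() -> str:
    """Return 32 hex characters (16 random bytes), like `openssl rand -hex 16`."""
    return secrets.token_hex(16)


def escape_for_compose(bcrypt_hash: str) -> str:
    """Double every '$' so Compose does not try to interpolate it as a variable."""
    return bcrypt_hash.replace("$", "$$")


if __name__ == "__main__":
    # One independent secret per oauth2-proxy sidecar, as the .env comment recommends.
    for app in ("MLFLOW", "PORTAINER", "RABBITMQ"):
        print(f"OAUTH2_PROXY_COOKIE_SECRET_{app}={new_cookie_secret()}")
    # Placeholder value for illustration only; paste the real bcrypt output here.
    placeholder_hash = "$2b$12$placeholderplaceholderplaceholder"
    print("PORTAINER_ADMIN_PASSWORD_HASH=" + escape_for_compose(placeholder_hash))
```

Generating each secret separately keeps the per-app scoping the comment asks for; the escaping helper only rewrites the string and never touches the password itself.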
 # FROST (SensorThings — dedicated postgis + internal mosquitto bus, ingressed at sta.wbd-rd.nl)
 FROST_DB_PASSWORD=

View File

@@ -4,10 +4,12 @@ The single central hub. One deployment, internet-facing.
 ## What runs here
-nginx-proxy, wireguard-server, keycloak, portainer, influxdb, grafana, node-red, mqtt, postfix, gitea, jenkins, sql.
+nginx-proxy, wireguard-server, keycloak, oauth2-proxy, portainer, influxdb, grafana, node-red, rabbitmq, postfix, gitea, jenkins, mlflow, jupyterhub, frost, sql.
 See [`../docs/architecture.md`](../docs/architecture.md) for the full network topology and ingress table.
+Every human-accessible UI is gated by **Keycloak SSO** (wbd realm). Apps with native OIDC (gitea, grafana, node-red, jenkins, jupyterhub) speak OIDC directly; apps without (mlflow, portainer, rabbitmq) are gated by an `oauth2-proxy` sidecar via nginx `auth_request`. See the [`Keycloak bootstrap`](#keycloak-realm-bootstrap-one-time) section below.
 ## Run
```bash ```bash
@@ -43,6 +45,80 @@ The script reissues the cert **only** when the CA in `.env` changes (e.g. stagin
 Everything else stays on the internal `app` / `data` / `mgmt` networks.
## Keycloak realm bootstrap (one-time)
After `deploy.sh` succeeds, Keycloak is up at `https://auth.wbd-rd.nl/` with only the master realm. You need to create the `wbd` realm + per-app OIDC clients before SSO works. Driven entirely by `kcadm.sh` inside the keycloak container, so it's reproducible:
```bash
cd cloud
set -a && . ./.env && set +a
KC="docker compose exec -T keycloak /opt/keycloak/bin/kcadm.sh"
# 1. Authenticate against master realm
$KC config credentials --server http://localhost:8080 --realm master \
--user "$KEYCLOAK_ADMIN" --password "$KEYCLOAK_ADMIN_PASSWORD"
# 2. Create the realm
$KC create realms -r master \
-s realm=wbd -s enabled=true -s displayName="WBD R&D" \
-s registrationAllowed=false -s resetPasswordAllowed=true -s rememberMe=true
# 3. Create an OIDC client per app. Pattern:
# clientId = stable lowercase name (matches *_OAUTH_CLIENT_ID in .env)
# redirectUri = the app's documented OIDC callback URL
# Hardcoded audience mapper is critical for oauth2-proxy clients — without
# it the access token's aud will be [realm-management, account] and
# oauth2-proxy will 500 on the callback.
create_client() {
local CID=$1 REDIRECT=$2 SECRET=$3
local ID=$($KC create clients -r wbd \
-s clientId=$CID -s enabled=true -s protocol=openid-connect \
-s publicClient=false -s standardFlowEnabled=true \
-s "redirectUris=[\"$REDIRECT\"]" \
-s "attributes.\"post.logout.redirect.uris\"=\"$(printf '%s' "$REDIRECT" | cut -d/ -f1-3)/*\"" \
${SECRET:+-s secret="$SECRET"} -i)
$KC create clients/$ID/protocol-mappers/models -r wbd \
-s name=audience-self -s protocol=openid-connect \
-s protocolMapper=oidc-audience-mapper \
-s "config.\"included.client.audience\"=$CID" \
-s "config.\"access.token.claim\"=true" -s "config.\"id.token.claim\"=false"
echo "$CID -> $ID"
}
create_client gitea "https://git.wbd-rd.nl/user/oauth2/keycloak/callback" "$GITEA_OAUTH_CLIENT_SECRET"
create_client grafana "https://dash.wbd-rd.nl/login/generic_oauth" "$GRAFANA_OAUTH_CLIENT_SECRET"
create_client node-red "https://flow.wbd-rd.nl/auth/strategy/callback/" "$NODERED_OAUTH_CLIENT_SECRET"
create_client jenkins "https://ci.wbd-rd.nl/securityRealm/finishLogin" "$JENKINS_OAUTH_CLIENT_SECRET"
create_client jupyterhub "https://hub.wbd-rd.nl/hub/oauth_callback" "$JUPYTERHUB_OAUTH_CLIENT_SECRET"
create_client mlflow "https://ml.wbd-rd.nl/oauth2/callback" "$MLFLOW_OAUTH_CLIENT_SECRET"
create_client portainer-ce "https://ops.wbd-rd.nl/oauth2/callback" "$PORTAINER_OAUTH_CLIENT_SECRET"
create_client rabbitmq "https://mq.wbd-rd.nl/oauth2/callback" "$RABBITMQ_OAUTH_CLIENT_SECRET"
# 4. Create the realm role used for cross-app admin promotion
$KC create roles -r wbd -s name=app-admin \
-s "description=Grants admin perms across all wbd-realm apps that recognise this role"
# 5. Create the first operator user and grant realm-admin + app-admin
$KC create users -r wbd \
-s username=r.de.ren -s email=r.de.ren@brabantsedelta.nl -s emailVerified=true \
-s firstName=R -s lastName='de Ren' -s enabled=true \
-s 'requiredActions=["UPDATE_PASSWORD"]'
$KC set-password -r wbd --username r.de.ren --new-password '<initial-temp-password>' --temporary
$KC add-roles -r wbd --uusername r.de.ren --cclientid realm-management --rolename realm-admin
$KC add-roles -r wbd --uusername r.de.ren --rolename app-admin
# 6. Final per-app wiring
# - Gitea: run `gitea admin user create --admin --username r.de.ren ...` then
# `gitea admin auth add-oauth --name keycloak ...` (see stacks/gitea/README.md)
# - Everything else picks up its OIDC config from .env on next start.
docker compose restart grafana node-red jenkins jupyterhub
```
After that, every vhost in the smoke-test table redirects unauthenticated users to Keycloak. New teammates: add a user in `https://auth.wbd-rd.nl/admin/wbd/console/` → Users → Add user; default permissions are viewer/read-only across all apps until you also assign them the `app-admin` realm role.
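The audience gotcha noted in step 3 can be checked by looking at the `aud` claim of an access token issued for a client. The following is a minimal debugging sketch under stated assumptions: it decodes the JWT payload without verifying the signature, so it is only suitable for inspection, and the function names are hypothetical helpers rather than anything shipped in this repo.

```python
import base64
import json


def jwt_claims(token: str) -> dict:
    """Decode a JWT payload WITHOUT verifying the signature (inspection only)."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore the base64 padding JWTs strip
    return json.loads(base64.urlsafe_b64decode(payload))


def audiences(claims: dict) -> list[str]:
    """Normalise `aud`, which may be a single string or a list of strings."""
    aud = claims.get("aud", [])
    return [aud] if isinstance(aud, str) else list(aud)


def has_audience(token: str, client_id: str) -> bool:
    """True when the token lists client_id as an audience."""
    return client_id in audiences(jwt_claims(token))
```

If `has_audience(token, "mlflow")` is false for a token obtained through the `mlflow` client, the audience mapper from `create_client` is probably missing on that client.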
 ## Adding a new stack
 1. Create `stacks/<name>/` with `compose.yml`, `.env.example`, `README.md`.

View File

@@ -23,6 +23,7 @@ include:
 - ../stacks/portainer/compose.yml
 # Identity + VPN
 - ../stacks/keycloak/compose.yml
+- ../stacks/oauth2-proxy/compose.yml
 - ../stacks/wireguard-server/compose.yml
 # Data
 - ../stacks/influxdb/compose.yml

cloud/deploy.sh Normal file → Executable file
View File

View File

@@ -5,4 +5,33 @@ Dashboard UI. Cloud-side = central observability. Edge-side = plant-local SCADA.
 - **Networks**: `app` (reachable from nginx-proxy) + `data` (queries influxdb)
 - **Volume**: `grafana-data` (sqlite, plugins, sessions)
 - **Config**: `./config/provisioning` (datasources + dashboards as code) — add once SQL/Influx are wired
-- **TODO**: Keycloak OIDC, datasource provisioning, dashboard-as-code baseline
+- **Hostname**: `dash.wbd-rd.nl`
## Auth
Keycloak OIDC via Grafana's `generic_oauth` provider. All env-driven (see `cloud/.env.example`):
| Env var | Purpose |
|---|---|
| `GRAFANA_OAUTH_CLIENT_ID` / `_SECRET` | Keycloak `grafana` client credentials |
| `GF_AUTH_GENERIC_OAUTH_*` | Provider URLs + claim mapping (set inline in `compose.yml`) |
### Role mapping
`GF_AUTH_GENERIC_OAUTH_ROLE_ATTRIBUTE_PATH` interprets the Keycloak `realm_access.roles` claim:
| Keycloak realm role | Grafana org role |
|---|---|
| `app-admin` *or* `grafana-admin` | Admin |
| `grafana-editor` | Editor |
| (none of the above) | Viewer |
So a fresh teammate added to the `wbd` realm lands as a Viewer. Grant them `app-admin` (or `grafana-editor`) in Keycloak to promote.
`GF_AUTH_GENERIC_OAUTH_ALLOW_SIGN_UP=true` auto-creates the Grafana user on first OIDC login.
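The role-mapping table above can be read as a simple first-match rule over the `realm_access.roles` claim. The sketch below restates that rule in Python for clarity; it is an illustration of the mapping, not how Grafana evaluates the JMESPath expression internally, and `grafana_role` is a hypothetical name.

```python
def grafana_role(claims: dict) -> str:
    """Map Keycloak realm roles to a Grafana org role, first match wins."""
    roles = (claims.get("realm_access") or {}).get("roles") or []
    if "app-admin" in roles or "grafana-admin" in roles:
        return "Admin"
    if "grafana-editor" in roles:
        return "Editor"
    return "Viewer"  # default for any other authenticated realm user
```

A user holding both `grafana-editor` and `app-admin` resolves to Admin, because the admin check comes first, which matches the order of the terms in `GF_AUTH_GENERIC_OAUTH_ROLE_ATTRIBUTE_PATH`.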
## TODO
- Datasource provisioning (`./config/provisioning/datasources/` — Influx, postgres)
- Dashboard-as-code baseline (`./config/provisioning/dashboards/`)
- Plugin pin list

View File

@@ -13,7 +13,30 @@ services:
 GF_SECURITY_ADMIN_USER: ${GRAFANA_ADMIN_USER}
 GF_SECURITY_ADMIN_PASSWORD: ${GRAFANA_ADMIN_PASSWORD}
 GF_SERVER_ROOT_URL: ${GRAFANA_ROOT_URL:-}
-# TODO: Keycloak OIDC auth, datasource provisioning, plugin pins
+# Keycloak OIDC SSO (generic_oauth)
GF_AUTH_GENERIC_OAUTH_ENABLED: "true"
GF_AUTH_GENERIC_OAUTH_NAME: Keycloak
GF_AUTH_GENERIC_OAUTH_ALLOW_SIGN_UP: "true"
GF_AUTH_GENERIC_OAUTH_AUTO_LOGIN: "false"
GF_AUTH_GENERIC_OAUTH_CLIENT_ID: ${GRAFANA_OAUTH_CLIENT_ID:-grafana}
GF_AUTH_GENERIC_OAUTH_CLIENT_SECRET: ${GRAFANA_OAUTH_CLIENT_SECRET}
GF_AUTH_GENERIC_OAUTH_SCOPES: openid email profile
GF_AUTH_GENERIC_OAUTH_AUTH_URL: https://auth.wbd-rd.nl/realms/wbd/protocol/openid-connect/auth
GF_AUTH_GENERIC_OAUTH_TOKEN_URL: https://auth.wbd-rd.nl/realms/wbd/protocol/openid-connect/token
GF_AUTH_GENERIC_OAUTH_API_URL: https://auth.wbd-rd.nl/realms/wbd/protocol/openid-connect/userinfo
GF_AUTH_GENERIC_OAUTH_SIGNOUT_REDIRECT_URL: https://auth.wbd-rd.nl/realms/wbd/protocol/openid-connect/logout
GF_AUTH_GENERIC_OAUTH_USE_PKCE: "true"
# Map Keycloak claims → Grafana profile
GF_AUTH_GENERIC_OAUTH_LOGIN_ATTRIBUTE_PATH: preferred_username
GF_AUTH_GENERIC_OAUTH_EMAIL_ATTRIBUTE_PATH: email
GF_AUTH_GENERIC_OAUTH_NAME_ATTRIBUTE_PATH: name
# Realm role mapping → Grafana org role.
# app-admin or grafana-admin → Admin
# grafana-editor → Editor
# default → Viewer
GF_AUTH_GENERIC_OAUTH_ROLE_ATTRIBUTE_PATH: "contains(realm_access.roles[*], 'app-admin') && 'Admin' || contains(realm_access.roles[*], 'grafana-admin') && 'Admin' || contains(realm_access.roles[*], 'grafana-editor') && 'Editor' || 'Viewer'"
# Lock down auto-assignment: realm-role decides, not blind admin/editor on signup.
GF_AUTH_GENERIC_OAUTH_ROLE_ATTRIBUTE_STRICT: "false"
 networks:
 app:

View File

@@ -0,0 +1,4 @@
FROM jenkins/jenkins:lts-jdk17
COPY plugins.txt /usr/share/jenkins/ref/plugins.txt
RUN jenkins-plugin-cli --plugin-file /usr/share/jenkins/ref/plugins.txt

View File

@@ -1,7 +1,43 @@
 # jenkins
-CI/CD for EVOLV + R&D pipelines. **Cloud-only stack.**
+CI/CD for R&D pipelines. **Cloud-only stack.**
+- **Hostname**: `ci.wbd-rd.nl`
 - **Network**: `app` (UI proxied via nginx)
 - **Volume**: `jenkins-home` (config + job state + plugins)
-- **TODO**: configuration-as-code (jcasc), Keycloak OIDC, agent strategy (DinD vs SSH vs K8s), pipeline shared libraries
+- **Image**: built locally (`cloud-jenkins:lts-jdk17`) — upstream Jenkins LTS + pre-installed plugins. See `Dockerfile`, `plugins.txt`.
## Auth + authorization
Keycloak OIDC via the `oic-auth` plugin + JCasC (configuration-as-code). The setup wizard is suppressed (`-Djenkins.install.runSetupWizard=false`) and the security realm + authorization strategy come entirely from `config/jenkins.yaml`.
### Permissions matrix
`config/jenkins.yaml` declares a `globalMatrix` authorization strategy:
| Principal | Permission |
|---|---|
| `USER:r.de.ren` | `Overall/Administer` (full admin) |
| `GROUP:authenticated` | `Overall/Read` + `Job/Read` + `View/Read` |
| anonymous | (no access — bounced to Keycloak) |
To promote another user, add a `USER:Overall/Administer:<username>` line to the matrix. Mapping the Keycloak `app-admin` realm role to admin via a `ROLE:` line (instead of per-user entries) is a TODO.
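The entry format used in the matrix (`USER:<permission>:<name>` and `GROUP:<permission>:<group>`) is regular enough to generate. A minimal sketch, assuming the same three read permissions for the `authenticated` group that `config/jenkins.yaml` declares; `matrix_entries` is a hypothetical helper and is not part of the repo.

```python
def matrix_entries(
    admins: list[str],
    read_permissions: tuple[str, ...] = ("Overall/Read", "Job/Read", "View/Read"),
) -> list[str]:
    """Build globalMatrix permission strings in the form used by jenkins.yaml."""
    # One Overall/Administer entry per admin user.
    entries = [f"USER:Overall/Administer:{user}" for user in admins]
    # Read-only permissions for every authenticated user.
    entries += [f"GROUP:{perm}:authenticated" for perm in read_permissions]
    return entries


if __name__ == "__main__":
    for line in matrix_entries(["r.de.ren"]):
        print(f'- "{line}"')
```

Calling it with a second username yields one extra `USER:Overall/Administer:` line, which is the manual edit the paragraph above describes.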
## Plugins
`plugins.txt`:
```
configuration-as-code
oic-auth
matrix-auth
```
Add more as the team needs them — the `Dockerfile` re-runs `jenkins-plugin-cli` on build, so any addition to `plugins.txt` is picked up next time the image is rebuilt.
## TODO
- Map `app-admin` realm role → `Overall/Administer` via `ROLE:` entries (so admin promotion happens in Keycloak, not in `jenkins.yaml`)
- Agent strategy (DinD vs SSH vs K8s)
- Pipeline shared libraries + Job DSL seed jobs
- Gitea Actions runners (post-cutover) — or use Jenkins gitea webhooks

View File

@@ -1,16 +1,26 @@
 # jenkins — CI/CD (cloud only)
 # Networks: app
+# Auth: SSO via Keycloak (wbd realm, jenkins client) configured by JCasC.
 services:
 jenkins:
-image: jenkins/jenkins:lts-jdk17
+build:
+context: . # custom image: upstream + configuration-as-code + oic-auth plugins
+dockerfile: Dockerfile
+image: cloud-jenkins:lts-jdk17
 restart: unless-stopped
 networks: [app]
 environment:
 TZ: ${TZ:-Europe/Amsterdam}
 JENKINS_OPTS: --httpPort=8080
JAVA_OPTS: -Djenkins.install.runSetupWizard=false
CASC_JENKINS_CONFIG: /var/jenkins_home/casc/jenkins.yaml
# Keycloak OIDC (wbd realm, jenkins client) — read by JCasC at startup.
JENKINS_OAUTH_CLIENT_ID: ${JENKINS_OAUTH_CLIENT_ID:-jenkins}
JENKINS_OAUTH_CLIENT_SECRET: ${JENKINS_OAUTH_CLIENT_SECRET}
 volumes:
 - jenkins-home:/var/jenkins_home
+- ./config:/var/jenkins_home/casc:ro
 # TODO: agent strategy (docker-in-docker vs ssh agents vs k8s agents)
networks: networks:

View File

@@ -0,0 +1,40 @@
jenkins:
systemMessage: "WBD R&D Jenkins — SSO via Keycloak (wbd realm)"
numExecutors: 2
scmCheckoutRetryCount: 2
mode: NORMAL
securityRealm:
oic:
clientId: "${JENKINS_OAUTH_CLIENT_ID:-jenkins}"
clientSecret: "${JENKINS_OAUTH_CLIENT_SECRET}"
serverConfiguration:
wellKnown:
wellKnownOpenIDConfigurationUrl: "https://auth.wbd-rd.nl/realms/wbd/.well-known/openid-configuration"
scopesOverride: "openid email profile"
userNameField: "preferred_username"
fullNameFieldName: "name"
emailFieldName: "email"
groupsFieldName: "groups"
logoutFromOpenidProvider: true
postLogoutRedirectUrl: "https://ci.wbd-rd.nl/"
# Permissions:
# - r.de.ren → full admin (Overall/Administer)
# - authenticated → read-only (browse jobs, view builds)
authorizationStrategy:
globalMatrix:
permissions:
- "USER:Overall/Administer:r.de.ren"
- "GROUP:Overall/Read:authenticated"
- "GROUP:Job/Read:authenticated"
- "GROUP:View/Read:authenticated"
crumbIssuer:
standard:
excludeClientIPFromCrumb: false
unclassified:
location:
url: "https://ci.wbd-rd.nl/"
adminAddress: "admin@wbd-rd.nl"

View File

@@ -0,0 +1,3 @@
configuration-as-code:latest
oic-auth:latest
matrix-auth:latest

View File

@@ -0,0 +1,4 @@
FROM jupyterhub/jupyterhub:5
# - oauthenticator: Keycloak OIDC for sign-in
# - jupyterlab / notebook: so SimpleLocalProcessSpawner can launch a per-user server
RUN pip install --no-cache-dir oauthenticator jupyterlab notebook

View File

@@ -1,16 +1,48 @@
 # jupyterhub
-Multi-user JupyterHub. Each authenticated user gets their own notebook container via DockerSpawner. **Cloud-only.**
+Multi-user JupyterHub. **Cloud-only.**
-- **Networks**: `app` (UI proxied at `/jupyter` or subdomain) + `mgmt` (Docker socket so JupyterHub can spawn user containers)
-- **Spawned user containers** land on the `cloud-app` network so they can reach mlflow, influxdb (via grafana proxy), rabbitmq
-- **Config**: `config/jupyterhub_config.py` — DockerSpawner setup, authenticator, admin list, resource limits
+- **Hostname**: `hub.wbd-rd.nl`
+- **Networks**: `app` (UI proxied) + `mgmt` (Docker socket — for DockerSpawner once we switch to it)
+- **Config**: `config/jupyterhub_config.py`
+- **Image**: built locally (`cloud-jupyterhub:5`) — upstream JupyterHub + `oauthenticator` + `jupyterlab` + `notebook`. See `Dockerfile`.
-## TODO
-- DockerSpawner config (image, network, user volumes, idle culling)
-- Keycloak OAuth via `oauthenticator.generic.GenericOAuthenticator`
-- Build a project-specific notebook image with EVOLV libs + mlflow client + InfluxDB client preinstalled
-- Per-user persistent volume mounted at `/home/jovyan/work`
+## Auth
+Keycloak OIDC via `oauthenticator.generic.GenericOAuthenticator`. All authenticated users in the `wbd` realm can sign in (`c.GenericOAuthenticator.allow_all = True`).
+Admin promotion is currently driven by the `JUPYTERHUB_ADMIN_USERS` env (comma-separated emails). Switching to a Keycloak realm-role check (`app-admin`) is a TODO.
+## Spawner
**Current**: `SimpleLocalProcessSpawner` ("simple") — every user's notebook runs as a process inside the hub container itself, sharing the same filesystem. The spawner passes `--allow-root` to `jupyterhub-singleuser` because the hub container runs as root and the singleuser server refuses root without that flag.
This is fine for one or two operators but is **not** the production-shape we want.
### TODO — switch to DockerSpawner
The repo wiring is already half-there:
- The `mgmt` network is mounted
- `/var/run/docker.sock` is mounted into the hub
- `DOCKER_NOTEBOOK_IMAGE` is set in `.env`
To switch, change `jupyterhub_config.py`:
```python
c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"
c.DockerSpawner.image = os.environ["DOCKER_NOTEBOOK_IMAGE"]
c.DockerSpawner.network_name = os.environ["DOCKER_NETWORK_NAME"]
c.DockerSpawner.notebook_dir = "/home/jovyan/work"
c.DockerSpawner.volumes = {"jupyter-user-{username}": "/home/jovyan/work"}
c.DockerSpawner.remove = True
```
…and add `dockerspawner` to the `Dockerfile` pip install.
## Other TODO
- Switch admin lookup from env-list to `app-admin` realm role
- Per-user persistent volume policy + size limits
 - CPU / memory limits per user container
-- Cull idle servers (`c.JupyterHub.services` cull-idle pattern)
+- Idle-server culling (`jupyterhub-idle-culler` service)
+- Project-specific notebook image with mlflow/influx/rabbitmq clients preinstalled

View File

@@ -3,7 +3,10 @@
 services:
 jupyterhub:
-image: jupyterhub/jupyterhub:5
+build:
+context: . # custom image: upstream + oauthenticator (Keycloak OIDC)
+dockerfile: Dockerfile
+image: cloud-jupyterhub:5
 restart: unless-stopped
 networks: [app, mgmt]
 volumes:
@@ -14,11 +17,11 @@ services:
 TZ: ${TZ:-Europe/Amsterdam}
 DOCKER_NOTEBOOK_IMAGE: ${JUPYTER_NOTEBOOK_IMAGE:-jupyter/datascience-notebook:latest}
 DOCKER_NETWORK_NAME: cloud-app
-# Stub auth — DummyAuthenticator gates on this shared password until OIDC is wired.
-JUPYTERHUB_ADMIN_PASSWORD: ${JUPYTERHUB_ADMIN_PASSWORD}
+# Keycloak OIDC (wbd realm, jupyterhub client)
+JUPYTERHUB_OAUTH_CLIENT_ID: ${JUPYTERHUB_OAUTH_CLIENT_ID:-jupyterhub}
+JUPYTERHUB_OAUTH_CLIENT_SECRET: ${JUPYTERHUB_OAUTH_CLIENT_SECRET}
 JUPYTERHUB_ADMIN_USERS: ${JUPYTERHUB_ADMIN_USERS}
-# TODO: DockerSpawner config in jupyterhub_config.py; Keycloak OAuthAuthenticator;
-# preinstalled libraries; per-user persistent volumes; CPU/memory limits
+# TODO: DockerSpawner with per-user persistent volumes; CPU/memory limits
 networks:
 app:

View File

@@ -1,33 +1,49 @@
-# JupyterHub bootstrap config.
+# JupyterHub config.
 #
-# WARNING: this is a STUB. It uses DummyAuthenticator with a single shared
-# password and LocalProcessSpawner. It boots, it's password-gated, but it is
-# NOT the production setup. Before exposing this to anything beyond the
-# cloud-host operator, swap to:
-# - GenericOAuthenticator pointed at Keycloak (wbd realm, jupyterhub client)
-# - DockerSpawner with per-user persistent volumes
-# See stacks/jupyterhub/README.md TODO.
+# Auth: GenericOAuthenticator → Keycloak (wbd realm, jupyterhub client)
+# Spawner: LocalProcessSpawner ("simple") — single-host runtime.
+# TODO: move to DockerSpawner with per-user persistent volumes once we have
+# per-user resource policies sorted.
 import os
 c = get_config() # noqa: F821 — provided by JupyterHub
-# --- Authenticator (stub) -----------------------------------------------------
-c.JupyterHub.authenticator_class = "dummy"
-c.DummyAuthenticator.password = os.environ["JUPYTERHUB_ADMIN_PASSWORD"]
+# --- Authenticator: Keycloak OIDC --------------------------------------------
+c.JupyterHub.authenticator_class = "generic-oauth"
c.GenericOAuthenticator.client_id = os.environ["JUPYTERHUB_OAUTH_CLIENT_ID"]
c.GenericOAuthenticator.client_secret = os.environ["JUPYTERHUB_OAUTH_CLIENT_SECRET"]
c.GenericOAuthenticator.oauth_callback_url = "https://hub.wbd-rd.nl/hub/oauth_callback"
_realm = "https://auth.wbd-rd.nl/realms/wbd"
c.GenericOAuthenticator.authorize_url = f"{_realm}/protocol/openid-connect/auth"
c.GenericOAuthenticator.token_url = f"{_realm}/protocol/openid-connect/token"
c.GenericOAuthenticator.userdata_url = f"{_realm}/protocol/openid-connect/userinfo"
c.GenericOAuthenticator.logout_redirect_url = f"{_realm}/protocol/openid-connect/logout"
c.GenericOAuthenticator.scope = ["openid", "email", "profile"]
c.GenericOAuthenticator.username_claim = "preferred_username"
c.GenericOAuthenticator.login_service = "Keycloak"
# Any authenticated Keycloak user in the wbd realm is allowed in.
c.GenericOAuthenticator.allow_all = True
# Admin list — comma-separated emails or usernames from .env.
 admin_users = os.environ.get("JUPYTERHUB_ADMIN_USERS", "").strip()
 if admin_users:
     c.Authenticator.admin_users = {u.strip() for u in admin_users.split(",") if u.strip()}
-c.Authenticator.allow_all = True # stub: any username, single shared password
-# --- Spawner (stub) -----------------------------------------------------------
-# LocalProcessSpawner runs notebooks as OS processes inside the hub container.
-# Fine for a single operator on the stub; production should use DockerSpawner.
+# --- Spawner -----------------------------------------------------------------
+# SimpleLocalProcessSpawner runs notebooks as OS processes inside the hub
+# container, which itself runs as root. jupyter-server refuses to run as root
# unless --allow-root is passed; without it the singleuser process exits with
# a CRITICAL log and the hub eventually times out the spawn after 30s.
 c.JupyterHub.spawner_class = "simple"
 c.Spawner.default_url = "/lab"
+c.Spawner.args = ["--allow-root"]
 # --- Hub posture --------------------------------------------------------------
 c.JupyterHub.bind_url = "http://:8000"
 c.JupyterHub.hub_bind_url = "http://0.0.0.0:8081"
 c.JupyterHub.cleanup_servers = True

View File

@@ -11,52 +11,57 @@ Identity provider for SSO across all R&D services. **Cloud-only** for now (edges
 `KC_BOOTSTRAP_ADMIN_USERNAME` + `KC_BOOTSTRAP_ADMIN_PASSWORD` create the master-realm admin on first start. **Change the password immediately after first login** via the admin console.
-After deployment:
-```bash
-# 1. Bring it up (sql must be running first)
-cd /mnt/d/gitea/RnD/infra/cloud
-docker compose up -d sql # if not already up
-docker compose up -d keycloak
-# 2. Watch logs until you see "Keycloak <version> on JVM started"
-docker compose logs -f keycloak
-# 3. Browse https://auth.wbd-rd.nl/ → admin console
-# (until cert is bootstrapped, https://<cloud-host>:9443 portainer can show logs)
-```
-## Realm + clients (TODO — design before deploy day 2)
-**Recommended structure**: one realm `wbd` containing all R&D apps as separate clients.
-| Client ID | App | Redirect URI | Flow |
+After `cloud/deploy.sh` succeeds, run the kcadm bootstrap from [`cloud/README.md → Keycloak realm bootstrap (one-time)`](../../cloud/README.md#keycloak-realm-bootstrap-one-time). That single script provisions:
+- the `wbd` realm
+- one OIDC client per app (with redirect URI + hardcoded audience mapper + post-logout URI)
+- the `app-admin` realm role
+- the first operator user
+## Realm + clients
+Every human-accessible app is wired to a Keycloak client in realm `wbd`:
+| Client ID | App | Redirect URI | Mechanism |
 |---|---|---|---|
-| grafana | Grafana | `https://dash.wbd-rd.nl/login/generic_oauth` | code |
-| gitea | Gitea | `https://git.wbd-rd.nl/user/oauth2/keycloak/callback` | code |
-| node-red | Node-RED | `https://flow.wbd-rd.nl/auth/strategy/callback/` | code |
-| jenkins | Jenkins | `https://ci.wbd-rd.nl/securityRealm/finishLogin` | code |
-| jupyterhub | JupyterHub | `https://hub.wbd-rd.nl/hub/oauth_callback` | code |
-| mlflow | MLflow (via oauth2-proxy) | `https://ml.wbd-rd.nl/oauth2/callback` | code |
-| portainer-ce | Portainer (via oauth2-proxy) | `https://ops.wbd-rd.nl/oauth2/callback` | code |
-Apps **without native OIDC** (mlflow, portainer-CE) sit behind an `oauth2-proxy` sidecar that nginx `auth_request`s to. That's a TODO stack we'll add when we wire up mlflow / portainer SSO.
+| `gitea` | Gitea | `https://git.wbd-rd.nl/user/oauth2/keycloak/callback` | native OIDC |
+| `grafana` | Grafana | `https://dash.wbd-rd.nl/login/generic_oauth` | native OIDC (`generic_oauth`) |
+| `node-red` | Node-RED | `https://flow.wbd-rd.nl/auth/strategy/callback/` | native OIDC (passport-openidconnect) |
+| `jenkins` | Jenkins | `https://ci.wbd-rd.nl/securityRealm/finishLogin` | native OIDC (`oic-auth` plugin via JCasC) |
+| `jupyterhub` | JupyterHub | `https://hub.wbd-rd.nl/hub/oauth_callback` | native OIDC (`oauthenticator.GenericOAuthenticator`) |
+| `mlflow` | MLflow | `https://ml.wbd-rd.nl/oauth2/callback` | oauth2-proxy + nginx `auth_request` |
+| `portainer-ce` | Portainer | `https://ops.wbd-rd.nl/oauth2/callback` | oauth2-proxy + nginx `auth_request` |
+| `rabbitmq` | RabbitMQ UI | `https://mq.wbd-rd.nl/oauth2/callback` | oauth2-proxy + nginx `auth_request` |
+### Required client config (gotchas)
+- **Hardcoded audience mapper** (`oidc-audience-mapper` with `included.client.audience=<clientId>`) on every client. Without this, the access-token `aud` is `[realm-management, account]` and oauth2-proxy clients return HTTP 500 at callback time. The kcadm bootstrap script in `cloud/README.md` creates this mapper for every client.
+- **Valid post-logout redirect URIs** must be set (Keycloak 24+ requirement) — `https://<app-host>/*`. Otherwise logout returns "Invalid redirect URI".
+- **`directAccessGrantsEnabled=false`** unless you specifically need ROPC for that client.
+## Realm roles
+- `app-admin` — granted to operators who should get full admin in every app. Each app's config maps this role to its own admin role:
+  - Grafana: `app-admin` → `Admin`
+  - Jenkins: matrix-auth grants `Overall/Administer` to specific usernames (not the role itself yet — TODO)
+  - Node-RED / JupyterHub: env-var lists are still username-based; switching to role-based is a TODO
+- Anyone without `app-admin` gets viewer / read-only.
 ## Realm-as-code
-Drop exported realm JSON into `config/realms/`. On first start, Keycloak imports anything in `/opt/keycloak/data/import/` if `KC_IMPORT_REALM_DIR` is set or if you run `kc.sh import` manually. Recommended workflow:
-1. Configure the realm by hand once in the UI
-2. `docker compose exec keycloak /opt/keycloak/bin/kc.sh export --dir /opt/keycloak/data/import --realm wbd`
-3. Commit the exported file under `stacks/keycloak/config/realms/`
-4. Subsequent fresh deploys auto-import
+Once you're happy with the manual setup, export the realm so future deploys auto-import:
+```bash
+docker compose exec keycloak /opt/keycloak/bin/kc.sh \
+  export --dir /opt/keycloak/data/import --realm wbd
+```
+Commit the resulting JSON to `stacks/keycloak/config/realms/wbd.json`. Keycloak imports anything in `/opt/keycloak/data/import/` on first start.
 ## TODO
-- Realm bootstrap script (provision `wbd` realm + clients above)
+- Export realm config → commit `config/realms/wbd.json`
 - Theme: WBD branding (logo, colors)
 - User federation (LDAP from corporate AD, if applicable)
 - 2FA policy
 - Session / token lifetimes per client
-- oauth2-proxy stack for apps without native OIDC
-- Realm export → `config/realms/wbd.json` committed
+- Switch Node-RED + JupyterHub admin lookup from env-var lists to `app-admin` realm role checks (Grafana already does this)
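The TODO about moving admin lookup from env-var lists to `app-admin` realm role checks could start from a helper like the one below. This is a minimal sketch, not the configuration any app in this PR uses: it assumes the caller already holds decoded token claims, and that realm roles appear under `realm_access.roles` (Keycloak's default layout for access tokens; ID tokens and userinfo responses may need an extra mapper). The function names are hypothetical.

```python
from typing import Any, Mapping, Set

ADMIN_ROLE = "app-admin"  # realm role named in this README


def realm_roles(claims: Mapping[str, Any]) -> Set[str]:
    """Collect realm roles from decoded token claims.

    Assumes the Keycloak default claim layout: {"realm_access": {"roles": [...]}}.
    Missing or empty claims yield an empty set.
    """
    access = claims.get("realm_access") or {}
    roles = access.get("roles") or []
    return {str(role) for role in roles}


def is_app_admin(claims: Mapping[str, Any]) -> bool:
    """True when the claims carry the app-admin realm role."""
    return ADMIN_ROLE in realm_roles(claims)
```

A role check like this would replace the username comparison, so promoting an operator becomes a Keycloak role assignment rather than an `.env` edit and redeploy.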


@@ -2,11 +2,17 @@
 MLflow tracking server + model registry. Used by data scientists running experiments from JupyterHub or local laptops. **Cloud-only.**
-- **Networks**: `app` (UI on port 5000, reverse-proxied at `/mlflow` or subdomain) + `data` (postgres backend in `sql`)
+- **Networks**: `app` (UI on port 5000, reverse-proxied at `ml.wbd-rd.nl`) + `data` (postgres backend in `sql`)
 - **Backend store**: postgres database `mlflow` — must be provisioned by `sql/config/init.d/`
 - **Artifact store**: local volume `mlflow-artifacts`. Switch to S3/MinIO when artifact volume grows beyond a few GB.
-- **TODO**:
-  - Provision `mlflow` DB + role in `sql` init scripts
-  - Keycloak OIDC via nginx `auth_request` (MLflow has no native auth — must front-end it)
-  - MinIO sidecar for S3-compatible artifact store
-  - Retention / cleanup policy for stale runs
+- **Image**: built locally (`cloud-mlflow:v2.18.0`) — upstream MLflow + `psycopg2-binary` for postgres backend. See `Dockerfile`.
+## Auth
+MLflow has no native auth. It's gated by the `oauth2-proxy-mlflow` sidecar (see `stacks/oauth2-proxy/`) via nginx `auth_request`. Anyone in the Keycloak `wbd` realm can reach MLflow; there is no admin/viewer split inside MLflow itself.
+## TODO
+- MinIO sidecar for S3-compatible artifact store
+- Retention / cleanup policy for stale runs
+- Per-experiment ACLs via MLflow's own auth plugin (released in 2.x as an alpha) once we want finer-grained access than "in the realm"


@@ -6,10 +6,39 @@ server {
     ssl_certificate /etc/letsencrypt/live/infra/fullchain.pem;
     ssl_certificate_key /etc/letsencrypt/live/infra/privkey.pem;
-    # Large model artifact uploads
     client_max_body_size 5G;
+    location /oauth2/ {
+        proxy_pass http://oauth2-proxy-mlflow:4180;
+        proxy_set_header Host $host;
+        proxy_set_header X-Real-IP $remote_addr;
+        proxy_set_header X-Scheme $scheme;
+        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+        proxy_set_header X-Forwarded-Proto $scheme;
+        proxy_set_header X-Auth-Request-Redirect $request_uri;
+    }
+    location = /oauth2/auth {
+        internal;
+        proxy_pass http://oauth2-proxy-mlflow:4180;
+        proxy_set_header Host $host;
+        proxy_set_header X-Real-IP $remote_addr;
+        proxy_set_header X-Scheme $scheme;
+        proxy_set_header X-Original-URI $request_uri;
+        proxy_set_header Content-Length "";
+        proxy_pass_request_body off;
+    }
     location / {
+        auth_request /oauth2/auth;
+        error_page 401 = /oauth2/sign_in;
+        auth_request_set $auth_cookie $upstream_http_set_cookie;
+        add_header Set-Cookie $auth_cookie;
+        auth_request_set $auth_user $upstream_http_x_auth_request_email;
+        proxy_set_header X-Forwarded-User $auth_user;
         proxy_pass http://mlflow:5000;
         proxy_set_header Host $host;
         proxy_set_header X-Real-IP $remote_addr;


@@ -6,8 +6,37 @@ server {
     ssl_certificate /etc/letsencrypt/live/infra/fullchain.pem;
     ssl_certificate_key /etc/letsencrypt/live/infra/privkey.pem;
+    location /oauth2/ {
+        proxy_pass http://oauth2-proxy-portainer:4180;
+        proxy_set_header Host $host;
+        proxy_set_header X-Real-IP $remote_addr;
+        proxy_set_header X-Scheme $scheme;
+        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+        proxy_set_header X-Forwarded-Proto $scheme;
+        proxy_set_header X-Auth-Request-Redirect $request_uri;
+    }
+    location = /oauth2/auth {
+        internal;
+        proxy_pass http://oauth2-proxy-portainer:4180;
+        proxy_set_header Host $host;
+        proxy_set_header X-Real-IP $remote_addr;
+        proxy_set_header X-Scheme $scheme;
+        proxy_set_header X-Original-URI $request_uri;
+        proxy_set_header Content-Length "";
+        proxy_pass_request_body off;
+    }
     location / {
-        # Portainer CE 2.x exposes HTTPS only (self-signed cert inside the container)
+        auth_request /oauth2/auth;
+        error_page 401 = /oauth2/sign_in;
+        auth_request_set $auth_cookie $upstream_http_set_cookie;
+        add_header Set-Cookie $auth_cookie;
+        auth_request_set $auth_user $upstream_http_x_auth_request_email;
+        proxy_set_header X-Forwarded-User $auth_user;
         proxy_pass https://portainer:9443;
         proxy_ssl_verify off;

View File

@@ -6,7 +6,34 @@ server {
     ssl_certificate /etc/letsencrypt/live/infra/fullchain.pem;
     ssl_certificate_key /etc/letsencrypt/live/infra/privkey.pem;
+    location /oauth2/ {
+        proxy_pass http://oauth2-proxy-rabbitmq:4180;
+        proxy_set_header Host $host;
+        proxy_set_header X-Real-IP $remote_addr;
+        proxy_set_header X-Scheme $scheme;
+        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+        proxy_set_header X-Forwarded-Proto $scheme;
+        proxy_set_header X-Auth-Request-Redirect $request_uri;
+    }
+    location = /oauth2/auth {
+        internal;
+        proxy_pass http://oauth2-proxy-rabbitmq:4180;
+        proxy_set_header Host $host;
+        proxy_set_header X-Real-IP $remote_addr;
+        proxy_set_header X-Scheme $scheme;
+        proxy_set_header X-Original-URI $request_uri;
+        proxy_set_header Content-Length "";
+        proxy_pass_request_body off;
+    }
     location / {
+        auth_request /oauth2/auth;
+        error_page 401 = /oauth2/sign_in;
+        auth_request_set $auth_cookie $upstream_http_set_cookie;
+        add_header Set-Cookie $auth_cookie;
         proxy_pass http://rabbitmq:15672;
         proxy_set_header Host $host;
         proxy_set_header X-Real-IP $remote_addr;


@@ -33,6 +33,16 @@ http {
     ssl_session_timeout 1d;
     ssl_session_tickets off;
+    # Docker's embedded DNS, so we re-resolve service names if containers get
+    # recreated (otherwise nginx caches IPs at startup and 502s after a restart).
+    resolver 127.0.0.11 valid=30s ipv6=off;
+    # Headroom for proxied responses that ship JWTs / large cookies (oauth2-proxy
+    # callbacks, Jenkins, Keycloak). The default proxy_buffer_size (one memory
+    # page, 4k or 8k) is too small.
+    proxy_buffer_size 16k;
+    proxy_buffers 8 16k;
+    proxy_busy_buffers_size 32k;
     include /etc/nginx/conf.d/*.conf;
 }


@@ -0,0 +1,4 @@
FROM nodered/node-red:4.1
USER root
RUN npm install --omit=dev --prefix /usr/src/node-red passport-openidconnect
USER node-red


@@ -2,7 +2,36 @@
 Node-RED flow editor + runtime. Used at both cloud and edge.
-- **UI**: internal port 1880 → reverse-proxied at `/node-red` (or subdomain)
+- **Hostname**: `flow.wbd-rd.nl`
 - **Networks**: `app`
 - **Volumes**: `node-red-data` (flows + credentials)
-- **TODO**: Keycloak OIDC integration, preinstalled module list, EVOLV nodes installation
+- **Image**: built locally (`cloud-node-red:4.1`) — upstream Node-RED + `passport-openidconnect`. See `Dockerfile`.
+- **Config**: `config/settings.js` (mounted to `/data/settings.js`)
+## Auth
+The **editor** (not the runtime HTTP-in / dashboard nodes) is gated by Keycloak OIDC via `passport-openidconnect`.
+### Implementation notes
+Node-RED's `adminAuth.type=strategy` mode doesn't give the passport strategy access to `req.session`, so we can't use passport-openidconnect's default `SessionStore`. `config/settings.js` provides an **in-memory state store** keyed by a random handle (10-minute TTL) — handles are issued at the `/auth/strategy` redirect and consumed at the callback. State survives across the OAuth round trip without needing express-session.
+Verify returns `{username: <preferred_username>}`; the `adminAuth.users(username)` resolver then expands that to a Node-RED user object with permissions.
+### Permissions
+Set `NODERED_ADMIN_USERS` (comma-separated usernames or emails) in `.env`:
+| User | permissions |
+|---|---|
+| listed in `NODERED_ADMIN_USERS` | `"*"` (full editor) |
+| any other authenticated realm user | `["read"]` (can view flows, cannot deploy) |
+Switching this to a Keycloak `app-admin` realm-role check is a TODO.
+## TODO
+- Switch admin lookup from env-list to `app-admin` realm role
+- Gate runtime HTTP-in / dashboard routes with `httpNodeAuth` if/when we expose them publicly
+- Preinstalled module list (`packages.json`) — mqtt, influxdb, postgres clients
+- Set `credentialSecret` in settings.js (currently auto-generated; rotating it forces re-entering creds)
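The in-memory state store described under "Implementation notes" follows a single-use, time-limited handle pattern. The sketch below restates that pattern in Python for readers who want the idea without the passport plumbing. It is an illustration only: the class name, the injectable clock, and the return conventions are assumptions made here, and the real implementation is the JavaScript in `config/settings.js`.

```python
import secrets
import time


class StateStore:
    """Process-local store of single-use handles with a TTL (illustrative only)."""

    def __init__(self, ttl_seconds=600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries = {}

    def store(self, ctx):
        """Issue a random handle for ctx; the handle travels as the `state` param."""
        handle = secrets.token_hex(18)  # 18 random bytes, hex-encoded
        self._entries[handle] = (ctx, self._clock() + self._ttl)
        return handle

    def verify(self, handle):
        """Consume a handle: return its ctx once, or None if unknown or expired."""
        entry = self._entries.pop(handle, None)
        if entry is None:
            return None
        ctx, expires = entry
        return ctx if expires >= self._clock() else None
```

Because the store lives in process memory, a container restart in the middle of a login drops pending handles and the user simply has to start the sign-in again.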


@@ -4,14 +4,23 @@
 services:
   node-red:
-    image: nodered/node-red:4.1
+    build:
+      context: . # custom image: upstream + passport-openidconnect for Keycloak SSO
+      dockerfile: Dockerfile
+    image: cloud-node-red:4.1
     restart: unless-stopped
     networks: [app]
     volumes:
       - node-red-data:/data
+      - ./config/settings.js:/data/settings.js:ro
     environment:
       TZ: ${TZ:-Europe/Amsterdam}
-      # TODO: Keycloak OIDC adapter; preinstalled modules; CONTRIB allow-list
+      # Keycloak OIDC (wbd realm, node-red client)
+      NODERED_OAUTH_CLIENT_ID: ${NODERED_OAUTH_CLIENT_ID:-node-red}
+      NODERED_OAUTH_CLIENT_SECRET: ${NODERED_OAUTH_CLIENT_SECRET}
+      # Comma-separated usernames/emails. These get permissions "*". Anyone else
+      # in the Keycloak realm gets "read" (cannot deploy).
+      NODERED_ADMIN_USERS: ${NODERED_ADMIN_USERS}
 networks:
   app:


@@ -0,0 +1,110 @@
// Node-RED config — Keycloak OIDC editor auth.
// Only the editor (/red, /flow) is gated; runtime HTTP-in / dashboard routes
// stay open by default. Lock those down separately with httpNodeAuth.
// Admins listed here get permissions "*". Everyone else (any authenticated
// Keycloak user) gets ["read"] — they can view flows but cannot deploy.
const NODERED_ADMINS = (process.env.NODERED_ADMIN_USERS || "")
    .split(",")
    .map(s => s.trim().toLowerCase())
    .filter(Boolean);
function permissionsFor(username) {
    return NODERED_ADMINS.includes((username || "").toLowerCase()) ? "*" : ["read"];
}
// In-memory state store for the OIDC handshake. Node-RED's adminAuth does not
// expose req.session to the strategy, so passport-openidconnect's default
// session-backed store is unavailable. State is kept in a process-local Map
// keyed by a random handle (sent as the `state` query param).
const _crypto = require("crypto");
const _stateStore = new Map();
const _STATE_TTL_MS = 10 * 60 * 1000;
setInterval(function() {
    const now = Date.now();
    for (const [k, v] of _stateStore) {
        if (v.expires < now) _stateStore.delete(k);
    }
}, 60 * 1000).unref();
module.exports = {
    uiPort: process.env.PORT || 1880,
    adminAuth: {
        type: "strategy",
        strategy: {
            name: "openidconnect",
            label: "Sign in with Keycloak",
            icon: "fa-sign-in",
            strategy: require("passport-openidconnect").Strategy,
            options: {
                issuer: "https://auth.wbd-rd.nl/realms/wbd",
                authorizationURL: "https://auth.wbd-rd.nl/realms/wbd/protocol/openid-connect/auth",
                tokenURL: "https://auth.wbd-rd.nl/realms/wbd/protocol/openid-connect/token",
                userInfoURL: "https://auth.wbd-rd.nl/realms/wbd/protocol/openid-connect/userinfo",
                clientID: process.env.NODERED_OAUTH_CLIENT_ID || "node-red",
                clientSecret: process.env.NODERED_OAUTH_CLIENT_SECRET,
                callbackURL: "https://flow.wbd-rd.nl/auth/strategy/callback/",
                scope: ["openid", "email", "profile"],
                proxy: true,
                store: {
                    store: function(req, ctx, appState, meta, cb) {
                        const handle = _crypto.randomBytes(18).toString("hex");
                        _stateStore.set(handle, {
                            ctx: ctx || {},
                            appState: appState,
                            expires: Date.now() + _STATE_TTL_MS,
                        });
                        cb(null, handle);
                    },
                    verify: function(req, handle, cb) {
                        const entry = _stateStore.get(handle);
                        if (!entry) return cb(null, false, { message: "Unknown auth state" });
                        _stateStore.delete(handle);
                        if (entry.expires < Date.now()) {
                            return cb(null, false, { message: "Expired auth state" });
                        }
                        cb(null, entry.ctx, entry.appState);
                    },
                },
                // Standard 3-arg verify; pass the username forward so users(username) can build the user.
                verify: function(issuer, profile, done) {
                    const json = (profile && profile._json) || {};
                    const username = (
                        (profile && profile.username)
                        || json.preferred_username
                        || (profile && profile.emails && profile.emails[0] && profile.emails[0].value)
                        || json.email
                        || (profile && profile.id)
                        || "unknown"
                    ).toString().toLowerCase();
                    done(null, { username: username });
                },
            },
        },
        // Resolve a username (as produced by verify above) into a full Node-RED user.
        // Anyone the realm vouches for is allowed in; admins from NODERED_ADMIN_USERS get "*",
        // everyone else gets read-only.
        users: function(username) {
            return Promise.resolve({
                username: (username || "").toLowerCase(),
                permissions: permissionsFor(username),
            });
        },
    },
    editorTheme: {
        projects: { enabled: false },
        page: { title: "WBD Node-RED" },
    },
    functionGlobalContext: {},
    logging: {
        console: {
            level: "info",
            metrics: false,
            audit: false,
        },
    },
};


@@ -0,0 +1,73 @@
# oauth2-proxy
Keycloak SSO gate for apps **without native OIDC**: `mlflow`, `portainer-ce`, `rabbitmq`. One sidecar container per protected vhost. Each app's nginx-proxy vhost gates user requests with `auth_request /oauth2/auth` and forwards `/oauth2/*` paths to the matching sidecar for the OIDC handshake.
Cloud-only. Each sidecar binds to port 4180 on the `app` network so nginx can reach it.
## Files
```
stacks/oauth2-proxy/
├── compose.yml # three services: oauth2-proxy-mlflow, -portainer, -rabbitmq
└── README.md
```
## How a request flows
1. Browser → `https://ml.wbd-rd.nl/` (no cookie)
2. nginx `auth_request /oauth2/auth` subrequest → oauth2-proxy → **401**
3. nginx `error_page 401 = /oauth2/sign_in` → oauth2-proxy → **302** to Keycloak
4. User authenticates at Keycloak → redirected to `https://ml.wbd-rd.nl/oauth2/callback?code=…`
5. oauth2-proxy exchanges the code, sets `_oauth2_<app>` cookie, redirects to the original URL
6. Browser retries `/`, this time `/oauth2/auth` returns **202**, request proxies through to mlflow
## Required Keycloak setup per client
Each client must have:
- Standard flow enabled, confidential client (`publicClient=false`).
- Redirect URI exactly: `https://<app-host>/oauth2/callback`
- **Hardcoded audience mapper** so the access token's `aud` claim includes the client_id. Without this, oauth2-proxy rejects the callback with HTTP 500 because the default Keycloak `aud` is `[realm-management, account]` — see `cloud/README.md` for the kcadm bootstrap script.
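When a callback fails with HTTP 500, it can help to look at the `aud` claim of the token Keycloak actually issued. The sketch below decodes a JWT payload and checks the audience. It is a debugging aid under stated assumptions: it does **not** verify the signature, the function names are hypothetical, and how you obtain the token (browser dev tools, logs) is left open.

```python
import base64
import json


def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying it (debugging only)."""
    payload = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def audience_includes(claims: dict, client_id: str) -> bool:
    """True when client_id is listed in `aud` (which may be a string or a list)."""
    aud = claims.get("aud", [])
    if isinstance(aud, str):
        aud = [aud]
    return client_id in aud
```

With the default `aud` of `[realm-management, account]` described above, `audience_includes(claims, "mlflow")` is false, which matches the failure mode the audience mapper is there to prevent.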
## nginx vhost shape
```nginx
server {
    listen 443 ssl;
    server_name <app>.wbd-rd.nl;
    # … cert directives …
    location /oauth2/ {
        proxy_pass http://oauth2-proxy-<app>:4180;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Auth-Request-Redirect $request_uri;
    }
    location = /oauth2/auth {
        internal;
        proxy_pass http://oauth2-proxy-<app>:4180;
        proxy_set_header Host $host;
        proxy_set_header X-Original-URI $request_uri;
        proxy_pass_request_body off;
        proxy_set_header Content-Length "";
    }
    location / {
        auth_request /oauth2/auth;
        error_page 401 = /oauth2/sign_in;
        auth_request_set $auth_cookie $upstream_http_set_cookie;
        add_header Set-Cookie $auth_cookie;
        proxy_pass http://<app-upstream>;
        # … app-specific headers …
    }
}
```
## Caveats
- **No fine-grained authorization.** oauth2-proxy only enforces "is this Keycloak user authenticated"; restricting to a subset (e.g. only `app-admin`) is a separate config (`OAUTH2_PROXY_ALLOWED_GROUPS` / `OAUTH2_PROXY_ALLOWED_EMAIL_DOMAINS`).
- **Cookies are HUGE.** oauth2-proxy embeds the JWT + refresh token in the session cookie. Without raising `proxy_buffer_size` to ≥ 16k in nginx, the callback returns 502 "upstream sent too big header". The nginx-proxy stack already sets this globally.
- **Stale upstream IPs after sidecar recreation.** When oauth2-proxy containers are recreated they get new Docker bridge IPs. nginx will keep using the old IPs until reloaded. The nginx-proxy stack now ships `resolver 127.0.0.11 valid=30s;` so this self-heals on the resolver TTL.
- **Cookie secrets must be exactly 16, 24, or 32 bytes**, either as the raw string or once base64-decoded. `openssl rand -hex 16` gives 32 hex chars, which satisfies the length check.
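For hosts without `openssl`, the cookie-secret shape from the last caveat can be produced and checked with the Python standard library. This is a minimal sketch: `new_cookie_secret` and `raw_length_ok` are hypothetical helper names, and the check only covers the raw string length described above.

```python
import secrets

VALID_LENGTHS = (16, 24, 32)  # accepted secret sizes in bytes


def new_cookie_secret(nbytes: int = 16) -> str:
    """Return 2 * nbytes hex characters, the same shape as `openssl rand -hex nbytes`."""
    return secrets.token_hex(nbytes)


def raw_length_ok(secret: str) -> bool:
    """True when the secret's raw byte length is one of the accepted sizes."""
    return len(secret.encode("utf-8")) in VALID_LENGTHS
```

Generate one value per sidecar (`OAUTH2_PROXY_COOKIE_SECRET_MLFLOW`, `_PORTAINER`, `_RABBITMQ`) so that a leaked secret for one app does not let anyone forge session cookies for the others.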
## TODO
- Restrict access per app via `OAUTH2_PROXY_ALLOWED_GROUPS=app-admin` (or similar) once we have proper realm groups
- Consider consolidating to a single oauth2-proxy with a wildcard cookie domain `.wbd-rd.nl` once we trust the shared-cookie tradeoff


@@ -0,0 +1,58 @@
# oauth2-proxy — Keycloak SSO gate for apps without native OIDC.
# One container per protected vhost; each presents itself on the `app` network
# so nginx-proxy can hit it via auth_request.
x-oauth2-proxy-common: &oauth2-proxy-common
  image: quay.io/oauth2-proxy/oauth2-proxy:v7.6.0
  restart: unless-stopped
  networks: [app]
  environment: &oauth2-proxy-env
    OAUTH2_PROXY_PROVIDER: keycloak-oidc
    OAUTH2_PROXY_OIDC_ISSUER_URL: https://auth.wbd-rd.nl/realms/wbd
    OAUTH2_PROXY_HTTP_ADDRESS: 0.0.0.0:4180
    OAUTH2_PROXY_REVERSE_PROXY: "true"
    OAUTH2_PROXY_PASS_ACCESS_TOKEN: "true"
    OAUTH2_PROXY_PASS_AUTHORIZATION_HEADER: "true"
    OAUTH2_PROXY_SET_XAUTHREQUEST: "true"
    OAUTH2_PROXY_SET_AUTHORIZATION_HEADER: "true"
    OAUTH2_PROXY_EMAIL_DOMAINS: "*"
    OAUTH2_PROXY_SKIP_PROVIDER_BUTTON: "true"
    OAUTH2_PROXY_UPSTREAM: static://202
    OAUTH2_PROXY_COOKIE_SECURE: "true"
    OAUTH2_PROXY_COOKIE_SAMESITE: lax
    OAUTH2_PROXY_WHITELIST_DOMAINS: ".wbd-rd.nl"
    TZ: ${TZ:-Europe/Amsterdam}
services:
  oauth2-proxy-mlflow:
    <<: *oauth2-proxy-common
    environment:
      <<: *oauth2-proxy-env
      OAUTH2_PROXY_CLIENT_ID: ${MLFLOW_OAUTH_CLIENT_ID:-mlflow}
      OAUTH2_PROXY_CLIENT_SECRET: ${MLFLOW_OAUTH_CLIENT_SECRET}
      OAUTH2_PROXY_COOKIE_SECRET: ${OAUTH2_PROXY_COOKIE_SECRET_MLFLOW}
      OAUTH2_PROXY_REDIRECT_URL: https://ml.wbd-rd.nl/oauth2/callback
      OAUTH2_PROXY_COOKIE_NAME: _oauth2_mlflow
  oauth2-proxy-portainer:
    <<: *oauth2-proxy-common
    environment:
      <<: *oauth2-proxy-env
      OAUTH2_PROXY_CLIENT_ID: ${PORTAINER_OAUTH_CLIENT_ID:-portainer-ce}
      OAUTH2_PROXY_CLIENT_SECRET: ${PORTAINER_OAUTH_CLIENT_SECRET}
      OAUTH2_PROXY_COOKIE_SECRET: ${OAUTH2_PROXY_COOKIE_SECRET_PORTAINER}
      OAUTH2_PROXY_REDIRECT_URL: https://ops.wbd-rd.nl/oauth2/callback
      OAUTH2_PROXY_COOKIE_NAME: _oauth2_portainer
  oauth2-proxy-rabbitmq:
    <<: *oauth2-proxy-common
    environment:
      <<: *oauth2-proxy-env
      OAUTH2_PROXY_CLIENT_ID: ${RABBITMQ_OAUTH_CLIENT_ID:-rabbitmq}
      OAUTH2_PROXY_CLIENT_SECRET: ${RABBITMQ_OAUTH_CLIENT_SECRET}
      OAUTH2_PROXY_COOKIE_SECRET: ${OAUTH2_PROXY_COOKIE_SECRET_RABBITMQ}
      OAUTH2_PROXY_REDIRECT_URL: https://mq.wbd-rd.nl/oauth2/callback
      OAUTH2_PROXY_COOKIE_NAME: _oauth2_rabbitmq
networks:
  app:
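The compose file above interpolates six secrets that have no default, so a missing `.env` entry would leave a sidecar with an empty client or cookie secret. A pre-deploy check along these lines could catch that early. It is a sketch only: the variable names are taken from the compose file, while the helper itself and the idea of running it before `deploy.sh` are assumptions, not part of this PR.

```python
import os
from typing import List, Mapping

# Variables referenced without a default in stacks/oauth2-proxy/compose.yml.
REQUIRED_SECRETS = (
    "MLFLOW_OAUTH_CLIENT_SECRET",
    "PORTAINER_OAUTH_CLIENT_SECRET",
    "RABBITMQ_OAUTH_CLIENT_SECRET",
    "OAUTH2_PROXY_COOKIE_SECRET_MLFLOW",
    "OAUTH2_PROXY_COOKIE_SECRET_PORTAINER",
    "OAUTH2_PROXY_COOKIE_SECRET_RABBITMQ",
)


def missing_secrets(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_SECRETS if not env.get(name, "").strip()]
```

An empty return value means every sidecar has both of its secrets; anything else names the variables to add to `.env`.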


@@ -4,13 +4,25 @@ Docker container management UI — the "operator console" for cloud and edge.
 ## Access
-Portainer ingresses through nginx-proxy: `https://ops.wbd-rd.nl/`. No host port is published by default.
-For emergency ops (nginx down, etc.), uncomment the `ports:` block in `compose.yml` and `docker compose up -d portainer` to expose `:9443` and `:8000` directly.
-## First-run admin
-On first visit, Portainer prompts for an admin username and password. Use a long random password; this account is break-glass — your daily login should come via Keycloak OIDC once that gate is wired (see TODO).
+Portainer ingresses through nginx-proxy: `https://ops.wbd-rd.nl/`. No host port is published by default. For emergency ops (nginx down, etc.), uncomment the `ports:` block in `compose.yml` and `docker compose up -d portainer` to expose `:9443` and `:8000` directly.
+## Auth
+Two layers:
+1. **Keycloak SSO gate** — the nginx vhost calls `auth_request` against `oauth2-proxy-portainer` (Keycloak `wbd` realm, client `portainer-ce`). Anyone not in the realm is bounced to Keycloak login.
+2. **Portainer local admin** — once past the SSO gate, Portainer asks for its own credentials. Portainer-CE has no native OIDC, so there's no way to skip this second step on CE. The admin user is **pre-seeded** at boot via `--admin-password=<bcrypt-hash>` (see compose), with the hash stored in `.env` as `PORTAINER_ADMIN_PASSWORD_HASH`.
+> Pre-seeding the admin bypasses Portainer's "5-minute setup window or lockout" behavior on fresh installs.
+### Generating the bcrypt hash
+```bash
+docker run --rm python:3.13-alpine sh -c \
+  "pip install -q bcrypt && python -c \"import bcrypt; print(bcrypt.hashpw(b'<your-password>', bcrypt.gensalt()).decode())\""
+```
+Double every `$` in the resulting hash before pasting into `.env` (`$2b$` → `$$2b$$`) — Compose interpolates single `$`.
 ## Edge-agent topology
@@ -19,7 +31,8 @@ Port `8000` accepts reverse tunnels from edge sites running the `portainer/agent
 ## Networks
 - **mgmt** — Docker management plane
-- **Docker socket**: read-only mount; *effectively root-equivalent* on the host. Front with Keycloak SSO as soon as auth is wired.
+- **app** — nginx-proxy reaches portainer:9443 from here
+- **Docker socket**: read-only mount; *effectively root-equivalent* on the host. The Keycloak SSO gate is what limits who can talk to Portainer.
 ## Volumes
@@ -27,6 +40,6 @@ Port `8000` accepts reverse tunnels from edge sites running the `portainer/agent
 ## TODO
-- Keycloak OIDC auth (Portainer CE needs a frontend gate; Business Edition has native OIDC if budget allows)
 - Edge-agent provisioning workflow per site (agent secret, registration call)
-- Disable self-signed `:9443` access after nginx-proxy goes live (operational hygiene)
+- Map the `app-admin` Keycloak realm role to a Portainer team via the OAuth2 team-sync API (CE supports this) so promotion doesn't require manual Portainer-side admin clicks
+- Drop the direct `:9443` host port permanently (currently still commented but available)


@@ -2,6 +2,11 @@
 # Networks: mgmt (docker socket plane) + app (nginx-proxy reaches HTTPS upstream)
 # Ingress: nginx-proxy → portainer:9443 (self-signed upstream cert) → ops.wbd-rd.nl
 #
+# Auth: gated by oauth2-proxy (Keycloak wbd realm) AT the nginx layer; Portainer
+# itself uses its local admin account (Portainer-CE has no native OIDC).
+# The admin user is pre-seeded via --admin-password so the 5-minute setup-window
+# lockout that Portainer applies to fresh installs can never trigger.
+#
 # Direct :9443 host access is intentionally NOT published anymore — re-enable
 # only for emergency ops by uncommenting the `ports:` block below.
@@ -10,6 +15,8 @@ services:
     image: portainer/portainer-ce:2.21.4
     restart: unless-stopped
     networks: [mgmt, app]
+    command:
+      - --admin-password=${PORTAINER_ADMIN_PASSWORD_HASH}
     # ports:
     #   - "9443:9443" # HTTPS UI direct access (emergency ops only)
     #   - "8000:8000" # Edge-agent reverse tunnel (open when wiring edges)


@@ -6,7 +6,7 @@ Used at both cloud and edge. The `mosquitto` stack is reserved for the FROST Sen
 - **Network**: `app` — no published port
 - **External MQTT** clients reach `nginx-proxy:8883` (cloud) which stream-proxies to `rabbitmq:1883` internally. Edge brokers are internal-only.
-- **Management UI**: port 15672 → reverse-proxied through nginx-proxy
+- **Management UI**: port 15672 → reverse-proxied through nginx-proxy at `mq.wbd-rd.nl`, gated by `oauth2-proxy-rabbitmq` (Keycloak `wbd` realm). Users still log into the RabbitMQ UI itself with the local broker credentials (`RABBITMQ_USER` / `RABBITMQ_PASSWORD`); the SSO gate only restricts who can reach the page. Wiring RabbitMQ's native OAuth2 plugin so the UI accepts Keycloak tokens directly is a TODO.
 - **Plugins to enable** in `config/enabled_plugins`:
   ```
   [rabbitmq_management, rabbitmq_mqtt, rabbitmq_web_mqtt].