feat: SQL=postgres, nginx+certbot, MQTT split, ML stacks, gitea HTTPS-only, gemaal1 site

Round-2 changes locking in scaffold-phase decisions and adding ML/notebook stacks.

Locked decisions
- sql: postgres 16-alpine (was TBD); init.d/ mount for per-app DB provisioning
- nginx-proxy: stock nginx + certbot sidecar (was nginx:alpine TODO).
  Chose stock over nginxproxy/nginx-proxy because stream{} is required for
  MQTT-TLS reverse-proxy on tcp/8883 to rabbitmq:1883.
- gitea: HTTPS-only (DISABLE_SSH=true). No SSH port published.

MQTT split
- Remove stacks/mqtt placeholder.
- Add stacks/rabbitmq — general-purpose broker (AMQP + MQTT plugin),
  used at both cloud and edge. External MQTT clients reach cloud broker
  via nginx stream-proxy on 8883.
- Add stacks/mosquitto — reserved for the FROST (SensorThings) stack
  only. Cloud-only. Internal to its own stack; no external ingress.

ML / notebooks (cloud-only)
- stacks/mlflow — experiment tracking + model registry. Postgres backend
  on sql stack; local volume for artifacts (S3/MinIO is a TODO).
- stacks/jupyterhub — multi-user notebook server. DockerSpawner via
  mounted docker.sock; users spawn into cloud-app network so they can
  reach mlflow, influxdb (via grafana), rabbitmq.

Sites
- sites/gemaal1 — first edge deployment scaffold. Site-local override
  template for binding nginx to PLANT_LAN_IP.

Docs
- README + docs/architecture.md updated: stacks table now lists 15 stacks,
  ingress + attachment tables reflect mlflow/jupyterhub, TLS strategy
  section locked, MQTT-split section added, Gitea HTTPS-only noted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
znetsixe
2026-05-21 13:22:46 +02:00
parent 8ab9061983
commit 2f5e3b4183
30 changed files with 492 additions and 116 deletions

stacks/mlflow/README.md Normal file

@@ -0,0 +1,12 @@
# mlflow

MLflow tracking server + model registry. Used by data scientists running experiments from JupyterHub or local laptops. **Cloud-only.**

- **Networks**: `app` (UI on port 5000, reverse-proxied at `/mlflow` or subdomain) + `data` (postgres backend in `sql`)
- **Backend store**: postgres database `mlflow` — must be provisioned by `sql/config/init.d/`
- **Artifact store**: local volume `mlflow-artifacts`. Switch to S3/MinIO when artifact volume grows beyond a few GB.
- **TODO**:
  - Provision `mlflow` DB + role in `sql` init scripts
  - Keycloak OIDC via nginx `auth_request` (MLflow has no native OIDC support, so the proxy must front it)
  - MinIO sidecar for S3-compatible artifact store
  - Retention / cleanup policy for stale runs
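
A minimal sketch of how the backend store and artifact store described above could be wired into the `mlflow server` command line. The hostname `sql`, port `5432`, the artifact mount path `/mlflow/artifacts`, and both helper names are assumptions for illustration, not values taken from this stack's compose file. It also assumes a postgres driver such as `psycopg2` is installed in the MLflow image.

```python
from urllib.parse import quote


def backend_store_uri(user: str, password: str, host: str = "sql",
                      port: int = 5432, database: str = "mlflow") -> str:
    """Build a postgres URI for MLflow's backend store.

    `host` and `port` are assumed defaults; adjust to the real service name.
    Credentials are percent-encoded so characters like '@' or '/' in a
    password do not break URI parsing.
    """
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )


def mlflow_server_argv(store_uri: str,
                       artifact_root: str = "/mlflow/artifacts",
                       port: int = 5000) -> list[str]:
    """Assemble the argv for `mlflow server`.

    `artifact_root` is a hypothetical mount point for the
    `mlflow-artifacts` volume.
    """
    return [
        "mlflow", "server",
        "--backend-store-uri", store_uri,
        "--default-artifact-root", artifact_root,
        "--host", "0.0.0.0",  # listen on all interfaces inside the container
        "--port", str(port),
    ]


if __name__ == "__main__":
    uri = backend_store_uri("mlflow", "change-me")
    print(" ".join(mlflow_server_argv(uri)))
```

Building the argv as a list keeps the password out of shell quoting, and swapping `artifact_root` for an `s3://` URI later would cover the MinIO TODO without touching the backend store.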