feat: SQL=postgres, nginx+certbot, MQTT split, ML stacks, gitea HTTPS-only, gemaal1 site

Round-2 changes locking in scaffold-phase decisions and adding ML/notebook stacks.

Locked decisions
- sql: postgres 16-alpine (was TBD); init.d/ mount for per-app DB provisioning
- nginx-proxy: stock nginx + certbot sidecar (was nginx:alpine TODO).
  Chose stock over nginxproxy/nginx-proxy because stream{} is required for
  MQTT-TLS reverse-proxy on tcp/8883 to rabbitmq:1883.
- gitea: HTTPS-only (DISABLE_SSH=true). No SSH port published.

MQTT split
- Remove stacks/mqtt placeholder.
- Add stacks/rabbitmq — general-purpose broker (AMQP + MQTT plugin),
  used at both cloud and edge. External MQTT clients reach cloud broker
  via nginx stream-proxy on 8883.
- Add stacks/mosquitto — reserved for the FROST (SensorThings) stack
  only. Cloud-only. Internal to its own stack; no external ingress.

ML / notebooks (cloud-only)
- stacks/mlflow — experiment tracking + model registry. Postgres backend
  on sql stack; local volume for artifacts (S3/MinIO is a TODO).
- stacks/jupyterhub — multi-user notebook server. DockerSpawner via
  mounted docker.sock; users spawn into cloud-app network so they can
  reach mlflow, influxdb (via grafana), rabbitmq.

Sites
- sites/gemaal1 — first edge deployment scaffold. Site-local override
  template for binding nginx to PLANT_LAN_IP.

Docs
- README + docs/architecture.md updated: stacks table now lists 15 stacks,
  ingress + attachment tables reflect mlflow/jupyterhub, TLS strategy
  section locked, MQTT-split section added, Gitea HTTPS-only noted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
3
stacks/mlflow/.env.example
Normal file
@@ -0,0 +1,3 @@
MLFLOW_DB_NAME=mlflow
MLFLOW_DB_USER=mlflow
MLFLOW_DB_PASSWORD=

12
stacks/mlflow/README.md
Normal file
@@ -0,0 +1,12 @@
# mlflow

MLflow tracking server + model registry. Used by data scientists running experiments from JupyterHub or local laptops. **Cloud-only.**

- **Networks**: `app` (UI on port 5000, reverse-proxied at `/mlflow` or subdomain) + `data` (postgres backend in `sql`)
- **Backend store**: postgres database `mlflow` — must be provisioned by `sql/config/init.d/`
- **Artifact store**: local volume `mlflow-artifacts`. Switch to S3/MinIO when artifact volume grows beyond a few GB.
- **TODO**:
  - Provision `mlflow` DB + role in `sql` init scripts
  - Keycloak OIDC via nginx `auth_request` (MLflow has no native auth — must front-end it)
  - MinIO sidecar for S3-compatible artifact store
  - Retention / cleanup policy for stale runs

27
stacks/mlflow/compose.yml
Normal file
@@ -0,0 +1,27 @@
# mlflow — experiment tracking + model registry (cloud only)
# Networks: app (UI on 5000, proxied by nginx) + data (postgres backend on sql stack)

services:
  mlflow:
    image: ghcr.io/mlflow/mlflow:v2.18.0
    restart: unless-stopped
    networks: [app, data]
    command: >
      mlflow server
      --host 0.0.0.0
      --port 5000
      --backend-store-uri postgresql://${MLFLOW_DB_USER}:${MLFLOW_DB_PASSWORD}@sql:5432/${MLFLOW_DB_NAME}
      --default-artifact-root /mlflow/artifacts
      --serve-artifacts
    volumes:
      - mlflow-artifacts:/mlflow/artifacts
    environment:
      TZ: ${TZ:-Europe/Amsterdam}
    # TODO: switch artifact store to S3/MinIO; Keycloak OIDC via nginx auth_request

networks:
  app:
  data:

volumes:
  mlflow-artifacts:
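
Review note on `--backend-store-uri`: the compose file interpolates `MLFLOW_DB_PASSWORD` straight into a `postgresql://` URI, so a password containing `@`, `/` or `:` would likely be misparsed. A minimal sketch of percent-encoding the credentials before they go into `.env` is shown here; the helper name `backend_store_uri` is hypothetical and not part of this commit, and it assumes the URI is parsed in the usual SQLAlchemy style, where percent-escapes in the user and password are decoded.

```python
from urllib.parse import quote


def backend_store_uri(user: str, password: str, db: str,
                      host: str = "sql", port: int = 5432) -> str:
    """Build a postgresql:// URI with every URI component percent-encoded.

    Hypothetical helper for illustration; host and port defaults mirror
    the values hard-coded in stacks/mlflow/compose.yml.
    """
    # safe="" forces reserved characters such as '@', '/' and ':' to be escaped.
    enc_user = quote(user, safe="")
    enc_password = quote(password, safe="")
    enc_db = quote(db, safe="")
    return f"postgresql://{enc_user}:{enc_password}@{host}:{port}/{enc_db}"


if __name__ == "__main__":
    # Example password with characters that would otherwise break the URI.
    print(backend_store_uri("mlflow", "p@ss/w:rd", "mlflow"))
```

If this approach were adopted, the encoded form of the password (the output of `quote`) would be the value stored in `MLFLOW_DB_PASSWORD`, while the postgres role itself keeps the raw password.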