EVOLV/.agents/decisions/DECISION-20260422-pumpingstation-simulations-harness.md
znetsixe d22d1cabd1
Rename eval/ decision log to simulations/; bump pumpingStation pointer
Follows pumpingStation@3e13512 (rename eval/ → simulations/). The
decision log file is renamed to match the new folder name; an
addendum in the body explains that the rename was a naming
clarification, not a rationale change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 17:47:00 +02:00

DECISION-20260422-pumpingstation-simulations-harness

Context

  • Task/request: Provide a way to fluctuate inputs to the pumpingStation and observe the system's response over time, in a readable form suitable for post-hoc analysis (operator review, Grafana, or ad-hoc debugging).
  • Impacted files/contracts: nodes/pumpingStation/simulations/*, test/basic/*.
  • Why a decision is required now: Unit tests (node --test) verify individual functions in isolation. They are poorly suited to showing, for example, what the level looks like over 20 minutes of storm surge. That is a different artefact.

Options

  1. Extend unit tests to cover scenarios
  • Benefits: Single testing surface.
  • Risks: Unit tests are assertion-heavy and slow to read; scenario output (tables, events) gets lost in TAP.
  2. Separate simulations/ folder with a scenario runner (selected)
  • Benefits: Scenarios read as narratives ("steady state", "storm surge", "safety dry-run"); output is human-friendly (ASCII table + events + expectation checks); JSONL per-tick log enables Grafana streaming or offline analysis.
  • Risks: Second test surface to maintain.
  3. Real-time Node-RED deployment + observe
  • Benefits: Closest to production.
  • Risks: Slow, requires infrastructure, irreproducible.

Decision

  • Selected option: Option 2.
  • Decision owner: User
  • Date: 2026-04-22
  • Rationale: Unit tests answer "is this function correct?"; scenarios answer "how does the system behave under this input profile?". These are two distinct questions, so they get two distinct tools. The split also matches the .claude/rules/testing.md 3-tier convention (basic/integration/edge), which covers asserted behaviours, not scenario replay.

Addendum (same-day rename)

Folder was initially named eval/. Renamed to simulations/ in commit pumpingStation@3e13512 — eval and test are near-synonyms so the split implied a conceptual difference that doesn't really exist. simulations/ is more honest about what's happening (scripted plant inputs driving a physics sim, recorded for analysis). Rationale above is unchanged; only the folder name is.

Architecture

test/
  basic/ integration/ edge/   — node:test + assertions
simulations/
  run.js                      — scenario driver
  scenarios/*.js              — each exports { name, config, setup, inputs(t,ps), expectations }
  formatters/table.js         — ASCII summary
  logs/*.jsonl                — one-line-per-tick output
  README.md                   — usage + how to pipe into Grafana
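
To make the scenario contract concrete, here is a minimal sketch of what a file under scenarios/ might look like. Only the exported keys (name, config, setup, inputs(t, ps), expectations) come from this decision log. The file name, the config fields, the shape of the returned inputs, the expectation format and all numbers are assumptions for illustration, not the actual starter scenarios.

```javascript
// Hypothetical scenario module, e.g. simulations/scenarios/example-surge.js.
// The exported keys follow the layout above; everything inside them is assumed.
const scenario = {
  name: 'example-surge',

  // Assumed config: 1200 ticks is 20 minutes at one simulated second per tick.
  config: { durationTicks: 1200 },

  // Assumed hook: prepare the pumpingStation instance (ps) before the first tick.
  setup(ps) {
    // Intentionally a no-op in this sketch.
  },

  // Assumed contract: return the plant inputs for tick t (illustrative units).
  // Inflow ramps linearly from base to peak between t=300 and t=600,
  // then back down to base by t=900.
  inputs(t, ps) {
    const base = 10;
    const peak = 50;
    const start = 300;
    const mid = 600;
    const end = 900;
    let inflow = base;
    if (t > start && t <= mid) {
      inflow = base + ((peak - base) * (t - start)) / (mid - start);
    } else if (t > mid && t < end) {
      inflow = peak - ((peak - base) * (t - mid)) / (end - mid);
    }
    return { inflow };
  },

  // Assumed format: each expectation checks the full list of per-tick records.
  expectations: [
    {
      label: 'level stays below 3 (illustrative threshold)',
      check: (records) => records.every((r) => r.level < 3),
    },
  ],
};

module.exports = scenario;
```

Keeping inputs(t, ps) a pure function of the tick number is one way to keep a scenario reproducible, which is the property Option 3 lacks.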

The driver monkey-patches Date.now() so the volume integrator sees exactly 1 second per tick regardless of wall-clock time. Every tick records a state snapshot (level, volume, direction, netFlow, flowSource, demand, mode, safetyActive) to JSONL for streaming.
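
The clock patching described above can be sketched as follows. This is not the code in run.js: runWithSimulatedClock and makeIntegrator are hypothetical names, and the toy integrator only stands in for the real volume integrator to show why a Date.now()-based integrator sees a fixed 1-second step.

```javascript
// Run a tick loop with Date.now() replaced by a simulated clock that advances
// exactly 1000 ms per tick. The real clock is always restored afterwards.
function runWithSimulatedClock({ ticks, setup, onTick }) {
  const realNow = Date.now;
  let simulatedMs = 0;
  Date.now = () => simulatedMs;
  try {
    const state = setup(); // created under the simulated clock (time 0)
    for (let t = 1; t <= ticks; t += 1) {
      simulatedMs = t * 1000;
      onTick(t, state);
    }
  } finally {
    Date.now = realNow;
  }
}

// Toy stand-in for the volume integrator: integrates net flow over the
// elapsed time reported by Date.now().
function makeIntegrator(initialVolume) {
  let volume = initialVolume;
  let lastMs = Date.now();
  return {
    step(netFlowPerSecond) {
      const nowMs = Date.now();
      volume += (netFlowPerSecond * (nowMs - lastMs)) / 1000;
      lastMs = nowMs;
      return volume;
    },
  };
}

// One JSON object per line, one line per tick (a subset of the snapshot fields).
const jsonl = [];
runWithSimulatedClock({
  ticks: 3,
  setup: () => makeIntegrator(100),
  onTick: (t, integrator) => {
    const netFlow = 2; // illustrative constant net inflow per second
    const volume = integrator.step(netFlow);
    jsonl.push(JSON.stringify({ t, volume, netFlow }));
  },
});

console.log(jsonl.join('\n')); // volume is 102, 104, 106 on ticks 1, 2, 3
```

Restoring Date.now in a finally block matters because a scenario that throws would otherwise leave the patched clock in place for whatever runs next in the same process.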

Consequences

  • Compatibility impact: None.
  • Safety/security impact: None — read-only simulation.
  • Data/operations impact: Running node simulations/run.js --all produces artefacts that can be checked into CI for regression (e.g. "did the storm scenario's max level rise compared to last release?"). The JSONL format is friendly to InfluxDB/Grafana for interactive review.
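
The regression question quoted above ("did the storm scenario's max level rise compared to last release?") could be answered with a small comparison over two JSONL logs. This is a sketch under assumptions: maxLevel and compareMaxLevel are hypothetical helpers, not part of the harness, and the sample records and tolerance are made up. Only the level field name comes from the snapshot described under Architecture.

```javascript
// Return the highest `level` value found in a JSONL string (one record per line).
function maxLevel(jsonlText) {
  let max = -Infinity;
  for (const line of jsonlText.split('\n')) {
    if (line.trim() === '') continue; // tolerate a trailing newline
    const record = JSON.parse(line);
    if (typeof record.level === 'number' && record.level > max) {
      max = record.level;
    }
  }
  return max;
}

// Compare a baseline log against a current log. `tolerance` is an assumed
// knob for ignoring small differences.
function compareMaxLevel(baselineText, currentText, tolerance = 0) {
  const baseline = maxLevel(baselineText);
  const current = maxLevel(currentText);
  return { baseline, current, regressed: current > baseline + tolerance };
}

// Illustrative data only; real logs would be read from simulations/logs/*.jsonl.
const baselineLog = '{"level":1.5}\n{"level":2}\n{"level":1.5}\n';
const currentLog = '{"level":1.5}\n{"level":2.5}\n{"level":2}\n';

const result = compareMaxLevel(baselineLog, currentLog, 0.1);
console.log(result); // { baseline: 2, current: 2.5, regressed: true }
```

In CI, a non-zero exit code when regressed is true would turn this into a gate; whether the project wants a hard gate or just a report is left open here.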

Implementation Notes

  • Required code/doc updates: Driver + three starter scenarios (levelbased-steady, levelbased-storm, safety-dry-run-trip) + README in simulations/.
  • Validation evidence required: node simulations/run.js --all exits 0; manual inspection of JSONL confirms per-tick records make physical sense.
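
Part of the manual JSONL inspection could be backed by a mechanical pre-check. The sketch below flags records that are missing a snapshot field or carry a non-finite or negative level or volume. The field list comes from the Architecture section; findSuspectRecords, the non-negativity rule and the sample values are assumptions, and passing this check does not replace judging whether the records make physical sense.

```javascript
// Snapshot fields named in the Architecture section.
const REQUIRED_FIELDS = [
  'level', 'volume', 'direction', 'netFlow',
  'flowSource', 'demand', 'mode', 'safetyActive',
];

// Return a list of { line, issue } entries for records that look wrong.
function findSuspectRecords(jsonlText) {
  const problems = [];
  jsonlText.split('\n').forEach((line, index) => {
    if (line.trim() === '') return; // skip blank lines
    const record = JSON.parse(line);
    for (const field of REQUIRED_FIELDS) {
      if (!(field in record)) {
        problems.push({ line: index + 1, issue: `missing ${field}` });
      }
    }
    // Assumed rule: level and volume should be finite and non-negative.
    for (const field of ['level', 'volume']) {
      const value = record[field];
      if (field in record && (!Number.isFinite(value) || value < 0)) {
        problems.push({ line: index + 1, issue: `${field} is not a finite, non-negative number` });
      }
    }
  });
  return problems;
}

// Illustrative records: the second one lacks `mode` and has a negative volume.
const sample = [
  JSON.stringify({ level: 1.2, volume: 12, direction: 'filling', netFlow: 0.5, flowSource: 'measured', demand: 0, mode: 'levelbased', safetyActive: false }),
  JSON.stringify({ level: 1.1, volume: -1, direction: 'draining', netFlow: -0.5, flowSource: 'measured', demand: 50, safetyActive: false }),
].join('\n');

console.log(findSuspectRecords(sample));
```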

Rollback / Migration

  • Rollback strategy: Delete simulations/. Unit tests continue to work.
  • Migration/deprecation plan: N/A.