Rename eval/ → simulations/ and fix log-write bug

Per discussion: "test" and "eval" overlap in meaning; "simulations" is more honest about what's actually happening: scripted plant inputs driving a physics sim, then recorded for analysis.

Rename scope:
- eval/ → simulations/ (tracked as git renames)
- Internal references in run.js and README.md updated
- wiki/modes/mpc.md link updated

Also fixes a log-write bug noticed during the rename:
- run.js didn't mkdir simulations/logs/ before createWriteStream, so the stream opened into a potentially non-existent dir and the file never materialised. Added fs.mkdirSync(..., { recursive: true }).
- end() wasn't awaited, so the process could exit before the stream flushed. Now awaits the 'finish' event. Confirmed: 1200 records actually land in simulations/logs/<scenario>.jsonl.
- Added simulations/logs/.gitignore so future JSONL artefacts stay out of the repo but the dir remains tracked.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
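
For reference, the two log-write fixes described in this message can be reproduced in isolation. The following is a minimal sketch, not the code in run.js: `writeJsonl` is a hypothetical helper name, and the path and records passed to it are up to the caller. It assumes only Node's built-in `fs` and `path` modules.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not part of run.js): writes one JSON object per line
// and resolves only after the stream has flushed to disk.
async function writeJsonl(logPath, records) {
  // Create the target directory first; { recursive: true } makes this a
  // no-op when the directory already exists.
  fs.mkdirSync(path.dirname(logPath), { recursive: true });

  const log = fs.createWriteStream(logPath);

  // Attach listeners before writing so an early open/write error rejects
  // the promise instead of surfacing as an unhandled 'error' event.
  const flushed = new Promise((resolve, reject) => {
    log.on('finish', resolve);
    log.on('error', reject);
  });

  for (const rec of records) {
    log.write(JSON.stringify(rec) + '\n');
  }
  log.end();

  await flushed; // 'finish' fires once all data has been handed to the OS
  return records.length;
}
```

Awaiting `writeJsonl(...)` before the process exits is what guarantees the file is complete. Attaching the listeners ahead of the first write is a choice made in this sketch; the commit itself attaches them right after `end()`.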
2026-04-22 17:46:10 +02:00
parent 66fd3feff8
commit 3e13512a83
8 changed files with 24 additions and 19 deletions
--- a/simulations/README.md
+++ b/simulations/README.md
@@ -6,18 +6,18 @@ Scenario-based evaluation for pumpingStation. Each scenario scripts a stream of

 ```bash
 # One scenario
-node eval/run.js levelbased-steady
+node simulations/run.js levelbased-steady

 # All scenarios at once
-node eval/run.js --all
+node simulations/run.js --all
 ```

-Per-tick records are written to `eval/logs/<scenario>.jsonl` for post-hoc analysis (e.g. streaming into InfluxDB for Grafana, or pandas / jq for one-off exploration).
+Per-tick records are written to `simulations/logs/<scenario>.jsonl` for post-hoc analysis (e.g. streaming into InfluxDB for Grafana, or pandas / jq for one-off exploration).

 ## Scenario file shape

 ```js
-// eval/scenarios/<name>.js
+// simulations/scenarios/<name>.js
 module.exports = {
  name: 'scenario-identifier',
  description: 'one sentence — what the scenario is testing',
@@ -89,22 +89,22 @@ Duration: 1200s, 1s ticks
  ✓ level stays above outflow: min level = 2.00 m (bound: ≥ 0.2)
  ✓ no threshold issues on init: 0 threshold issues at startup (expected 0)

-Log: eval/logs/levelbased-steady.jsonl (1200 records)
+Log: simulations/logs/levelbased-steady.jsonl (1200 records)
 ✅ PASS
 ```

 ## Why separate from `test/`?

-| | `test/` | `eval/` |
+| | `test/` | `simulations/` |
 |---|---|---|
-| runner | `node --test` | `node eval/run.js` |
+| runner | `node --test` | `node simulations/run.js` |
 | scope | one function / small behaviour | end-to-end scenario over time |
 | duration | milliseconds | seconds to minutes (simulated) |
 | assertion style | tight, exact (`assert.equal`) | tolerance / bounds / event counts |
 | output | TAP | summary table + JSONL for analysis |
 | purpose | catch regressions | analyse how the system responds to input |

-Unit tests live under `test/basic/`, `test/integration/`, `test/edge/`. Scenarios live here under `eval/scenarios/`.
+Unit tests live under `test/basic/`, `test/integration/`, `test/edge/`. Scenarios live here under `simulations/scenarios/`.

 ## Sending logs to Grafana (optional)

@@ -116,7 +116,7 @@ jq -c '{
  tags: { scenario: "'$SCENARIO'" },
  fields: { level: .level, volume: .volume, demand: .percControl, safety: (.safetyActive|if . then 1 else 0 end) },
  timestamp: (.t | tonumber | . * 1000000000)
-}' eval/logs/$SCENARIO.jsonl \
+}' simulations/logs/$SCENARIO.jsonl \
  | influx write --bucket=telemetry ...
 ```

--- a/simulations/formatters/table.js
+++ b/simulations/formatters/table.js
@@ -1,5 +1,5 @@
 // ASCII table summary of scenario samples.
-// Used by eval/run.js.
+// Used by simulations/run.js.

 function pad(s, n, left = false) {
  s = String(s ?? '');
--- a/simulations/logs/.gitignore
+++ b/simulations/logs/.gitignore
@@ -0,0 +1,2 @@
+*.jsonl
+!.gitignore
--- a/simulations/run.js
+++ b/simulations/run.js
@@ -1,14 +1,14 @@
 #!/usr/bin/env node
 // Scenario runner for pumpingStation. Usage:
 //
-//   node eval/run.js <scenario>            # run one
-//   node eval/run.js --all                 # run all scenarios
+//   node simulations/run.js <scenario>            # run one
+//   node simulations/run.js --all                 # run all scenarios
 //
-// Each scenario lives in eval/scenarios/<name>.js and exports:
+// Each scenario lives in simulations/scenarios/<name>.js and exports:
 //   { name, description, durationSec, config, setup?, inputs, expectations? }
 //
 // The runner ticks the station once per simulated second, records every
-// state into eval/logs/<name>.jsonl, prints a summary table + event log,
+// state into simulations/logs/<name>.jsonl, prints a summary table + event log,
 // and checks expectations.

 const path = require('path');
@@ -102,7 +102,9 @@ async function runScenario(name) {
    if (scenario.setup) await scenario.setup(ps);

    const duration = scenario.durationSec ?? 600;
-    const logPath = path.join(__dirname, 'logs', `${scenario.name}.jsonl`);
+    const logDir = path.join(__dirname, 'logs');
+    fs.mkdirSync(logDir, { recursive: true });
+    const logPath = path.join(logDir, `${scenario.name}.jsonl`);
    const log = fs.createWriteStream(logPath);

    const records = [];
@@ -115,7 +117,8 @@ async function runScenario(name) {
      records.push(snap);
      log.write(JSON.stringify(snap) + '\n');
    }
-    log.end();
+    // Drain so the file is fully written before we return.
+    await new Promise((resolve, reject) => { log.end(); log.on('finish', resolve); log.on('error', reject); });

    return { ps, records, scenario, duration, logPath };
  } finally {
@@ -174,7 +177,7 @@ async function runAndReport(name) {
 async function main() {
  const arg = process.argv[2];
  if (!arg) {
-    console.error('Usage: node eval/run.js <scenario> | --all');
+    console.error('Usage: node simulations/run.js <scenario> | --all');
    console.error('Available:', fs.readdirSync(path.join(__dirname, 'scenarios')).map((f) => f.replace(/\.js$/, '')).join(', '));
    process.exit(1);
  }
--- a/simulations/scenarios/levelbased-steady.js
+++ b/simulations/scenarios/levelbased-steady.js
--- a/simulations/scenarios/levelbased-storm.js
+++ b/simulations/scenarios/levelbased-storm.js
--- a/simulations/scenarios/safety-dry-run-trip.js
+++ b/simulations/scenarios/safety-dry-run-trip.js
--- a/wiki/modes/mpc.md
+++ b/wiki/modes/mpc.md
@@ -78,7 +78,7 @@ Blocks:

 ## Diagram 2 — scenario time-series

-A much more useful way to evaluate MPC is to plot *what it did* over a simulated scenario: level, planned vs actual demand, the cost function breakdown, the active constraints. The [eval harness](../../eval/README.md) is built for exactly this — MPC will need a dedicated scenario like `mpc-storm-with-forecast.js`.
+A much more useful way to evaluate MPC is to plot *what it did* over a simulated scenario: level, planned vs actual demand, the cost function breakdown, the active constraints. The [simulations harness](../../simulations/README.md) is built for exactly this — MPC will need a dedicated scenario like `mpc-storm-with-forecast.js`.

 ```
 Placeholder — replace with:
@@ -146,4 +146,4 @@ demand = plan.command[0]
 - [Functional description](../functional-description.md) — basin model + safety layer
 - [modes/levelbased.md](levelbased.md) — Tier 1 — the "default" MPC falls back to
 - [modes/powerbased.md](powerbased.md) — Tier 2 — MPC generalises the clip idea into full optimisation
-- [eval/README.md](../../eval/README.md) — where MPC evaluation scenarios will live
+- [simulations/README.md](../../simulations/README.md) — where MPC simulation scenarios will live