Rename eval/ → simulations/ and fix log-write bug
Per discussion: "test" and "eval" overlap in meaning; "simulations" is more honest about what's actually happening — scripted plant inputs driving a physics sim, then recorded for analysis. Rename scope: - eval/ → simulations/ (tracked as git renames) - Internal references in run.js and README.md updated - wiki/modes/mpc.md link updated Also fixes a log-write bug noticed during the rename: - run.js didn't mkdir simulations/logs/ before createWriteStream, so the stream opened into a potentially non-existent dir and the file never materialised. Added fs.mkdirSync(..., recursive:true). - end() wasn't awaited, so the process could exit before the stream flushed. Now awaits the 'finish' event. Confirmed: 1200 records actually land in simulations/logs/<scenario>.jsonl. - Added simulations/logs/.gitignore so future JSONL artefacts stay out of the repo but the dir remains tracked. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -6,18 +6,18 @@ Scenario-based evaluation for pumpingStation. Each scenario scripts a stream of
|
||||
|
||||
```bash
|
||||
# One scenario
|
||||
node eval/run.js levelbased-steady
|
||||
node simulations/run.js levelbased-steady
|
||||
|
||||
# All scenarios at once
|
||||
node eval/run.js --all
|
||||
node simulations/run.js --all
|
||||
```
|
||||
|
||||
Per-tick records are written to `eval/logs/<scenario>.jsonl` for post-hoc analysis (e.g. streaming into InfluxDB for Grafana, or pandas / jq for one-off exploration).
|
||||
Per-tick records are written to `simulations/logs/<scenario>.jsonl` for post-hoc analysis (e.g. streaming into InfluxDB for Grafana, or pandas / jq for one-off exploration).
|
||||
|
||||
## Scenario file shape
|
||||
|
||||
```js
|
||||
// eval/scenarios/<name>.js
|
||||
// simulations/scenarios/<name>.js
|
||||
module.exports = {
|
||||
name: 'scenario-identifier',
|
||||
description: 'one sentence — what the scenario is testing',
|
||||
@@ -89,22 +89,22 @@ Duration: 1200s, 1s ticks
|
||||
✓ level stays above outflow: min level = 2.00 m (bound: ≥ 0.2)
|
||||
✓ no threshold issues on init: 0 threshold issues at startup (expected 0)
|
||||
|
||||
Log: eval/logs/levelbased-steady.jsonl (1200 records)
|
||||
Log: simulations/logs/levelbased-steady.jsonl (1200 records)
|
||||
✅ PASS
|
||||
```
|
||||
|
||||
## Why separate from `test/`?
|
||||
|
||||
| | `test/` | `eval/` |
|
||||
| | `test/` | `simulations/` |
|
||||
|---|---|---|
|
||||
| runner | `node --test` | `node eval/run.js` |
|
||||
| runner | `node --test` | `node simulations/run.js` |
|
||||
| scope | one function / small behaviour | end-to-end scenario over time |
|
||||
| duration | milliseconds | seconds to minutes (simulated) |
|
||||
| assertion style | tight, exact (`assert.equal`) | tolerance / bounds / event counts |
|
||||
| output | TAP | summary table + JSONL for analysis |
|
||||
| purpose | catch regressions | analyse how the system responds to input |
|
||||
|
||||
Unit tests live under `test/basic/`, `test/integration/`, `test/edge/`. Scenarios live here under `eval/scenarios/`.
|
||||
Unit tests live under `test/basic/`, `test/integration/`, `test/edge/`. Scenarios live here under `simulations/scenarios/`.
|
||||
|
||||
## Sending logs to Grafana (optional)
|
||||
|
||||
@@ -116,7 +116,7 @@ jq -c '{
|
||||
tags: { scenario: "'$SCENARIO'" },
|
||||
fields: { level: .level, volume: .volume, demand: .percControl, safety: (.safetyActive|if . then 1 else 0 end) },
|
||||
timestamp: (.t | tonumber | . * 1000000000)
|
||||
}' eval/logs/$SCENARIO.jsonl \
|
||||
}' simulations/logs/$SCENARIO.jsonl \
|
||||
| influx write --bucket=telemetry ...
|
||||
```
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
// ASCII table summary of scenario samples.
|
||||
// Used by eval/run.js.
|
||||
// Used by simulations/run.js.
|
||||
|
||||
function pad(s, n, left = false) {
|
||||
s = String(s ?? '');
|
||||
2
simulations/logs/.gitignore
vendored
Normal file
2
simulations/logs/.gitignore
vendored
Normal file
@@ -0,0 +1,2 @@
|
||||
*.jsonl
|
||||
!.gitignore
|
||||
@@ -1,14 +1,14 @@
|
||||
#!/usr/bin/env node
|
||||
// Scenario runner for pumpingStation. Usage:
|
||||
//
|
||||
// node eval/run.js <scenario> # run one
|
||||
// node eval/run.js --all # run all scenarios
|
||||
// node simulations/run.js <scenario> # run one
|
||||
// node simulations/run.js --all # run all scenarios
|
||||
//
|
||||
// Each scenario lives in eval/scenarios/<name>.js and exports:
|
||||
// Each scenario lives in simulations/scenarios/<name>.js and exports:
|
||||
// { name, description, durationSec, config, setup?, inputs, expectations? }
|
||||
//
|
||||
// The runner ticks the station once per simulated second, records every
|
||||
// state into eval/logs/<name>.jsonl, prints a summary table + event log,
|
||||
// state into simulations/logs/<name>.jsonl, prints a summary table + event log,
|
||||
// and checks expectations.
|
||||
|
||||
const path = require('path');
|
||||
@@ -102,7 +102,9 @@ async function runScenario(name) {
|
||||
if (scenario.setup) await scenario.setup(ps);
|
||||
|
||||
const duration = scenario.durationSec ?? 600;
|
||||
const logPath = path.join(__dirname, 'logs', `${scenario.name}.jsonl`);
|
||||
const logDir = path.join(__dirname, 'logs');
|
||||
fs.mkdirSync(logDir, { recursive: true });
|
||||
const logPath = path.join(logDir, `${scenario.name}.jsonl`);
|
||||
const log = fs.createWriteStream(logPath);
|
||||
|
||||
const records = [];
|
||||
@@ -115,7 +117,8 @@ async function runScenario(name) {
|
||||
records.push(snap);
|
||||
log.write(JSON.stringify(snap) + '\n');
|
||||
}
|
||||
log.end();
|
||||
// Drain so the file is fully written before we return.
|
||||
await new Promise((resolve, reject) => { log.end(); log.on('finish', resolve); log.on('error', reject); });
|
||||
|
||||
return { ps, records, scenario, duration, logPath };
|
||||
} finally {
|
||||
@@ -174,7 +177,7 @@ async function runAndReport(name) {
|
||||
async function main() {
|
||||
const arg = process.argv[2];
|
||||
if (!arg) {
|
||||
console.error('Usage: node eval/run.js <scenario> | --all');
|
||||
console.error('Usage: node simulations/run.js <scenario> | --all');
|
||||
console.error('Available:', fs.readdirSync(path.join(__dirname, 'scenarios')).map((f) => f.replace(/\.js$/, '')).join(', '));
|
||||
process.exit(1);
|
||||
}
|
||||
@@ -78,7 +78,7 @@ Blocks:
|
||||
|
||||
## Diagram 2 — scenario time-series
|
||||
|
||||
A much more useful way to evaluate MPC is to plot *what it did* over a simulated scenario: level, planned vs actual demand, the cost function breakdown, the active constraints. The [eval harness](../../eval/README.md) is built for exactly this — MPC will need a dedicated scenario like `mpc-storm-with-forecast.js`.
|
||||
A much more useful way to evaluate MPC is to plot *what it did* over a simulated scenario: level, planned vs actual demand, the cost function breakdown, the active constraints. The [simulations harness](../../simulations/README.md) is built for exactly this — MPC will need a dedicated scenario like `mpc-storm-with-forecast.js`.
|
||||
|
||||
```
|
||||
Placeholder — replace with:
|
||||
@@ -146,4 +146,4 @@ demand = plan.command[0]
|
||||
- [Functional description](../functional-description.md) — basin model + safety layer
|
||||
- [modes/levelbased.md](levelbased.md) — Tier 1 — the "default" MPC falls back to
|
||||
- [modes/powerbased.md](powerbased.md) — Tier 2 — MPC generalises the clip idea into full optimisation
|
||||
- [eval/README.md](../../eval/README.md) — where MPC evaluation scenarios will live
|
||||
- [simulations/README.md](../../simulations/README.md) — where MPC simulation scenarios will live
|
||||
|
||||
Reference in New Issue
Block a user