- local InfluxDB is required for operational resilience
- central acts as the advisory/intelligence and API-entry layer, not as a direct field caller
- intended configuration authority is the database-backed `tagcodering` model
- architecture wiki pages should be visual, not text-only
## 1. What Exists Today
### 1.1 Product/runtime layer
The codebase is currently a modular Node-RED package for wastewater/process automation:
- EVOLV ships custom Node-RED nodes for plant assets and process logic
- nodes emit both process/control messages and telemetry-oriented outputs
- shared helper logic lives in `nodes/generalFunctions/`
- Grafana-facing integration exists through `dashboardAPI` and Influx-oriented outputs
### 1.2 Implemented development stack
The concrete development stack in this repository is:
- Node-RED
- InfluxDB 2.x
- Grafana
That gives a clear local flow:
1. EVOLV logic runs in Node-RED.
2. Telemetry is emitted in a time-series-oriented shape.
3. InfluxDB stores the telemetry.
4. Grafana renders operational dashboards.
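
Step 2 can be illustrated with a minimal sketch that turns one telemetry point into InfluxDB line protocol. The measurement, tag, and field names are hypothetical, only numeric fields are handled, and the timestamp is assumed to be written with millisecond precision; this is not the EVOLV output format.

```javascript
// Escape commas, equals signs, and spaces in tag keys, tag values, and field keys.
function escapeKey(value) {
  return String(value).replace(/([,= ])/g, "\\$1");
}

// Measurement names only need commas and spaces escaped.
function escapeMeasurement(value) {
  return String(value).replace(/([, ])/g, "\\$1");
}

// Build one line: measurement,tag=value field=value timestamp
// Assumption: every field is numeric, so it is written as a float.
function toLineProtocol(point) {
  const tags = Object.keys(point.tags || {})
    .sort()
    .map((key) => `,${escapeKey(key)}=${escapeKey(point.tags[key])}`)
    .join("");
  const fields = Object.keys(point.fields)
    .map((key) => `${escapeKey(key)}=${Number(point.fields[key])}`)
    .join(",");
  return `${escapeMeasurement(point.measurement)}${tags} ${fields} ${point.timestampMs}`;
}

// Hypothetical point, for illustration only.
const line = toLineProtocol({
  measurement: "pump_station",
  tags: { site: "demo", asset: "pump 1" },
  fields: { flow: 12.5, power: 3 },
  timestampMs: 1700000000000,
});
console.log(line);
```

Sorting tag keys is optional, but it keeps the generated lines stable and easy to compare.
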
### 1.3 Existing runtime pattern in the nodes
A recurring EVOLV pattern is:
- output 0: process/control message
- output 1: Influx/telemetry message
- output 2: registration/control plumbing where relevant
Even in its current form, EVOLV is more than a Node-RED project: it is already a control-plus-observability platform, with Node-RED as the orchestration runtime and InfluxDB and Grafana as the telemetry and visualization services.
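
The three-output pattern can be sketched as a plain function that returns one message per output, the same array shape a Node-RED function node returns. The helper name, topics, and payload fields are hypothetical and are not taken from the EVOLV nodes.

```javascript
// Hypothetical helper: builds the [output 0, output 1, output 2] array.
// In Node-RED, a null entry means nothing is sent on that output.
function buildOutputs(asset, reading, timestampMs) {
  // Output 0: process/control message for downstream logic.
  const processMsg = {
    topic: `${asset.id}/process`,
    payload: { flow: reading.flow, running: reading.running },
  };
  // Output 1: telemetry in a time-series-oriented shape (assumed field names).
  const telemetryMsg = {
    topic: `${asset.id}/telemetry`,
    payload: {
      measurement: asset.type,
      tags: { asset: asset.id },
      fields: { flow: reading.flow },
      timestampMs,
    },
  };
  // Output 2: registration plumbing, only while the asset is not yet registered.
  const registrationMsg = asset.registered
    ? null
    : { topic: "register", payload: { id: asset.id, type: asset.type } };
  return [processMsg, telemetryMsg, registrationMsg];
}

const outputs = buildOutputs(
  { id: "pump-1", type: "pump", registered: true },
  { flow: 12.5, running: true },
  1700000000000
);
console.log(outputs);
```
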
## 2. What The Drawings Describe
Across `temp/fullStack.pdf` and `temp/CoreSync.drawio.pdf`, the intended platform is broader and layered.
### 2.1 Edge / OT layer
The drawings consistently place these capabilities at the edge:
- PLC / OPC UA connectivity
- Node-RED container as protocol translator and logic runtime
- local broker in some variants
- local InfluxDB / Prometheus-style storage in some variants
- local Grafana/SCADA in some variants
This is the plant-side operational layer.
### 2.2 Site / local server layer
The CoreSync drawings also show a site aggregation layer:
- RWZI-local server
- Node-RED / CoreSync services
- site-local broker
- site-local database
- upward API-based synchronization
This layer decouples field assets from central services and absorbs plant-specific complexity.
### 2.3 Central / cloud layer
The broader stack drawings and `temp/cloud.yml` show a central platform layer with:
- Gitea
- Jenkins
- reverse proxy / ingress
- Grafana
- InfluxDB
- Node-RED
- RabbitMQ / messaging
- VPN / tunnel concepts
- Keycloak in the drawing
- Portainer in the drawing
This is a platform-services layer, not just an application runtime.
## 3. Architecture Decisions From This Review
These decisions now shape the preferred EVOLV target architecture.
### 3.1 Local telemetry is mandatory for resilience
Local InfluxDB is not optional. It is required so that:
- operations continue when central SCADA or central services are down
- local dashboards and advanced digital-twin workflows can still consume recent and relevant process history
- local edge/site layers can make smarter decisions without depending on round-trips to central
### 3.2 Multi-level InfluxDB is part of the architecture
InfluxDB should exist on multiple levels where it adds operational value:
- edge/local for resilience and near-real-time replay
- site for plant-level history, diagnostics, and resilience
- central for fleet-wide analytics, benchmarking, and advisory intelligence
This does not mean copying the same data to every level. The design intent is event-driven, selective storage.
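
One possible reading of "selective" is that a lower level forwards per-window summaries upward instead of raw points. The sketch below assumes that reading; the function name, the window size, and the summary fields are hypothetical, and the drawings do not prescribe them.

```javascript
// Group raw points into fixed time windows and keep count, mean, min, and max.
// Assumption: each point is { t: epoch milliseconds, v: numeric value }.
function summarize(points, windowMs) {
  const windows = new Map();
  for (const { t, v } of points) {
    const start = Math.floor(t / windowMs) * windowMs;
    const w = windows.get(start) ||
      { start, count: 0, sum: 0, min: Infinity, max: -Infinity };
    w.count += 1;
    w.sum += v;
    w.min = Math.min(w.min, v);
    w.max = Math.max(w.max, v);
    windows.set(start, w);
  }
  return [...windows.values()]
    .sort((a, b) => a.start - b.start)
    .map((w) => ({
      start: w.start,
      count: w.count,
      mean: w.sum / w.count,
      min: w.min,
      max: w.max,
    }));
}

// Four raw points collapse into two one-second summaries.
const summaries = summarize(
  [
    { t: 0, v: 1 },
    { t: 500, v: 3 },
    { t: 1000, v: 10 },
    { t: 1500, v: 20 },
  ],
  1000
);
console.log(summaries);
```

Keeping `count`, `min`, and `max` next to the mean lets a higher level see how much detail was collapsed into each window.
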
### 3.3 Storage should be smart, not only deadband-driven
The target is neither a naive "store every point" policy nor a single fixed deadband rule such as 1%.
The desired storage approach is:
- observe signal slope and change behavior
- preserve points where state is changing materially
- store fewer points where the signal can be reconstructed downstream with sufficient fidelity
- carry enough metadata or conventions so reconstruction quality is auditable
This implies EVOLV should evolve toward smart storage and signal-aware retention rather than naive event dumping.
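
As a rough illustration of signal-aware retention, the sketch below keeps a point only when linear interpolation between the retained neighbours would miss a dropped point by more than a tolerance. It is a simplified linear-interpolation filter chosen for clarity, not the algorithm EVOLV uses or has committed to; the function names, the metadata fields, and the tolerance are assumptions.

```javascript
// Keep the fewest points such that linear interpolation between kept points
// reproduces every dropped point within `tolerance`.
// Assumption: points are { t, v } with strictly increasing t.
function compress(points, tolerance) {
  const meta = { method: "linear-interpolation", tolerance, rawCount: points.length };
  if (points.length <= 2) return { points: points.slice(), meta };
  const kept = [points[0]];
  let anchor = 0;
  for (let i = 2; i < points.length; i++) {
    const a = points[anchor];
    const b = points[i];
    let withinTolerance = true;
    for (let j = anchor + 1; j < i; j++) {
      const p = points[j];
      const estimate = a.v + ((b.v - a.v) * (p.t - a.t)) / (b.t - a.t);
      if (Math.abs(estimate - p.v) > tolerance) {
        withinTolerance = false;
        break;
      }
    }
    if (!withinTolerance) {
      // The segment ending at the previous point was still valid, so keep it.
      kept.push(points[i - 1]);
      anchor = i - 1;
    }
  }
  kept.push(points[points.length - 1]);
  return { points: kept, meta };
}

// Rebuild a value at time t from the kept points (NaN outside the range).
function reconstruct(kept, t) {
  for (let i = 1; i < kept.length; i++) {
    const a = kept[i - 1];
    const b = kept[i];
    if (t >= a.t && t <= b.t) {
      return a.v + ((b.v - a.v) * (t - a.t)) / (b.t - a.t);
    }
  }
  return NaN;
}

// A ramp followed by a plateau: only the start, the corner, and the end are needed.
const raw = [0, 1, 2, 3, 4, 4, 4, 4, 4].map((v, t) => ({ t, v }));
const stored = compress(raw, 0.01);
console.log(stored.points.map((p) => p.t), stored.meta);
```

Storing `meta` next to the retained points is one way to keep reconstruction quality auditable: a reader can tell which method and tolerance produced the stored series.
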
### 3.4 Central is the intelligence and API-entry layer
Central may advise and coordinate edge/site layers, but external API requests should not hit field-edge systems directly.
The intended pattern is:
- external and enterprise integrations terminate centrally
- central evaluates, aggregates, authorizes, and advises
- site/edge layers receive mediated requests, policies, or setpoints
- field-edge remains protected behind an intermediate layer
This aligns with the stated security direction.
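
The mediation pattern above can be sketched as a central function that authorizes an external request and turns it into bounded advice for a site, so the caller never addresses field-edge systems directly. The policy shape, client name, setpoint name, and bounds are all hypothetical.

```javascript
// Hypothetical central policy: which clients may request changes, and the
// range within which central is willing to advise each setpoint.
const policy = {
  allowedClients: new Set(["enterprise-planner"]),
  setpoints: new Map([["aeration.do_target", { min: 0.5, max: 3.0 }]]),
};

// Evaluate an external request centrally and return advice for the site layer.
function mediateRequest(request, policy) {
  if (!policy.allowedClients.has(request.clientId)) {
    return { accepted: false, reason: "client not authorized" };
  }
  const bounds = policy.setpoints.get(request.setpoint);
  if (!bounds) {
    return { accepted: false, reason: "unknown setpoint" };
  }
  // Clamp to the advised range; the site keeps final control authority.
  const value = Math.min(bounds.max, Math.max(bounds.min, request.value));
  return {
    accepted: true,
    advice: {
      siteId: request.siteId,
      setpoint: request.setpoint,
      value,
      advisory: true,
      clamped: value !== request.value,
    },
  };
}

const result = mediateRequest(
  { clientId: "enterprise-planner", siteId: "site-a", setpoint: "aeration.do_target", value: 5 },
  policy
);
console.log(result);
```

The advice is marked `advisory` so the receiving site layer can still accept or ignore it under its own rules.
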
### 3.5 Configuration source of truth should be database-backed
The intended configuration authority is the database-backed `tagcodering` model. It already exists, but it is not yet complete enough to serve as the source of truth.
That means the architecture should assume:
- asset and machine metadata belong in `tagcodering`
- Node-RED flows should consume configuration rather than silently becoming the only configuration store
- more work is still needed before this behaves as the intended central configuration backbone
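
A minimal sketch of "flows consume configuration": values from a `tagcodering` record take precedence over values embedded in a flow, and every key records where it came from. The record shape and key names are hypothetical, since the actual `tagcodering` schema is not described here.

```javascript
// Merge flow-embedded configuration with a database record.
// Assumption: both are flat objects; the database value wins when present.
function resolveConfig(flowConfig, dbRecord) {
  const record = dbRecord || {};
  const resolved = {};
  const source = {};
  const conflicts = [];
  const keys = new Set([...Object.keys(flowConfig), ...Object.keys(record)]);
  for (const key of keys) {
    const inDb = record[key] !== undefined;
    resolved[key] = inDb ? record[key] : flowConfig[key];
    source[key] = inDb ? "tagcodering" : "flow";
    // Report keys that are defined in both places with different values.
    if (inDb && flowConfig[key] !== undefined && flowConfig[key] !== record[key]) {
      conflicts.push(key);
    }
  }
  return { resolved, source, conflicts };
}

// Hypothetical pump configuration, for illustration only.
const config = resolveConfig(
  { maxFlow: 100, unit: "m3/h" },
  { maxFlow: 120, supplier: "acme" }
);
console.log(config);
```

Listing conflicts makes duplication between flows and the database visible while both still hold configuration.
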
## 4. Visual Model
### 4.1 Platform topology
```mermaid
flowchart LR
subgraph OT["OT / Field"]
PLC["PLC / IO"]
DEV["Sensors / Machines"]
end
subgraph EDGE["Edge Layer"]
ENR["Edge Node-RED"]
EDB["Local InfluxDB"]
EUI["Local Grafana / Local Monitoring"]
EBR["Optional Local Broker"]
end
subgraph SITE["Site Layer"]
SNR["Site Node-RED / CoreSync"]
SDB["Site InfluxDB"]
SUI["Site Grafana / SCADA Support"]
SBR["Site Broker"]
end
subgraph CENTRAL["Central Layer"]
API["API / Integration Gateway"]
INTEL["Overview Intelligence / Advisory Logic"]
CDB["Central InfluxDB"]
CGR["Central Grafana"]
CFG["Tagcodering Config Model"]
GIT["Gitea"]
CI["CI/CD"]
IAM["IAM / Keycloak"]
end
DEV --> PLC
PLC --> ENR
ENR --> EDB
ENR --> EUI
ENR --> EBR
ENR <--> SNR
EDB <--> SDB
SNR --> SDB
SNR --> SUI
SNR --> SBR
SNR <--> API
API --> INTEL
API <--> CFG
SDB <--> CDB
INTEL --> SNR
CGR --> CDB
CI --> GIT
IAM --> API
IAM --> CGR
```
### 4.2 Command and access boundary
```mermaid
flowchart TD
EXT["External APIs / Enterprise Requests"] --> API["Central API Gateway"]
```
That is substantially more capable than a single central historian model.
### 5.5 `tagcodering` is the right long-term direction
A database-backed configuration authority is stronger than embedding configuration only in flows because it supports:
- machine metadata management
- controlled rollout of configuration changes
- clearer versioning and provenance
- future API-driven configuration services
## 6. Downsides And Risks
### 6.1 Smart storage raises algorithmic and governance complexity
Signal-aware storage and reconstruction is promising, but it creates architectural obligations:
- reconstruction rules must be explicit
- acceptable reconstruction error must be defined per signal type
- operators must know whether they see raw or reconstructed history
- compliance-relevant data may need stricter retention than operational convenience data
Without those rules, smart storage can become opaque and hard to trust.
### 6.2 Multi-level databases can create ownership confusion
If edge, site, and central all store telemetry, you must define:
- which layer is authoritative for which time horizon
- when backfill is allowed
- when data is summarized vs copied
- how duplicates or gaps are detected
Otherwise operations will argue over which trend is "the real one."
### 6.3 Central intelligence must remain advisory-first
Central guidance can be valuable, but a direct closed-loop dependency on central services would be risky.
The architecture should therefore preserve:
- local control authority at edge/site
- bounded and explicit central advice
- safe behavior if central recommendations stop arriving
### 6.4 `tagcodering` is not yet complete enough to lean on blindly
It is the right target, but its current partial state means there is still architecture debt:
- incomplete config workflows
- likely mismatch between desired and implemented schema behavior
- temporary duplication between flows, node config, and database-held metadata
This should be treated as a core platform workstream, not a side issue.
### 6.5 Broker responsibilities are still not crisp enough
The materials reference MQTT, AMQP, RabbitMQ, and generic brokers without a single, stable split of responsibilities. That needs to be resolved before large-scale deployment.
Questions still open:
- command bus or event bus?
- site-only or cross-site?
- telemetry transport or only synchronization/eventing?
Use this document as the architecture baseline. The companion markdown page in `architecture/` can then be shaped into a wiki-ready visual overview page with Mermaid diagrams and shorter human-readable sections.