The Utility Substation vPAC Ready Infrastructure validated solution guide provides design, implementation, and operational guidance for a workload domain that runs vSphere and is configured for maximum performance when working with real-time systems in a power substation.
The technical implementation is constructed and tested by VMware and its partners to help power service providers resolve their common business use cases. VMware validated solutions are operational, performant, reliable, and secure. Each solution contains detailed design, implementation, and operational guidance, along with interoperability information.
Support Matrix
vPAC Ready Infrastructure is compatible with certain versions of the VMware products that are used for implementing the solution.
Product Name | Product Version | Release Notes |
---|---|---|
VMware ESXi | 8.0 Update 2 | |
VMware vCenter Server | 8.0 Update 2 | |
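The support matrix above can be used as a simple pre-deployment check. The following Python sketch compares a hand-maintained inventory against the two versions listed in the table. The inventory format, the exact-match comparison, and the function name `unsupported_components` are illustrative assumptions, not part of any VMware tool or API.

```python
from __future__ import annotations

# Versions copied from the Support Matrix table in this guide.
SUPPORT_MATRIX = {
    "VMware ESXi": "8.0 Update 2",
    "VMware vCenter Server": "8.0 Update 2",
}


def unsupported_components(inventory: dict[str, str]) -> dict[str, str]:
    """Return {product: reason} for each matrix product that is missing or mismatched.

    Assumption: a deployment matches only when the version string is identical
    to the one in the support matrix.
    """
    problems = {}
    for product, required in SUPPORT_MATRIX.items():
        found = inventory.get(product)
        if found is None:
            problems[product] = f"not found (requires {required})"
        elif found != required:
            problems[product] = f"found {found}, requires {required}"
    return problems


if __name__ == "__main__":
    # Hypothetical inventory, typed in by hand for illustration.
    site = {"VMware ESXi": "8.0 Update 1", "VMware vCenter Server": "8.0 Update 2"}
    for product, reason in unsupported_components(site).items():
        print(f"{product}: {reason}")
```

An empty result means every product in the matrix was found at the listed version.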
This guide and its appendices also discuss how additional supporting products, such as VMware vSAN, VMware Tanzu, and VeloCloud SD-WAN, can be activated and integrated to enhance the base components listed in Software Components found in vPAC Ready Infrastructure.
Intended Audience
This guide is intended for both information technology (IT) and operational technology (OT) systems architects and administrators who are already generally familiar with VMware software, and who use it to deploy and manage software-defined substation application architecture running on virtual workloads within a power facility or substation.
This guide provides guidance for capacity, scalability, backup and restoration, and extensibility for disaster recovery support. It is also assumed that the power system protection, automation, control, and telecommunications professionals who are involved in the implementation of vPAC Ready Infrastructure are already generally familiar with networking and IEC 61850/61869 standards.
For the virtualization environment, Appendix C: OT Personnel Training recommends several levels of educational materials for users who are just beginning with the technology.
vPAC Ready Infrastructure Overview
Virtual Protection, Automation, and Control (vPAC) is architected to improve upon traditional grid devices such as microprocessor, solid-state, and electromechanical appliances, which feature mainly fixed functionality. As this new technology becomes prominent, the key benefits are expected to be:
Flexibility in grid operations, to include integration of high penetrations of distributed generation.
Significant increases in data collection and analysis.
Simplified asset management.
Reduction in the quantity of devices to own and maintain.
Safer physical working environments in field locations.
Decreased labor costs in both capital and operations expenditures.
Improvements in standardization and interoperability.
There are three main components considered in the composition of this infrastructure:
Rugged, high-powered computing hardware.
Software-defined protection, automation, and control applications.
Virtualization environment with real-time capabilities.
This guide focuses primarily on the virtualization environment with real-time capabilities, providing guidance to attain the highest levels of persistent performance and availability, coupled with the lowest achievable latency.
Glossary of Terms
The following terminology and product names or features are used throughout this guide.
Terminology | Definition |
---|---|
Aria Operations for Applications | Provides a centralized management platform for consistently operating and securing Kubernetes infrastructure and modern applications across multiple teams and clouds. |
Container | A container encapsulates an application in a form that is portable and easy to deploy. Containers can run without changes on the VMware platform with VMware Tanzu Kubernetes Grid (TKG). They consume resources efficiently, enabling high density within a virtual environment. Although containers can be used with almost any application, they are frequently associated with microservices, in which multiple containers run separate application components or services. The containers that make up microservices are typically coordinated and managed using a container orchestration platform, such as Kubernetes. |
Edge Compute Stack | Edge Compute Stack (ECS) is a portfolio of VMware products tailored to build, run, manage, connect, and protect edge-native applications at the near edge (larger, primary sites) and the far edge (smaller, secondary sites). |
ESXi | A bare-metal hypervisor that installs directly onto a physical server. With direct access to, and control of, underlying resources, VMware ESXi effectively partitions hardware to consolidate applications and cut costs. It is the industry leader for efficient architecture, setting the standard for reliability, performance, and support. |
Harbor Image Registry | Provides a centralized location to push, pull, store, and scan container images used in Kubernetes workloads. It supports storing artifacts such as Helm Charts and includes enterprise-grade features such as Role-Based Access Control (RBAC), retention policies, automated garbage cleanup, and Docker Hub proxying. |
Tanzu Kubernetes Grid (TKG) | Enables the creation and lifecycle management of Kubernetes clusters. Each cluster is a set of nodes running containerized applications. |
Tanzu Mission Control (TMC) | Provides a global view of Kubernetes clusters and allows for centralized policy management across all deployed and attached clusters. |
Tanzu Service Mesh | Provides consistent control and security for microservices, end users, and data, across all clusters and clouds. |
Virtual Machine (VM) | A VM is a compute resource that uses software instead of a physical computer to run programs and deploy applications. One or more virtual machines can run on a physical (VMware ESXi) server or cluster of servers. Each virtual machine runs its own operating system and functions separately from other VMs, even when running on the same physical host. |
vCenter | Advanced server management software that provides a centralized platform for controlling vSphere environments, with visibility across hybrid clouds (from data center to edge). |
vSAN | Shared storage for VMs that works in conjunction with vSphere High Availability (HA) and Distributed Resource Scheduler (DRS). |
vSphere | VMware’s virtualization platform, which aggregates compute infrastructure resources (CPU, storage, and networking) and manages them within a unified operating environment. vSphere encompasses several distinct products and technologies that work together to provide a complete infrastructure for virtualization. |
Terminology | Definition |
---|---|
Generic Object Oriented Substation Event (GOOSE) | GOOSE is a controlled model mechanism in which any format of data (status, value) is grouped into a data set and transmitted within a period of four milliseconds. GOOSE is a communications protocol defined by the IEC 61850 standard, which was originally intended for LAN-restricted traffic in Layer 2. A routable version of the protocol (known as R-GOOSE) is defined within IEC 61850-90-5. |
High-availability Seamless Redundancy (HSR) | HSR is a communications protocol that achieves 0 ms recovery time for network device failures. Participating devices are attached together in a ring topology. Devices not participating in HSR must not be connected to the same network. If the virtualized environment does not participate in the HSR protocol, it requires a special Dual Attached Node (DAN) Network Interface Card (NIC) with two ports specific to the HSR ring traffic. This functionality is commonly referred to as a RedBox or redundancy box feature. The external-facing NIC ports are connected to separate Ethernet managed switches and generate TCP/IP traffic across both NIC ports simultaneously, one in each direction on the ring. Similarly, when two HSR networks are connected, a QuadBox function is required, which is usually applied redundantly to prevent any single points of failure. HSR is defined in IEC 62439-3, Clause 5. |
Intelligent Electronic Device (IED) | IED refers to traditional microprocessor-based protection, automation, and control devices with integrated, multi-function capabilities. |
Merging Unit (MU) | An MU, or Process Interface Unit (PIU), converts analog signals (typically currents and voltages) from the instrument transformers, merges them, and sends them to the protective devices in a standards-based digital output format. |
Manufacturing Message Specification (MMS) | MMS is a client-server protocol used for information exchange between protection, automation, and control devices or applications and higher-level systems (for example, Supervisory Control And Data Acquisition or SCADA) over Ethernet. The MMS protocol is mapped on TCP/IP and enables TCP/IP communications between networked devices to read or write data, read configurations, and exchange files. MMS resides on the station bus and is an ISO 9506 and IEC 61850-8-1 standard. |
Protection, Automation, and Control (PAC) | PAC refers to orchestrated, intelligent, logic systems within a power grid. These systems might be made up of analog, electromechanical, solid-state, or microprocessor devices, or virtual applications. Protection typically refers to dedicated devices (often referred to as relays) or applications used to provide selective high-speed isolation of a power system fault from all sources of generation. These devices operate high-voltage apparatus to segment the grid from an undesirable condition that is detected internally within its designated zone of protection. Original protection algorithm implementations are described as ANSI number functions (for example, 50, 51, and so on). Special requirements are high levels of determinism, real-time or low-latency networking, high availability, and redundancy. vPR is then a software-defined version of a protection relay, provided as an application that operates as part of a virtual machine or within a container-based format. Automation typically refers to devices or applications used to automate power system functionality using a collection of components that monitors and controls high-voltage apparatus, for example, Fault Location, Isolation, and Service Restoration (FLISR). Control typically refers to devices or applications used to provide local and remote operability and to collect and logically provide indication and annunciation through monitoring power system assets, for example, a Remote Telemetry Unit (RTU) or a Human Machine Interface (HMI). vAC is then a software-defined version of either automation or control applications, which is intended to include any non-protection function used to operate the power grid. |
Parallel Redundancy Protocol (PRP) | PRP is a communications protocol that achieves 0 ms recovery time for network device failures. Each participating device is attached to two separate parallel networks as a Dual Attached Node (DAN). Devices can be attached to either network with a single connection as a Singly Attached Node (SAN), but they do not benefit from the PRP redundancy. If the virtualized environment does not participate in the PRP protocol, it requires a special DAN NIC with two ports specific to the PRP traffic. This functionality is commonly referred to as a RedBox or redundancy box feature. The external-facing NIC ports are connected to separate Ethernet managed switches and generate TCP/IP traffic across both NIC ports simultaneously. PRP is defined in IEC 62439-3, Clause 4. |
Precision Time Protocol (PTP) | PTP provides a method to precisely coordinate timestamps throughout a network. Time synchronization is achieved through packets that are transmitted and received in a session between the GNSS-synchronized originating signal at the grandmaster clock and all the subsequent participating devices (ordinary, transparent, and boundary clocks). PTP networks can achieve nanosecond-level synchronization, compared to Network Time Protocol (NTP), which can only achieve millisecond-level synchronization. PTP is part of the IEEE 1588 standard. The Power Profile, or the IEC 61850-9-3 and IEEE C37.238 applications of the standard, are typically used in the power industry, due to the hard-coded requirements used to meet the highest synchronization needs found in PAC systems. PTP messages are commonly found on the process bus, but can also be orchestrated within the station bus, or both. |
Process Bus | Process bus typically refers to the digital transmission of analog measurements or binary signals over Ethernet between the power station apparatus and low-level sensors, and the bay-level protection, automation, and control devices or applications. Process bus is often restricted to Layer 2 network protocols (SV, GOOSE, or PTP). |
Sampled Values | Sampled Values (SV) or Sampled Message Values (SMV) are current and voltage signals from instrument transformers that are digitized and then communicated using an Ethernet-based Local Area Network (LAN). Sampled Values are transmitted as high-speed streams of data set samples encoded in multicast Ethernet frames. The protocol uses a publisher/subscriber model, in which a publisher transmits unacknowledged data to subscribers. SV/SMV is a Layer 2 protocol and typically resides on the process bus. SV/SMV is defined in the IEC 61850-9-2 standard. Several standards are typically used for publishing SVs. Bandwidth usage is high, at approximately 5.3 Mbps for 4.8 kHz and 13.5 Mbps for 14.4 kHz sample rates. PRP, VLANs, and QoS are used to help ensure reliable transmission of the packets. Much like GOOSE, SV is another communications protocol defined by the IEC 61850 standard, originally intended for LAN-restricted traffic in Layer 2. A routable version of the protocol (known as R-SV) has been defined within IEC 61850-90-5. |
Station Bus | Station bus typically refers to the digital transmission of analog or binary data over Ethernet between the bay-level protection, automation, and control devices or applications and the power station-level supervisory or management systems and applications. Station bus often includes up to Layer 3 network protocols (including MMS). |
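The per-stream bandwidth figures in the Sampled Values entry above can be turned into a rough capacity estimate for a process bus link. The following Python sketch uses only the two figures quoted in this guide (approximately 5.3 Mbps at 4.8 kHz and 13.5 Mbps at 14.4 kHz). The 1 Gbps link speed, the 50% share of the link reserved for SV traffic, and the function names are illustrative assumptions, not recommendations from this guide.

```python
# Approximate per-stream Sampled Values bandwidth quoted in this guide.
SV_STREAM_MBPS = {4800: 5.3, 14400: 13.5}  # sample rate (Hz) -> Mbps per stream


def bits_per_sample(sample_rate_hz: int) -> float:
    """Average on-the-wire bits per sample implied by the quoted figures."""
    return SV_STREAM_MBPS[sample_rate_hz] * 1_000_000 / sample_rate_hz


def max_streams(sample_rate_hz: int, link_mbps: float, utilization: float) -> int:
    """Number of SV streams that fit in the share of a link reserved for SV.

    link_mbps and utilization are planning assumptions supplied by the caller.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    budget_mbps = link_mbps * utilization
    return int(budget_mbps // SV_STREAM_MBPS[sample_rate_hz])


if __name__ == "__main__":
    # Assumed: 1 Gbps link with half of its capacity budgeted for SV streams.
    for rate_hz in SV_STREAM_MBPS:
        print(
            f"{rate_hz} Hz: about {bits_per_sample(rate_hz):.0f} bits per sample, "
            f"up to {max_streams(rate_hz, 1000.0, 0.5)} streams"
        )
```

Under these assumed values, the budget holds up to 94 streams at 4.8 kHz or 37 streams at 14.4 kHz. GOOSE, PTP, and other traffic sharing the link would reduce that headroom.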
Acronyms and Definitions
This section lists the acronyms used frequently in this reference architecture guide.
Acronym | Definition |
---|---|
ECS | Edge Compute Stack |
LCM | Lifecycle Management |
TKG | Tanzu Kubernetes Grid |
TMC | Tanzu Mission Control |
SDDC | Software-Defined Data Center |
VVS | VMware Validated Solution |
vPR | Virtual Protection Relay |
vAC | Virtual Automation and Control |
GOOSE | Generic Object Oriented Substation Event |
HSR | High-availability Seamless Redundancy |
HMI | Human Machine Interface |
MU | Merging Unit |
PIU | Process Interface Unit |
MMS | Manufacturing Message Specification |
PAC | Protection, Automation, and Control |
PRP | Parallel Redundancy Protocol |
PTP | Precision Time Protocol |
CSP | Common Substation Platform |
PCR | Platform Configuration Registers |
AK | Attestation Key |
TPM | Trusted Platform Module |
CRB | Command Response Buffer |
BES | Bulk Electric System |
NIST | National Institute of Standards and Technology |
KEK | Key Encryption Key |
DEK | Data Encryption Key |
EACMS | Electronic Access Control or Monitoring Systems |
PACS | Physical Access Control System |
SCI | Shared Cyber Infrastructure |
VCA | Virtual Cyber Asset |
SIEM | Security Information and Event Management |
ATP | Advanced Threat Prevention |