The appendix aggregates all design decisions of the Site Protection and Disaster Recovery for VMware Cloud Foundation validated solution. You can use this design decision list as a reference for the end state of the environment, and to track your level of adherence to the design together with the justification for any deviations.
Deployment Specification

| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SPR-SRM-CFG-001 | Deploy Site Recovery Manager as a virtual appliance. | Allows you to orchestrate the recovery of the VMware Cloud Foundation management components in another VMware Cloud Foundation instance. | None. |
| SPR-SRM-CFG-002 | Deploy each Site Recovery Manager instance in the management domain. | Provides a consistent deployment model for all management applications. | None. |


| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SPR-SRM-CFG-003 | Deploy the Site Recovery Manager virtual appliance using the Light deployment type. | Provides the highest level of availability by protecting the management components. This size further accommodates the following setup: | None. |


| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SPR-SRM-CFG-004 | Use vSphere Replication in Site Recovery Manager as the protection method for virtual machine replication. | | |


| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SPR-VR-CFG-001 | Deploy each vSphere Replication appliance in the vCenter Server it will be registered with. | vSphere Replication must be deployed in the vCenter Server it is registered with because it discovers the certificate thumbprint through the OVF environment during the OVF deployment. | None. |
| SPR-VR-CFG-002 | Deploy each vSphere Replication appliance using the 4 vCPU size. | Accommodates the replication of the expected number of virtual machines that are a part of the following components: | None. |

Network Design

| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SPR-SRM-NET-001 | Place the Site Recovery Manager instances on the management network. | Places Site Recovery Manager on the same network as the VMware Cloud Foundation components that the appliance must communicate with. | None. |
| SPR-VR-NET-001 | Place the vSphere Replication instances on the management network. | Places vSphere Replication on the same network as the VMware Cloud Foundation components that the appliance must communicate with. | None. |


| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SPR-SRM-NET-002 | Allocate and assign a static IP address to the Site Recovery Manager instances. | Using assigned IP addresses removes the constraints and risks associated with providing and managing DHCP on your management networks. | The use of static IP addresses requires precise IP address management. |
| SPR-VR-NET-002 | Allocate and assign a static IP address to the vSphere Replication instances. | Using assigned IP addresses removes the constraints and risks associated with providing and managing DHCP on your management networks. | The use of static IP addresses requires precise IP address management. |

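
The implication of the static IP decisions above is that address assignments have to be tracked precisely. As a minimal sketch, assuming you keep the assignments in a simple FQDN-to-address map, the following Python function flags addresses that fall outside the management subnet or that are assigned more than once. The function name, the map layout, and every address and host name you pass to it are hypothetical and not part of the validated solution.

```python
import ipaddress
from collections import Counter


def audit_static_ips(assignments, management_cidr):
    """Return a list of problems found in a {fqdn: ip} assignment map.

    Hypothetical helper: checks only subnet membership and duplicates.
    """
    network = ipaddress.ip_network(management_cidr)
    problems = []

    # Every appliance address must belong to the management network.
    for fqdn, address in assignments.items():
        if ipaddress.ip_address(address) not in network:
            problems.append(f"{fqdn}: {address} is outside {network}")

    # A static address must not be handed out twice.
    for address, count in Counter(assignments.values()).items():
        if count > 1:
            problems.append(f"{address} is assigned {count} times")

    return problems
```

An empty result means that the recorded assignments are internally consistent. It does not prove that the addresses are free on the network.
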

| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SPR-SRM-NET-003 | Configure both forward (A) and reverse (PTR) DNS records for each Site Recovery Manager instance. | Site Recovery Manager is accessible using a fully qualified domain name. | |
| SPR-VR-NET-003 | Configure both forward (A) and reverse (PTR) DNS records for each vSphere Replication instance. | vSphere Replication is accessible using a fully qualified domain name. | |

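
The forward and reverse record requirement above can be checked before you deploy the appliances. The following Python sketch resolves an FQDN to an address and then resolves that address back to a name, using the standard library `socket` module. The function name is hypothetical, and the resolvers are passed in as parameters only so that the check can be exercised without a live DNS server.

```python
import socket


def check_dns_records(fqdn, forward=socket.gethostbyname,
                      reverse=socket.gethostbyaddr):
    """Return True when the A record and the PTR record agree for fqdn."""
    try:
        ip_address = forward(fqdn)           # forward (A) lookup
        ptr_name = reverse(ip_address)[0]    # reverse (PTR) lookup
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return False
    # Compare case-insensitively and ignore a trailing root dot.
    return ptr_name.rstrip(".").lower() == fqdn.rstrip(".").lower()
```

Calling `check_dns_records` with the FQDN of a Site Recovery Manager or vSphere Replication instance returns `False` when either record is missing or when the two records disagree.
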

| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SPR-SRM-NET-004 | Configure the Site Recovery Manager instances to use NTP servers rather than using VMware Tools to synchronize with the ESXi hosts on which they are running. | | |
| SPR-VR-NET-004 | Configure the vSphere Replication instances to use NTP servers rather than using VMware Tools to synchronize with the ESXi hosts on which they are running. | | |

Life Cycle Management Design

| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SPR-SRM-LCM-001 | Life cycle management of Site Recovery Manager is provided using the native tools in the appliance. | Site Recovery Manager is not managed by SDDC Manager. | Deployment, patching, updates, and upgrades of Site Recovery Manager are performed without native automation. |
| SPR-VR-LCM-001 | Life cycle management of vSphere Replication is provided using the native tools in the appliance. | vSphere Replication is not managed by SDDC Manager. | Deployment, patching, updates, and upgrades of vSphere Replication are performed without native automation. |

Information Security and Access Design

| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SPR-SRM-SEC-001 | Configure a service account in vCenter Server for application-to-application communication from Site Recovery Manager to vSphere. This user account must be a member of the vCenter Single Sign-On administrator group. | Provides the following access control features: | You must maintain the service account's life cycle outside of VMware Cloud Foundation to ensure its availability. |
| SPR-VR-SEC-001 | Configure a service account in vCenter Server for application-to-application communication from vSphere Replication to vSphere. This user account must be a member of the vCenter Single Sign-On administrator group. | Provides the following access control features: | You must maintain the service account's life cycle outside of VMware Cloud Foundation to ensure its availability. |
| SPR-VC-SEC-001 | Use global permissions when you create the Site Recovery Manager and vSphere Replication service accounts in vCenter Server. | Simplifies and standardizes the deployment of the service account across all vCenter Server instances in the same vSphere domain. | All vCenter Server instances must be in the same vSphere domain. |


| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SPR-SRM-SEC-002 | Replace the default self-signed certificate in each Site Recovery Manager instance with a CA-signed certificate. | Ensures that all communication to the externally facing Web UI of Site Recovery Manager and cross-product communication are encrypted. | You must have access to a Public Key Infrastructure (PKI) to acquire certificates. |
| SPR-VR-SEC-002 | Replace the default self-signed certificate in each vSphere Replication instance with a CA-signed certificate. | Ensures that all communication to the externally facing Web UI for vSphere Replication and cross-product communication are encrypted. | You must have access to a Public Key Infrastructure (PKI) to acquire certificates. |

Recovery Plan Design

| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SPR-SRM-CFG-005 | Use Site Recovery Manager and vSphere Replication together to automate the recovery of the following management components: | | None. |


| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SPR-VR-CFG-003 | Do not activate guest OS quiescing in the policies for the management virtual machines in vSphere Replication. | Not all management virtual machines support the use of guest OS quiescing. Using the quiescing operation might result in an outage. | The replicas of the management virtual machines that are stored in the target VMware Cloud Foundation instance are crash-consistent rather than application-consistent. |
| SPR-VR-CFG-004 | Activate network compression on the management virtual machine policies in vSphere Replication. | | To perform compression and decompression of data, the vSphere Replication VM might require more CPU resources on the source site as more virtual machines are protected. |
| SPR-VR-CFG-005 | Configure a recovery point objective (RPO) of 15 minutes on the management virtual machine policies in vSphere Replication. | | Any changes that are made up to 15 minutes before a disaster recovery event are lost. |
| SPR-VR-CFG-006 | Configure point-in-time (PIT) instances, keeping 3 copies over a 24-hour period on the management virtual machine policies in vSphere Replication. | Ensures application integrity for the management application that is failing over after a disaster recovery event occurs. | Increasing the number of retained recovery point instances increases the disk usage on the vSAN datastore. |

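
The four replication policy decisions above form one set of expected values, which makes them easy to compare against what is actually configured when you track adherence to the design. The following Python sketch models those values and lists any deviations. The `ReplicationPolicy` class, its field names, and the `deviations` function are hypothetical; they do not correspond to a vSphere Replication API, and the sketch assumes you collect the configured values by other means.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ReplicationPolicy:
    """Hypothetical record of one replication policy's settings."""
    guest_quiescing: bool
    network_compression: bool
    rpo_minutes: int
    pit_instances_per_day: int
    pit_days: int


# Expected values taken from the design decisions in this appendix.
DESIGN_POLICY = ReplicationPolicy(
    guest_quiescing=False,     # SPR-VR-CFG-003
    network_compression=True,  # SPR-VR-CFG-004
    rpo_minutes=15,            # SPR-VR-CFG-005
    pit_instances_per_day=3,   # SPR-VR-CFG-006
    pit_days=1,                # SPR-VR-CFG-006 (24-hour period)
)


def deviations(actual, expected=DESIGN_POLICY):
    """Return one message per setting that differs from the design."""
    return [
        f"{field.name}: expected {getattr(expected, field.name)!r}, "
        f"found {getattr(actual, field.name)!r}"
        for field in fields(expected)
        if getattr(actual, field.name) != getattr(expected, field.name)
    ]
```

An empty list means that the policy matches the design. Each returned message identifies a setting for which a deviation should be justified.
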

| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SPR-SRM-RP-001 | Use a prioritized startup order for vRealize Suite Lifecycle Manager and the clustered Workspace ONE Access nodes. | | You must have VMware Tools running on vRealize Suite Lifecycle Manager and each of the clustered Workspace ONE Access nodes. |
| SPR-SRM-RP-002 | Use a prioritized startup order for vRealize Operations Manager analytics cluster nodes. | Ensures that the individual nodes in the vRealize Operations Manager analytics cluster are started in such an order that the operational monitoring services are restored after a disaster. | |
| SPR-SRM-RP-003 | Use a prioritized startup order for vRealize Operations Manager remote collector nodes. | | |
| SPR-SRM-RP-004 | Use a prioritized startup order for vRealize Automation nodes. | | You must have VMware Tools installed and running on each vRealize Automation node. |

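
A prioritized startup order, as called for in the decisions above, amounts to grouping virtual machines by priority and starting one group at a time. The following Python sketch illustrates that grouping, assuming a numeric priority where 1 starts first. The function name, the node names, and the priority assignments are hypothetical; this appendix does not specify the priority of each node, and the sketch does not call Site Recovery Manager.

```python
from itertools import groupby


def startup_batches(vm_priorities):
    """Group VM names into batches, lowest priority number first.

    vm_priorities maps a VM name to its priority (1 starts first).
    """
    # Sort by priority, then by name so each batch is deterministic.
    ordered = sorted(vm_priorities.items(), key=lambda item: (item[1], item[0]))
    return [
        [name for name, _ in group]
        for _, group in groupby(ordered, key=lambda item: item[1])
    ]
```

For example, a map that places `node-a` and `node-d` at priority 1 and `node-b` and `node-c` at priority 2 yields two batches, with the priority 1 nodes in the first batch.
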

| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SPR-SRM-RP-005 | Do not run test recovery of recovery plans. | Because the protected applications use an NSX load balancer, it is not possible to bring the applications online in an isolated test network. DNS resolution is also unavailable in an isolated test network. | You cannot test disaster recovery without impacting the running production applications. |

Failover Design for SDDC Management Components

| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SPR-DNS-NET-001 | In an environment with multiple VMware Cloud Foundation instances, configure the DNS settings for each protected component to use the DNS servers across all VMware Cloud Foundation instances. | Each protected component can resolve names by using an available DNS server during a planned migration or disaster recovery between VMware Cloud Foundation instances. | As you scale from a single VMware Cloud Foundation instance to multiple VMware Cloud Foundation instances, you must update the DNS settings on each protected component. |


| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SPR-NTP-NET-001 | In an environment with multiple VMware Cloud Foundation instances, configure the NTP settings for each protected component to use the NTP servers across all VMware Cloud Foundation instances. | Each protected component can synchronize time with an available NTP server during a planned migration or disaster recovery between VMware Cloud Foundation instances. | As you scale from a single VMware Cloud Foundation instance to multiple VMware Cloud Foundation instances, you must update the NTP settings on each protected component. |

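
Decisions SPR-DNS-NET-001 and SPR-NTP-NET-001 both require that each protected component lists the servers of every VMware Cloud Foundation instance. As a minimal sketch, assuming you can export the configured server lists, the following Python function reports which servers each component is missing. The function name, the input layout, and all names and addresses you pass to it are hypothetical. The same function works for DNS and for NTP server lists.

```python
def missing_servers(component_servers, instance_servers):
    """Return {component: [missing servers]} for incomplete components.

    component_servers: {component name: [servers configured on it]}
    instance_servers:  {instance name: [servers provided by it]}
    """
    # The design expects the union of the servers from all instances.
    required = set()
    for servers in instance_servers.values():
        required.update(servers)

    report = {}
    for component, configured in component_servers.items():
        missing = required - set(configured)
        if missing:
            report[component] = sorted(missing)
    return report
```

An empty dictionary means that every protected component already references the servers of all instances. This is the check to repeat when you scale from one instance to several.
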
Solution Interoperability

| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SPR-VROPS-CFG-001 | Install the Site Recovery Manager management pack for vRealize Operations Manager. | Establishes the communication between vRealize Operations Manager and VMware Site Recovery Manager endpoints. | You must install the management pack manually. |
| SPR-VROPS-CFG-002 | Install the vSphere Replication management pack for vRealize Operations Manager. | Establishes the communication between vRealize Operations Manager and VMware vSphere Replication endpoints. | You must install the management pack manually. |
| SPR-VROPS-CFG-003 | Configure the following endpoints to use the remote collector group: | Local-instance components are configured to use the remote collector group. This offloads data collection for local management components from the analytics cluster. | None. |


| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SPR-SRM-SEC-003 | Configure a service account in vCenter Server with global permissions, for application-to-application communication from the Site Recovery Manager adapters in vRealize Operations Manager to vSphere and Site Recovery Manager, and assign the Read Only role. | Provides the following access control features: | You must maintain the life cycle and availability of the service account outside of the SDDC stack. |
| SPR-VR-SEC-003 | Configure a service account in vCenter Server with global permissions, for application-to-application communication from the vSphere Replication adapters in vRealize Operations Manager to vSphere and vSphere Replication, and assign the VRM replication viewer role. | Provides the following access control features: | You must maintain the life cycle and availability of the service account outside of the SDDC stack. |


| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SPR-SRM-LOG-001 | When using Site Recovery Manager, install and configure the vRealize Log Insight agent on the Site Recovery Manager appliance. | Simplifies configuration of log sources in the SDDC that are packaged with the vRealize Log Insight agent. | You must configure the vRealize Log Insight agent to forward logs to the vRealize Log Insight VIP. |
| SPR-SRM-LOG-002 | Configure the vRealize Log Insight agent to transmit logs from the Site Recovery Manager instance to the adjacent vRealize Log Insight in the VMware Cloud Foundation instance using the vRealize Log Insight ingestion API, | Ensures the transmission of logs from the Site Recovery Manager instance to the adjacent vRealize Log Insight by using the ingestion API. | This configuration is unencrypted. To ensure that the transmission of logs from the Site Recovery Manager instance is encrypted using TLS, you must update the configuration on the Site Recovery Manager instance to send logs to vRealize Log Insight by using the ingestion API, |
| SPR-SRM-LOG-003 | When using vSphere Replication, install and configure the vRealize Log Insight agent on the vSphere Replication appliance. | The vRealize Log Insight agent is required to collect and transfer logs to the vRealize Log Insight instances. | You must configure the vRealize Log Insight agent to forward logs to the vRealize Log Insight VIP. |
| SPR-SRM-LOG-004 | Configure the vRealize Log Insight agent to transmit logs from the vSphere Replication instance to the adjacent vRealize Log Insight in the VMware Cloud Foundation instance using the vRealize Log Insight ingestion API, | Ensures the transmission of logs from the vSphere Replication instance to the adjacent vRealize Log Insight by using the ingestion API. | This configuration is unencrypted. To ensure that the transmission of logs from the vSphere Replication instance is encrypted using TLS, you must update the configuration on the vSphere Replication instance to send logs to vRealize Log Insight by using the ingestion API, |
| SPR-SRM-LOG-005 | Configure a dedicated Photon OS agent group and assign the Site Recovery Manager and vSphere Replication FQDNs. | | Adds minimal load to vRealize Log Insight. |


| Decision ID | Design Decision | Design Justification | Design Implication |
|---|---|---|---|
| SPR-VRLI-NET-001 | For all applications that are capable of failing over between VMware Cloud Foundation instances, such as vRealize Automation and vRealize Operations Manager, when you configure logging, use the FQDN of the vRealize Log Insight ILB in the protected instance. | Logging continues during a partial failover to a recovery VMware Cloud Foundation instance. | |