Do all platform VMs have to be on the same L2/L3 segment?

No. However, it is best to keep all platform nodes on a common network with low latencies between nodes. This is because many of the distributed components replicate data among the nodes and high latencies can cause system performance and stability issues.

Can a cluster be upgraded using the in-product upgrade feature?

Online upgrades are not supported for clusters up to and including release 3.7. Starting with release 3.8, a cluster can be upgraded using the online upgrade method.

What happens if there is a failure during the cluster creation process?

It is a best practice to take snapshots of the primary platform and the collector (proxy) VMs before starting the cluster creation process. If there is a failure, delete the secondary platform nodes and recover the primary platform and collector VMs from the snapshots.

What happens to the existing data and configuration when I expand a single-node deployment to a cluster?

All data and configuration are maintained without any change. The data remains accessible after cluster creation.

Can platform VMs be in different regions?

No. The platform nodes must be co-located in the same site. The collector VMs can be geo-distributed.

Can the platform be hosted on vSAN stretched clusters (2 datacenters …)?

Yes. vSAN clusters within the same datacenter or across datacenters still ensure I/O performance similar to local storage.

Can we host cluster nodes on different vSAN clusters?

Yes. Different nodes of a platform cluster can be hosted on different underlying datastores.

Do you need to back up platform nodes?

Yes, backups must be taken using VMware recommended snapshot/backup technologies.

How do I estimate the bandwidth between a collector VM in one region and the platform VM cluster in another region?

In some large deployments, we have seen this number range from 1 Mbps to 20 Mbps. A good deal of deduplication or compression happens in the collector VM before data is sent to the platform VM.

How much network traffic is there between cluster nodes?

Traffic usually depends on the size of the cluster and the type of datacenter environment.

For installations with 30-50k VMs:
  • Between cluster nodes: approx. 50-400 Mbps
  • Between collector and platform: approx. 100 Kbps-15 Mbps
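
As a rough planning aid, the approximate figures above can be compared against the capacity of a link you intend to use. The following is a minimal sketch, not a sizing tool: the function name, the path keys, and the three result labels are hypothetical, and the ranges are only the approximate numbers quoted above for installations with 30-50k VMs.

```python
# Approximate traffic ranges quoted above for installations with 30-50k VMs.
# Values are (low, high) in Mbps; 100 Kbps is written as 0.1 Mbps.
# The dictionary keys are hypothetical labels chosen for this sketch.
QUOTED_RANGES_MBPS = {
    "cluster_traffic": (50.0, 400.0),
    "collector_to_platform": (0.1, 15.0),
}


def classify_link(path: str, link_capacity_mbps: float) -> str:
    """Compare a link's capacity with the quoted approximate range for a path."""
    low, high = QUOTED_RANGES_MBPS[path]
    if link_capacity_mbps >= high:
        return "covers quoted peak"
    if link_capacity_mbps >= low:
        return "within quoted range"
    return "below quoted range"


if __name__ == "__main__":
    for path, capacity in [("cluster_traffic", 1000.0),
                           ("cluster_traffic", 100.0),
                           ("collector_to_platform", 20.0)]:
        print(f"{path} at {capacity} Mbps: {classify_link(path, capacity)}")
```

Because actual traffic depends on the size of the cluster and the datacenter environment, treat the result as a first check only and measure real traffic where possible.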

What is the maximum admissible latency between nodes in a cluster?

The platform nodes must be co-located in the same site, so the latency between them is minimal. If the platform nodes are hosted on vSAN stretched clusters (two data centers), the vSAN clusters within or across the data centers still ensure I/O performance similar to local storage, and applications running in those data centers, such as vRealize Network Insight, work fine. You can host different nodes of a platform cluster on different underlying datastores, but all the platform VMs in a cluster must be co-located in the same site.

What is the maximum admissible latency between the collector VMs in one region and the platform VM cluster in another region?

You can have geo-distributed collector (proxy) VMs in your setup. The collector VM connects to the platform VM over HTTPS, so the connection can tolerate high latencies, on the order of a few seconds. vRealize Network Insight supports a maximum of 10 nodes in a cluster (30,000 VMs with flows or 50,000 VMs without flows).
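
The limits stated above can be turned into a quick sanity check for a planned deployment. This is a minimal sketch: the function and constant names are hypothetical, and it assumes the VM counts are overall ceilings for a cluster, which is how the answer above reads.

```python
# Limits taken from the answer above; the names are hypothetical.
MAX_CLUSTER_NODES = 10
MAX_VMS_WITH_FLOWS = 30_000
MAX_VMS_WITHOUT_FLOWS = 50_000


def check_cluster_plan(node_count: int, vm_count: int, flows_enabled: bool) -> list[str]:
    """Return a list of problems with a planned cluster; an empty list means none found."""
    problems = []
    if node_count > MAX_CLUSTER_NODES:
        problems.append(
            f"{node_count} nodes exceeds the maximum of {MAX_CLUSTER_NODES}"
        )
    # Assumption: the VM limit depends only on whether flows are collected.
    vm_limit = MAX_VMS_WITH_FLOWS if flows_enabled else MAX_VMS_WITHOUT_FLOWS
    if vm_count > vm_limit:
        problems.append(
            f"{vm_count} VMs exceeds the maximum of {vm_limit}"
        )
    return problems


if __name__ == "__main__":
    print(check_cluster_plan(node_count=10, vm_count=40_000, flows_enabled=True))
```

The check covers only the two limits mentioned here. Node sizing itself is covered by the installation guide.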

What size should the collector and platform VMs be?

Use the large brick configuration. Refer to the installation guide for details.