If you are deploying VMware Telco Cloud Service Assurance in the High Availability (HA) mode, ensure that your system meets the following footprint and deployment scaling requirements.

Footprint Specification

The following tables specify the requirements for deploying different footprints, including the number of VMs required for each type of cluster, which is particularly important when HA is required. For each virtual machine, the tables provide the number of virtual CPUs, main memory (RAM), and total disk size.

Table 1. Footprint for VMware Tanzu Kubernetes Grid Management Cluster
Footprint Size | Number of VMs | vCPU Per VM | RAM Per VM (GB) | Role
25 K  | 3 | 2 | 8  | Control Plane Node
25 K  | 2 | 2 | 8  | Worker Node
50 K  | 3 | 4 | 16 | Control Plane Node
50 K  | 2 | 4 | 16 | Worker Node
100 K | 3 | 4 | 16 | Control Plane Node
100 K | 2 | 4 | 16 | Worker Node
Note: The table shows the VMware Tanzu Kubernetes Grid management cluster sizing for deployments when a dedicated VMware Tanzu Kubernetes Grid management cluster is used for the VMware Tanzu Kubernetes Grid workload cluster in VMware Telco Cloud Service Assurance. To size deployments when multiple workload clusters are managed by a single management cluster, see VMware Tanzu Kubernetes Grid Documentation.
Table 2. Footprint for VMware Tanzu Kubernetes Grid Workload Cluster
Footprint Size | Number of VMs | vCPU Per VM | RAM Per VM (GB) | Local Disk Per VM (GB) | Total Persistent Volume Storage (TB) for 1W Retention Interval | Role
25 K  | 3  | 2  | 8  | 50  | NA   | Control Plane Node
25 K  | 9  | 16 | 64 | 200 | 13.5 | Worker Node
50 K  | 3  | 4  | 16 | 50  | NA   | Control Plane Node
50 K  | 14 | 16 | 64 | 200 | 19   | Worker Node
100 K | 3  | 4  | 16 | 50  | NA   | Control Plane Node
100 K | 20 | 16 | 64 | 200 | 35.5 | Worker Node
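To see what a footprint adds up to at the cluster level, the per-VM figures in Table 2 can be summed across nodes. The following is a minimal sketch; the `WORKLOAD_FOOTPRINTS` mapping and function names are illustrative, not part of the product.

```python
# Hypothetical sketch: total vCPU and RAM for a Tanzu Kubernetes Grid
# workload cluster footprint, derived from the per-VM figures in Table 2.
WORKLOAD_FOOTPRINTS = {
    # footprint: (cp VMs, cp vCPU, cp RAM GB, worker VMs, w vCPU, w RAM GB)
    "25K": (3, 2, 8, 9, 16, 64),
    "50K": (3, 4, 16, 14, 16, 64),
    "100K": (3, 4, 16, 20, 16, 64),
}

def cluster_totals(footprint: str) -> tuple[int, int]:
    """Return (total vCPU, total RAM in GB) across all cluster VMs."""
    cp_n, cp_cpu, cp_ram, w_n, w_cpu, w_ram = WORKLOAD_FOOTPRINTS[footprint]
    return (cp_n * cp_cpu + w_n * w_cpu, cp_n * cp_ram + w_n * w_ram)

print(cluster_totals("50K"))  # (236, 944)
```

For example, the 50 K footprint requires 3 × 4 + 14 × 16 = 236 vCPUs and 3 × 16 + 14 × 64 = 944 GB of RAM in total.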
Table 3. Footprint for AKS Workload Cluster
Footprint Size | Number of VMs | vCPU Per VM | RAM Per VM (GB) | Local Disk Per VM (GB) | Total Persistent Volume Storage (TB) for 1W Retention Interval
25 K  | 9  | 16 | 64 | 200 | 13.5
50 K  | 14 | 16 | 64 | 200 | 19
100 K | 20 | 16 | 64 | 200 | 35.5
Note: By default in AKS, the first three worker nodes can also serve as the control plane nodes.

For the 25 K, 50 K, and 100 K footprints, the recommended AKS VM size is Standard_D16s_v3.

Raw Metric Retention Interval

The following table provides the total persistent volume storage in terabytes (TB) required for the 25 K, 50 K, and 100 K footprints with raw metrics, based on a retention period that varies from 1 week through 7 weeks.
Note: The default raw metric retention interval is 1 week and is configurable. If you change the retention interval, the persistent volume storage must change accordingly.
PV Storage (TB) by Raw Metric Retention Interval in Weeks
Footprint | 1W | 2W | 3W | 4W | 5W | 6W | 7W
25 K  | 13.5 | 13.5 | 13.5 | 19.2 | 19.2 | 19.2 | 19.2
50 K  | 19   | 24.2 | 24.2 | 29.6 | 29.6 | 36   | 36
100 K | 35.5 | 40.5 | 45.8 | 51.5 | 56.8 | 62.5 | 69

For example, if the raw metric retention period is 4 weeks for a 50 K footprint, ensure that 29.6 TB of persistent volume storage is provisioned, and specify the 4-week retention period at deployment time.
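The retention table above is a straightforward lookup. The following sketch encodes it for convenience; the `PV_STORAGE_TB` mapping and function name are illustrative, not part of the product.

```python
# Hypothetical helper: persistent volume storage (TB) required for a given
# footprint and raw-metric retention interval, per the retention table above.
PV_STORAGE_TB = {
    # footprint: storage for 1W..7W retention
    "25K": [13.5, 13.5, 13.5, 19.2, 19.2, 19.2, 19.2],
    "50K": [19, 24.2, 24.2, 29.6, 29.6, 36, 36],
    "100K": [35.5, 40.5, 45.8, 51.5, 56.8, 62.5, 69],
}

def required_pv_storage(footprint: str, retention_weeks: int) -> float:
    """Return the required persistent volume storage in TB."""
    if not 1 <= retention_weeks <= 7:
        raise ValueError("retention interval must be 1 through 7 weeks")
    return PV_STORAGE_TB[footprint][retention_weeks - 1]

print(required_pv_storage("50K", 4))  # 29.6
```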

Performance and Scalability for Different Deployments

The following table provides sample managed capacity for each footprint.
Footprint | Small (HA) 25 K | Small-Medium (HA) 50 K | Medium (HA) 100 K
Number of devices | 25 K | 50 K | 100 K
Number of unique events or notifications per day | 25 K | 50 K | 100 K
Number of metrics per five minutes | 10 million | 20 million | 40 million
Number of routers or switches | 15 K | 30 K | 60 K
Managed P and I | 300 K | 600 K | 1.2 million
Number of hosts | 2 K | 4 K | 8 K
Number of VMs | 5 K | 10 K | 20 K
Number of CNFs | 10 K | 10 K | 10 K
Number of pods | 10 K | 10 K | 10 K
Total number of events (number of devices * 4 + external events) | 105 K | 205 K | 500 K
Number of raw metrics from Domain Manager metric collector per five-minute polling interval | 9 million | 14 million | 29 million
Number of Cisco ACI control clusters supported | 10 | 10 | 10
Number of IPSLA routers discovered per collector | 100 | 100 | 100
Number of Analytics jobs supported | 10 | 10 | 10
Number of Alarming jobs supported | 10 | 10 | 10
Total number of alerts supported per minute | 100 | 100 | 100
Kafka to Kafka collector metrics per five-minute polling interval | 1 million | 6 million | 11 million
Total number of concurrent APIs | 100 | 100 | 100
Number of concurrent users | 10 | 10 | 10
Total number of users | 200 | 200 | 200
Maximum number of events from VMware vROps to VMware Telco Cloud Service Assurance per five-minute polling interval | 6 K | 6 K | 6 K
Number of notifications processed per second | 350 | 450 | 450
Data synchronization of topology in VMware Telco Cloud Service Assurance UI | 6 minutes | 8 minutes | 10 minutes
Number of metrics that can be exported to external Kafka per five-minute polling interval | 10 million | 20 million | 40 million
Bandwidth utilization for storage traffic | 33 Mbps | 65 Mbps | 135 Mbps
Total Disk IOPS (Read + Write) | 1000 | 2000 | 4000
Native traffic flow metrics | 1.5 K | 2.5 K | 5 K
Note: The native traffic flow scale support in the table assumes that no other source of metric data is flowing into VMware Telco Cloud Service Assurance. If other sources of metric data are flowing in, reduce the amount of traffic flow data proportionally.
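The table's event-sizing row can be checked arithmetically. The following sketch applies the stated formula; the external-event counts are back-calculated from the table's totals and are assumptions, not documented inputs.

```python
# Illustrative check of the event-sizing formula from the table above:
#   total events = number of devices * 4 + external events
def total_events(devices: int, external_events: int) -> int:
    return devices * 4 + external_events

# External-event counts below are inferred from the table's totals (assumption).
print(total_events(25_000, 5_000))    # 105000, the 25 K footprint total
print(total_events(50_000, 5_000))    # 205000, the 50 K footprint total
print(total_events(100_000, 100_000)) # 500000, the 100 K footprint total
```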