Consider all applicable scalability factors when configuring your vRealize Automation system.

Users

The vRealize Automation appliance is configured for syncing fewer than 100,000 users. If you need to sync more than 100,000 users, increase the appliance memory by 2 GB.

Concurrent Provisions Scalability

By default, vRealize Automation processes only two concurrent provisions per endpoint. For information about increasing this limit, see Configuring Concurrent Machine Provisioning.

VMware recommends that all deployments start with at least two DEM-Workers. In 6.x, each DEM-Worker could process 15 workflows concurrently. In 7.0, this limit increased to 30.

If machines are customized through Workflow Stubs, plan for one DEM-Worker per 20 machines that will be provisioned concurrently. For example, a system supporting 100 concurrent provisions should have a minimum of 5 DEM-Workers.
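The sizing rule above is simple arithmetic, and a short sketch can make the rounding explicit. The helper name `min_dem_workers` and the constants are hypothetical, introduced here for illustration only. The function assumes that the two-worker starting recommendation acts as a floor and that partial workers round up.

```python
import math

# Values taken from the guidance above; the names are illustrative assumptions.
MIN_DEM_WORKERS = 2        # recommended starting number of DEM-Workers
MACHINES_PER_WORKER = 20   # ratio when machines are customized through Workflow Stubs


def min_dem_workers(concurrent_provisions: int) -> int:
    """Return a suggested minimum DEM-Worker count (hypothetical helper)."""
    if concurrent_provisions < 0:
        raise ValueError("concurrent_provisions must be non-negative")
    # Round up so that a partially filled group of 20 still gets a worker.
    by_ratio = math.ceil(concurrent_provisions / MACHINES_PER_WORKER)
    # Never go below the recommended starting minimum.
    return max(MIN_DEM_WORKERS, by_ratio)


print(min_dem_workers(100))  # 5, matching the example in the text
print(min_dem_workers(30))   # 2, the starting minimum applies
```

Treat the result as a lower bound for planning, not as a tuned value. Actual requirements depend on workflow mix and system load.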

For more information on DEM-Workers and scalability, see Distributed Execution Manager Performance Analysis and Tuning.

Data Collection Scalability

Data collection completion time depends on the compute resource capacity, the number of machines on the compute resource or endpoint, and the current system and network load, among other variables. The performance scales at a different rate for different types of data collection.

Each type of data collection has a default interval that you can override or modify. Infrastructure administrators can manually initiate data collection for infrastructure source endpoints. Fabric administrators can manually initiate data collection for compute resources. The following values are the default intervals for data collection.

Table 1. Data Collection Default Intervals

| Data Collection Type | Default Interval       |
|----------------------|------------------------|
| Inventory            | Every 24 hours (daily) |
| State                | Every 15 minutes       |
| Performance          | Every 24 hours (daily) |

Performance Analysis and Tuning

As the number of resources collecting data increases, data collection completion times might become longer than the interval between data collections, particularly for state data collection. To determine whether data collection for a compute resource or endpoint is completing in time or is being queued, see the Data Collection page. The Last Completed field might show In queue or In progress instead of the timestamp of the last completed data collection. If this problem occurs, you can increase the interval between data collections to decrease the data collection frequency.
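The check described above can be sketched in a few lines. This is a minimal illustration, not a vRealize Automation API: the function names are hypothetical, the field values are assumed to be entered by hand from the Data Collection page, and the intervals are the defaults from Table 1.

```python
from datetime import timedelta

# Default intervals from Table 1; override these if you changed them.
DEFAULT_INTERVALS = {
    "inventory": timedelta(hours=24),
    "state": timedelta(minutes=15),
    "performance": timedelta(hours=24),
}

# Values the Last Completed field might show instead of a timestamp.
PENDING_STATES = {"In queue", "In progress"}


def is_pending(last_completed: str) -> bool:
    """Return True if the field shows a pending state rather than a timestamp."""
    return last_completed.strip() in PENDING_STATES


def exceeds_interval(collection_type: str, duration: timedelta) -> bool:
    """Return True if one collection run takes longer than its interval."""
    return duration > DEFAULT_INTERVALS[collection_type]


# Example: a state collection that takes 20 minutes overruns its 15-minute interval.
print(exceeds_interval("state", timedelta(minutes=20)))
print(is_pending("In queue"))
```

If either check is true on repeated observations, a longer interval for that collection type is a reasonable first adjustment.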

Alternatively, you can increase the concurrent data collection limit per agent. By default, vRealize Automation limits concurrent data collection activities to two per agent and queues requests that exceed this limit. This limitation allows data collection activities to finish quickly without affecting overall performance. You can raise the limit to take advantage of concurrent data collection, but you must weigh this option against overall performance degradation.
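To get a rough sense of how the per-agent limit affects queueing, the following sketch estimates completion time for a batch of data collection requests on one agent. It is an idealized model under stated assumptions: every request takes the same time, and raising the limit does not slow individual collections. The second assumption is optimistic, because higher concurrency can degrade overall performance. The function name is hypothetical.

```python
import math


def estimated_completion_minutes(requests: int, minutes_each: float, limit: int = 2) -> float:
    """Estimate total time for queued requests on one agent (idealized model).

    limit defaults to 2, the default per-agent concurrency described above.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    if requests < 0:
        raise ValueError("requests must be non-negative")
    # Requests run in waves of at most `limit`; the rest wait in the queue.
    waves = math.ceil(requests / limit)
    return waves * minutes_each


# Eight requests of 5 minutes each with the default limit, then with a limit of 4.
print(estimated_completion_minutes(8, 5))
print(estimated_completion_minutes(8, 5, limit=4))
```

Comparing the estimate with the collection interval shows whether requests are likely to pile up in the queue. Measure real completion times before and after changing the limit, because the model ignores contention.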

If you increase the configured vRealize Automation per-agent limit, you might also want to increase one or more of the data collection execution timeout intervals. For more information about how to configure data collection concurrency and timeout intervals, see the vRealize Automation System Administration documentation.

Manager Service data collection is CPU-intensive. Increasing the processing power of the Manager Service host can decrease the time required for overall data collection.

Data collection for Amazon Elastic Compute Cloud (Amazon AWS), in particular, can be CPU-intensive, especially if your system collects data on multiple regions concurrently and if data was not previously collected on those regions. This type of data collection can cause an overall degradation in Web site performance. If Amazon AWS inventory data collection has a noticeable effect on performance, decrease its frequency.

Workflow Processing Scalability

The average workflow processing time, from when the DEM Orchestrator starts preprocessing the workflow to when the workflow finishes executing, increases with the number of concurrent workflows. Workflow volume is a function of the amount of vRealize Automation activity, including machine requests and some data collection activities.