This topic provides guidance on how to size your VMware vSphere environment based on your expected Greenplum database size, for both the single-tier and two-tier architectures.
A functional unit represents the resources required by a Greenplum segment. This definition assumes an environment with the following characteristics:
Resource per Primary Segment per 5 Concurrent Users | Quantity per Greenplum Segment | Ratio (to vCPU) |
---|---|---|
vCPU | 8 | |
Memory | 32 GB | 4 GB : 1 vCPU |
Usable Storage | 1024 GiB | 128 GiB : 1 vCPU |
Network (Interconnect Only) | 4 Gbps | 0.5 Gbps : 1 vCPU |
Storage Read IO | 300 MB/s | 38 MB/s : 1 vCPU |
Storage Write IO | 300 MB/s | 38 MB/s : 1 vCPU |
In the table above, Usable Storage is the minimum storage per segment; you can specify more if required. The values for Network, Storage Read IO, and Storage Write IO must be set according to the ratios above for each Greenplum segment.
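For illustration only, the sketch below expresses the per-vCPU ratios from the table as a small helper. The function name and return structure are assumptions rather than part of any Greenplum tooling; the defaults correspond to the 8-vCPU functional unit above.

```python
def functional_unit(vcpus_per_segment: int = 8) -> dict:
    """Scale the per-segment resources from the per-vCPU ratios above."""
    return {
        "vcpus": vcpus_per_segment,
        "memory_gb": 4 * vcpus_per_segment,                  # 4 GB per vCPU
        "min_usable_storage_gib": 128 * vcpus_per_segment,   # 128 GiB per vCPU (minimum)
        "interconnect_gbps": 0.5 * vcpus_per_segment,        # 0.5 Gbps per vCPU
        "storage_read_mb_s": 37.5 * vcpus_per_segment,       # ~38 MB/s per vCPU (300 MB/s at 8 vCPUs)
        "storage_write_mb_s": 37.5 * vcpus_per_segment,      # ~38 MB/s per vCPU (300 MB/s at 8 vCPUs)
    }

print(functional_unit())   # defaults reproduce the 8-vCPU functional unit in the table
```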
For the network, VMware recommends at least two 100GbE NICs to ensure network redundancy. For smaller deployments, 25GbE NICs are acceptable.
The physical host must have at least 10 disk drives in order to use vSAN ESA.
Unlike planning for a new environment, where you have more flexibility with resources, deploying Greenplum to an existing vSphere environment requires you to work with the resources already available in that environment. However, as with planning for a new deployment, you can use the functional unit to gauge the number of Greenplum segments that you can deploy within the vSphere cluster, and distribute the segments as evenly as possible across the Greenplum segment hosts.
The following tables contain guidelines for choosing between RAID-1 (mirroring) and RAID-5 (erasure coding) for the VMware vSphere storage policies that you configure later. These parameters have been calculated to ensure Greenplum fault tolerance against host failure.
Note RAID-5 has been optimized for vSAN ESA and is recommended for most systems that use vSAN ESA.
For vSAN ESA:

RAID Type | ESXi Hosts | Space Overhead |
---|---|---|
RAID-1 (mirroring) | 4 | 2x |
RAID-5 (erasure coding) | 4 to 5 | 1.5x |
RAID-5 (erasure coding) | 6 or more | 1.25x |

For vSAN OSA:

RAID Type | ESXi Hosts | Space Overhead |
---|---|---|
RAID-1 (mirroring) | 4 | 2x |
RAID-5 (erasure coding) | 5 or more | 1.33x |
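As a rough illustration of how these tables feed into capacity planning, the sketch below looks up the space overhead from the host count and RAID type. It assumes the first table applies to vSAN ESA and the second to vSAN OSA; the function and parameter names are illustrative, not part of any Greenplum or vSAN tooling.

```python
# Illustrative sketch only: space overhead factors from the tables above.
def raid_space_overhead(esxi_hosts: int, raid_type: str = "RAID-5", vsan: str = "ESA") -> float:
    if raid_type == "RAID-1":
        return 2.0                                   # mirroring: 2x space overhead
    if vsan.upper() == "ESA":
        return 1.5 if esxi_hosts <= 5 else 1.25      # RAID-5 (erasure coding) on vSAN ESA
    return 1.33                                      # RAID-5 (erasure coding) on vSAN OSA

# Example: usable capacity from 32 TB raw on a 4-host vSAN ESA cluster.
print(32 / raid_space_overhead(4))                   # ~21.3 TB before other overheads
```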
When planning a deployment:
1. Make a rough estimate of the expected data size that will be stored in Greenplum. You will use this estimate to calculate the number of functional units required. For each functional unit, reserve 30% of the Usable Storage for the temporary and transaction files required by VMware Greenplum, and take this reserve into account when calculating the number of functional units needed for the expected data size (see the sketch after this list).
2. Determine the hardware specifications for a physical host (mainly the logical cores, memory, and raw storage capacity), and use them to determine the number of Greenplum segment hosts (virtual machines) that can fit on each physical host. The size of each Greenplum segment host depends on the number of Greenplum segments per host; VMware recommends 2 to 4 Greenplum segments per host. The examples below assume 3 Greenplum segments per host.
3. Base your hardware specifications on one of the supported architecture choices: single-tier architecture or two-tier architecture. For more information about these architecture choices, see the following sections.
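The following sketch shows the first step as arithmetic. It assumes the 1024 GiB minimum Usable Storage per functional unit from the table above and the 30% reserve for temporary and transaction files; it counts primary segments only (mirrors are not included), and all names are illustrative.

```python
import math

USABLE_STORAGE_PER_SEGMENT_GIB = 1024   # minimum per functional unit (see table above)
TEMP_AND_TXN_RESERVE = 0.30             # reserved for temporary and transaction files

def segments_required(expected_data_tib: float) -> int:
    """Estimate the number of functional units (primary segments) needed."""
    data_capacity_per_segment_gib = USABLE_STORAGE_PER_SEGMENT_GIB * (1 - TEMP_AND_TXN_RESERVE)
    return math.ceil(expected_data_tib * 1024 / data_capacity_per_segment_gib)

print(segments_required(40))            # a 40 TiB database needs roughly 58 such units
```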
A single-tier architecture must take into account the CPU, memory, and disk overhead for vSAN. See the following table for the amount of each resource consumed by vSAN.
Resource | Overhead Used by vSAN |
---|---|
CPU | 10% |
Memory | 20% |
Raw Storage Capacity | 30% |
Take these requirements into account when assigning Greenplum segment hosts to a physical host. Additionally, you must reduce the raw storage capacity by the RAID overhead, which ranges from 1.25x to 2x. For details on the RAID overhead, see Determining the RAID Type.
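As a rough sketch of this step, the function below reduces a physical host's resources by the vSAN overheads in the table above and by a RAID space overhead taken from Determining the RAID Type. The names and the default RAID overhead are illustrative assumptions.

```python
# Illustrative sketch: resources left on one physical host for Greenplum
# segment host VMs after the vSAN overheads above (10% CPU, 20% memory,
# 30% raw capacity) and the chosen RAID space overhead.
def usable_per_host(logical_cores: int, ram_gb: int, raw_capacity_tib: float,
                    raid_overhead: float = 1.5) -> dict:
    return {
        "vcpus": int(logical_cores * 0.90),                       # 10% reserved for vSAN
        "memory_gb": ram_gb * 0.80,                               # 20% reserved for vSAN
        "storage_tib": raw_capacity_tib * 0.70 / raid_overhead,   # 30% vSAN overhead, then RAID
    }
```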
After determining the number of segment hosts per physical host, you can horizontally scale by adding physical hosts until the number of segments accommodates the expected data size.
Note To support fault tolerance, you must use at least 4 physical hosts.
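The scaling step can be sketched as follows. The defaults (3 segments per segment host VM, 5 segment host VMs per physical host) are illustrative; derive the VMs-per-host figure from the overhead calculation above for your own hardware.

```python
import math

def physical_hosts_needed(segments_required: int,
                          segments_per_vm: int = 3,
                          vms_per_physical_host: int = 5) -> int:
    """Rough estimate of physical hosts; enforces the 4-host minimum."""
    segment_host_vms = math.ceil(segments_required / segments_per_vm)
    return max(4, math.ceil(segment_host_vms / vms_per_physical_host))

print(physical_hosts_needed(58))   # e.g. 58 segments -> 20 VMs -> 4 physical hosts
```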
This section describes example VMware Greenplum deployments for database sizes of 10 TiB, 100 TiB, and 1 PiB. These deployment examples are based on a physical host with the following hardware specifications.
Component | Configuration |
---|---|
Logical Cores | 152 |
RAM | 1024 GB |
Raw Storage Capacity | 32 TB |
Use the following specifications for an expected Greenplum database size of 10 to 40 TiB.
Deployment Type | Mirrorless | Mirrored |
---|---|---|
Number of Physical Hosts | 4 | 8 |
Total System vCPU | 608 | 1216 |
Total System Memory | 4096 GB | 8192 GB |
Total System Raw Storage | 128 TiB | 256 TiB |
vCPUs / Memory per Greenplum Segment Host (VM) | 24 & 96 GiB | 24 & 96 GiB |
Disk Size per Greenplum Segment Host (VM) | 3122 GiB | 3122 GiB |
Total Number of Greenplum Segment Hosts (VMs) | 20 | 40 |
Total Number of Greenplum Segments (3 per VM) | 60 | 120 |
vCPUs / Memory Greenplum Coordinator Host (VM) | 24 & 96 GiB | 24 & 96 GiB |
vCPUs / Memory Standby Coordinator Host (VM) | - | 24 & 96 GiB |
Disk Size for Coordinator/Standby Host (VM) | 3122 GiB | 3122 GiB |
Total Greenplum vCPUs | 480 | 960 |
Total Greenplum Memory | 1920 GB | 3840 GB |
Use the following specifications for an expected Greenplum database size of 100 to 120 TiB.
Deployment Type | Mirrorless | Mirrored |
---|---|---|
Number of Physical Hosts | 12 | 24 |
Total System vCPU | 1824 | 3648 |
Total System Memory | 12288 GB | 24576 GB |
Total System Raw Storage | 384 TiB | 768 TiB |
vCPUs / Memory per Greenplum Segment Host (VM) | 24 & 96 GiB | 24 & 96 GiB |
Disk Size per Greenplum Segment Host (VM) | 3122 GiB | 3122 GiB |
Total Number of Greenplum Segment Hosts (VMs) | 58 | 116 |
Total Number of Greenplum Segments (3 per VM) | 174 | 348 |
vCPUs / Memory Greenplum Coordinator Host (VM) | 24 & 96 GiB | 24 & 96 GiB |
vCPUs / Memory Standby Coordinator Host (VM) | - | 24 & 96 GiB |
Disk Size for Coordinator/Standby Host (VM) | 3122 GiB | 3122 GiB |
Total Greenplum vCPUs | 1392 | 2784 |
Total Greenplum Memory | 5568 GB | 11136 GB |
Use the following specifications for an expected Greenplum database size of 1 PiB.
Deployment Type | Mirrorless | Mirrored |
---|---|---|
Number of Physical Hosts | 98 | 196 |
Total System vCPU | 14896 | 29792 |
Total System Memory | 100352 GB | 200704 GB |
Total System Raw Storage | 3136 TiB | 6272 TiB |
vCPUs / Memory per Greenplum Segment Host (VM) | 24 & 96 GiB | 24 & 96 GiB |
Disk Size per Greenplum Segment Host (VM) | 3122 GiB | 3122 GiB |
Total Number of Greenplum Segment Hosts (VMs) | 488 | 976 |
Total Number of Greenplum Segments (3 per VM) | 1464 | 2928 |
vCPUs / Memory Greenplum Coordinator Host (VM) | 24 & 96 GiB | 24 & 96 GiB |
vCPUs / Memory Standby Coordinator Host (VM) | - | 24 & 96 GiB |
Disk Size for Coordinator/Standby Host (VM) | 3122 GiB | 3122 GiB |
Total Greenplum vCPUs | 11712 | 23424 |
Total Greenplum Memory | 46848 GB | 93696 GB |
After determining the number of segment hosts per physical host, you can horizontally scale by adding physical hosts until the number of Greenplum segments accommodates the expected data size.
Note To support fault tolerance, you must use at least 4 physical hosts.
This section describes example VMware Greenplum deployments for database sizes of 10 TiB, 100 TiB, and 1 PiB. These deployment examples are based on a physical host with the following hardware specifications.
Component | Configuration |
---|---|
Logical Cores | 152 |
RAM | 1024 GB |
Raw Storage Capacity | 32 TB |
As this is a two-tier architecture, usable storage (the Datastore Volume) does not depend on the physical host. If high availability (HA) and RAID are already configured at the external storage array, the Datastore Volume should be accessible as shared storage across all of the physical hosts.
In the specifications below, 30% of the Datastore Volume is reserved for Greenplum operations and 30% of the Datastore Volume is reserved for vSphere operations.
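A minimal sketch of that split follows. It assumes only the reserves stated above (30% of the Datastore Volume for Greenplum operations, 30% for vSphere operations); treating the remaining 40% as available for user data is an inference, and the names are illustrative.

```python
# Illustrative sketch: split a Datastore Volume per the reserves described above.
def datastore_breakdown_tib(datastore_volume_tib: float) -> dict:
    return {
        "greenplum_operations_tib": datastore_volume_tib * 0.30,   # Greenplum operations reserve
        "vsphere_operations_tib": datastore_volume_tib * 0.30,     # vSphere operations reserve
        "remaining_for_data_tib": datastore_volume_tib * 0.40,     # assumed available for data
    }

print(datastore_breakdown_tib(90))   # e.g. the 90 TiB mirrorless example below
```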
Use the following specifications for an expected Greenplum database size of 10 to 40 TiB.
Deployment Type | Mirrorless | Mirrored |
---|---|---|
Number of Physical Hosts | 4 | 8 |
Total System vCPU | 608 | 1216 |
Total System Memory | 4096 GB | 8192 GB |
Total System Raw Storage | 90 TiB | 180 TiB |
vCPUs / Memory per Greenplum Segment Host (VM) | 24 & 96 GiB | 24 & 96 GiB |
Disk Size per Greenplum Segment Host (VM) | 3000 GiB | 3000 GiB |
Total Number of Greenplum Segment Hosts (VMs) | 20 | 40 |
Total Number of Greenplum Segments (3 per VM) | 60 | 120 |
vCPUs / Memory Greenplum Coordinator Host (VM) | 24 & 96 GiB | 24 & 96 GiB |
vCPUs / Memory Standby Coordinator Host (VM) | - | 24 & 96 GiB |
Disk Size for Coordinator/Standby Host (VM) | 1000 GiB | 1000 GiB |
Total Greenplum vCPUs | 480 | 960 |
Total Greenplum Memory | 1920 GB | 3840 GB |
Use the following specifications for an expected Greenplum database size of 100 to 120 TiB.
Deployment Type | Mirrorless | Mirrored |
---|---|---|
Number of Physical Hosts | 12 | 24 |
Total System vCPU | 1824 | 3648 |
Total System Memory | 12288 GB | 24576 GB |
Total System Raw Storage | 261 TiB | 522 TiB |
vCPUs / Memory per Greenplum Segment Host (VM) | 24 & 96 GiB | 24 & 96 GiB |
Disk Size per Greenplum Segment Host (VM) | 3153 GiB | 3153 GiB |
Total Number of Greenplum Segment Hosts (VMs) | 58 | 116 |
Total Number of Greenplum Segments (3 per VM) | 174 | 348 |
vCPUs / Memory Greenplum Coordinator Host (VM) | 24 & 96 GiB | 24 & 96 GiB |
vCPUs / Memory Standby Coordinator Host (VM) | - | 24 & 96 GiB |
Disk Size for Coordinator/Standby Host (VM) | 1051 GiB | 1051 GiB |
Total Greenplum vCPUs | 1392 | 2784 |
Total Greenplum Memory | 5568 GB | 11136 GB |
Use the following specifications for an expected Greenplum database size of 1 PiB.
Deployment Type | Mirrorless | Mirrored |
---|---|---|
Number of Physical Hosts | 98 | 196 |
Total System vCPU | 14896 | 29792 |
Total System Memory | 100352 GB | 200704 GB |
Total System Raw Storage | 2200 TiB | 4400 TiB |
vCPUs / Memory per Greenplum Segment Host (VM) | 24 & 96 GiB | 24 & 96 GiB |
Disk Size per Greenplum Segment Host (VM) | 3120 GiB | 3120 GiB |
Total Number of Greenplum Segment Hosts (VMs) | 488 | 976 |
Total Number of Greenplum Segments (3 per VM) | 1464 | 2928 |
vCPUs / Memory Greenplum Coordinator Host (VM) | 24 & 96 GiB | 24 & 96 GiB |
vCPUs / Memory Standby Coordinator Host (VM) | - | 24 & 96 GiB |
Disk Size for Coordinator/Standby Host (VM) | 1040 GiB | 1040 GiB |
Total Greenplum vCPUs | 11712 | 23424 |
Total Greenplum Memory | 46848 GB | 93696 GB |
For more information and to make more precise calculations based on your requirements, see Greenplum Storage Sizer for Two-Tiered Architectures.
You must correctly size the Greenplum segment hosts to handle the database workloads according to the cluster specifications.
VMware recommends that you size your Greenplum segment host virtual hardware as a multiple of the functional unit. Multiply the size of the functional unit by the number of Greenplum segments per segment host.
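A minimal sketch of this multiplication, assuming the 8-vCPU functional unit defined earlier and 3 segments per segment host (the function and parameter names are illustrative):

```python
def segment_host_vm_size(segments_per_vm: int = 3, vcpus_per_segment: int = 8) -> dict:
    """Size a Greenplum segment host VM as a multiple of the functional unit."""
    return {
        "vcpus": vcpus_per_segment * segments_per_vm,                   # 24 with the defaults
        "memory_gb": 4 * vcpus_per_segment * segments_per_vm,           # 96 with the defaults
        "min_storage_gib": 128 * vcpus_per_segment * segments_per_vm,   # 3072 with the defaults
    }

print(segment_host_vm_size())   # matches the 24 vCPU / 96 GiB segment hosts in the examples
```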
After you have determined these values, see Platform Requirements to prepare for the installation.