This topic provides guidance on how to size your VMware vSphere environment based on your expected Greenplum database size, for both single-tier and two-tier architectures.

The Functional Unit

A functional unit represents the resources required by a Greenplum segment. This definition assumes an environment with the following characteristics:

  • Analytical-type workload (similar to TPC-DS)
  • No overprovisioning
  • 5 concurrent users
  • 1 vCPU per 1 logical core

Resources per Primary Segment per 5 Concurrent Users | Quantity per Greenplum Segment | Ratio (to vCPU)
vCPU                        | 8        | -
Memory                      | 32 GB    | 4 GB : 1 vCPU
Usable Storage              | 1024 GiB | 128 GiB : 1 vCPU
Network (Interconnect Only) | 4 Gbps   | 0.5 Gbps : 1 vCPU
Storage Read IO             | 300 MB/s | 38 MB/s : 1 vCPU
Storage Write IO            | 300 MB/s | 38 MB/s : 1 vCPU

In the table above, Usable Storage is the minimum storage per segment; you can specify more if required. The values for Network, Storage Read IO, and Storage Write IO must follow the per-vCPU ratios above for each Greenplum segment.
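
For example, the following Python sketch (illustrative only, not part of any VMware tooling) derives per-segment values from the per-vCPU ratios in the table above; the read/write ratio is shown unrounded (the table rounds 37.5 up to 38).

    # Sketch: derive per-segment resource values from the per-vCPU ratios above.
    # The 8-vCPU functional unit is the reference point.
    FUNCTIONAL_UNIT_VCPUS = 8

    RATIOS_PER_VCPU = {
        "memory_gb": 4,             # 4 GB : 1 vCPU
        "usable_storage_gib": 128,  # 128 GiB : 1 vCPU (minimum)
        "network_gbps": 0.5,        # interconnect only
        "read_io_mb_s": 37.5,       # 300 MB/s / 8 vCPU (the table rounds to 38)
        "write_io_mb_s": 37.5,
    }

    def segment_resources(vcpus: int = FUNCTIONAL_UNIT_VCPUS) -> dict:
        """Resources required by one Greenplum segment with the given vCPU count."""
        return {name: ratio * vcpus for name, ratio in RATIOS_PER_VCPU.items()}

    # The functional unit itself: 32 GB memory, 1024 GiB storage, 4 Gbps, 300 MB/s
    print(segment_resources())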

For the network, VMware recommends at least two 100GbE NICs to ensure network redundancy. For smaller deployments, 25GbE NICs are acceptable.

The physical host must have at least 10 disk drives in order to use vSAN ESA.

Deploying VMware Greenplum to an Existing vSphere Environment

Unlike planning for a new environment, where you have more flexibility with resources, deploying Greenplum to an existing vSphere environment requires you to work with the available resources within the vSphere environment.

However, as with Planning for a Deployment, you can use the functional unit to gauge the number of Greenplum segments that you can deploy within the vSphere cluster, and distribute the segments as evenly as possible across the Greenplum segment hosts.

Determining the RAID Type

The following table contains guidelines for choosing between RAID-1 (mirroring) and RAID-5 (erasure coding) for the VMware vSphere storage policies that you configure later. These parameters have been calculated to ensure Greenplum fault tolerance against host failure.

Note RAID-5 has been optimized for vSAN ESA, and is recommended for most systems that use vSAN ESA.

vSAN ESA

RAID Type               | ESXi Hosts | Space Overhead
RAID-1 (mirroring)      | 4          | 2x
RAID-5 (erasure coding) | 4 to 5     | 1.5x
RAID-5 (erasure coding) | 6 or more  | 1.25x

vSAN OSA

RAID Type               | ESXi Hosts | Space Overhead
RAID-1 (mirroring)      | 4          | 2x
RAID-5 (erasure coding) | 5 or more  | 1.33x
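
For planning scripts, the two tables can be collapsed into a small helper. The Python sketch below is illustrative only; it follows the tables above and prefers RAID-5 where the note recommends it for vSAN ESA.

    def raid_guideline(esxi_hosts: int, vsan_esa: bool = True) -> tuple:
        """Return (RAID type, space overhead multiplier) per the tables above."""
        if esxi_hosts < 4:
            raise ValueError("Greenplum fault tolerance requires at least 4 ESXi hosts")
        if vsan_esa:
            # RAID-5 is optimized for vSAN ESA and recommended for most ESA systems.
            return ("RAID-5 (erasure coding)", 1.25 if esxi_hosts >= 6 else 1.5)
        if esxi_hosts >= 5:
            return ("RAID-5 (erasure coding)", 1.33)
        return ("RAID-1 (mirroring)", 2.0)

    print(raid_guideline(6))                  # ('RAID-5 (erasure coding)', 1.25)
    print(raid_guideline(4, vsan_esa=False))  # ('RAID-1 (mirroring)', 2.0)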

Planning for a Deployment

When planning a deployment:

  1. Make a rough estimate of the expected data size that will be stored in Greenplum. You will use this estimate to calculate the number of functional units required.

    For each functional unit, reserve 30% of the Usable Storage for the temporary and transaction files required by VMware Greenplum. Take this reserve into account when calculating the number of functional units needed for the expected data size; see the sizing sketch after this list.

  2. Determine the hardware specifications for a physical host (mainly the logical cores, memory, and raw capacity). Use this to determine the number of Greenplum segment hosts (virtual machines) that can fit on each physical host.

    The size of the Greenplum segment host will depend on the number of Greenplum segments per host. VMware recommends 2 to 4 Greenplum segments per host. The examples below assume 3 Greenplum segments per host.

    Your hardware specifications must be based on one of the supported architecture choices: single-tier architecture or two-tier architecture. For more information about these architectures, see the following sections.
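
Putting steps 1 and 2 together, the following Python sketch (illustrative only) estimates segment and segment host counts from an expected data size. It assumes the 8-vCPU functional unit (1024 GiB usable storage, 30% reserved) and 3 segments per host; a mirrored deployment simply doubles the host count, as in the examples below.

    import math

    USABLE_STORAGE_PER_SEGMENT_GIB = 1024   # functional unit minimum
    RESERVED_FRACTION = 0.30                # temporary and transaction files
    SEGMENTS_PER_HOST = 3                   # VMware recommends 2 to 4

    def required_segments(expected_data_tib: float) -> int:
        """Primary segments needed to hold the expected data size."""
        data_per_segment_gib = USABLE_STORAGE_PER_SEGMENT_GIB * (1 - RESERVED_FRACTION)
        return math.ceil(expected_data_tib * 1024 / data_per_segment_gib)

    def required_segment_hosts(expected_data_tib: float, mirrored: bool = False) -> int:
        """Segment host VMs needed, doubled when mirror segments are deployed."""
        hosts = math.ceil(required_segments(expected_data_tib) / SEGMENTS_PER_HOST)
        return hosts * 2 if mirrored else hosts

    # 120 TiB of data -> 172 primary segments -> 58 segment host VMs, in line with
    # the 100 TiB example below (58 VMs x 3 segments = 174 deployed segments).
    print(required_segments(120), required_segment_hosts(120))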

Single-Tier Architecture

Single-tier architecture must take into account the CPU, memory, and disk overhead for vSAN. See the following table for the amount of each resource required by vSAN.

Resource Used by vSAN | Overhead
CPU                   | 10%
Memory                | 20%
Raw Storage Capacity  | 30%

Take these requirements into account when assigning Greenplum segment hosts to a physical host. Additionally, you must reduce the raw storage capacity by the RAID overhead, which ranges from 1.25x to 2x. For details on the RAID overhead, see Determining the RAID Type.

After determining the number of segment hosts per physical host, you can horizontally scale by adding physical hosts until the number of segments accommodates the expected data size.

Note To support fault tolerance, you must specify 4 physical hosts at minimum.
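
As a rough storage-side check (CPU and memory fit must be verified separately against the overheads above), the following Python sketch estimates the number of physical hosts for a single-tier deployment. The host capacity and RAID overhead are example inputs, not requirements.

    import math

    VSAN_CAPACITY_OVERHEAD = 0.30   # raw storage reserved for vSAN (table above)
    MIN_PHYSICAL_HOSTS = 4          # required for fault tolerance

    def physical_hosts_needed(total_segments: int,
                              raw_capacity_per_host_gib: float,
                              raid_overhead: float,
                              usable_per_segment_gib: float = 1024) -> int:
        """Hosts needed so the usable vSAN capacity covers all segment storage."""
        usable_per_host_gib = (raw_capacity_per_host_gib
                               * (1 - VSAN_CAPACITY_OVERHEAD) / raid_overhead)
        hosts = math.ceil(total_segments * usable_per_segment_gib / usable_per_host_gib)
        return max(hosts, MIN_PHYSICAL_HOSTS)

    # 60 segments on hosts with 32 TiB of raw capacity, RAID-5 at 1.25x overhead
    print(physical_hosts_needed(60, 32 * 1024, raid_overhead=1.25))   # 4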

Deployment Examples

This section describes example VMware Greenplum deployments for database sizes of 10 TiB, 100 TiB, and 1 PiB. These deployment examples are based on a physical host with the following hardware specifications.

Component            | Configuration
Logical Cores        | 152
RAM                  | 1024 GB
Raw Storage Capacity | 32 TB

10 TiB Database

Use the following specifications for an expected Greenplum database size of 10 to 40 TiB.

Deployment Type                                | Mirrorless  | Mirrored
Number of Physical Hosts                       | 4           | 8
Total System vCPU                              | 608         | 1216
Total System Memory                            | 4096 GB     | 8192 GB
Total System Raw Storage                       | 128 TiB     | 256 TiB
vCPUs / Memory per Greenplum Segment Host (VM) | 24 & 96 GiB | 24 & 96 GiB
Disk Size per Greenplum Segment Host (VM)      | 3122 GiB    | 3122 GiB
Total Number of Greenplum Segment Hosts (VMs)  | 20          | 40
Total Number of Greenplum Segments (3 per VM)  | 60          | 120
vCPUs / Memory per Greenplum Master Host (VM)  | 24 & 96 GiB | 24 & 96 GiB
vCPUs / Memory per Standby Master Host (VM)    | -           | 24 & 96 GiB
Disk Size for Master/Standby Host (VM)         | 3122 GiB    | 3122 GiB
Total Greenplum vCPUs                          | 480         | 960
Total Greenplum Memory                         | 1920 GB     | 3840 GB
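
The totals in these tables follow directly from the host specifications and per-VM sizes. The following quick arithmetic check of the mirrorless column above assumes, as the numbers suggest, that the Total Greenplum rows count segment host VMs only (the master VM is sized identically but tracked separately).

    # Quick check of the mirrorless column in the 10 TiB table above.
    physical_hosts = 4
    cores_per_host, ram_per_host_gb = 152, 1024          # physical host specifications
    segment_vms, vcpus_per_vm, memory_per_vm_gib = 20, 24, 96

    assert physical_hosts * cores_per_host == 608        # Total System vCPU
    assert physical_hosts * ram_per_host_gb == 4096      # Total System Memory (GB)
    assert segment_vms * 3 == 60                         # Greenplum segments, 3 per VM
    assert segment_vms * vcpus_per_vm == 480             # Total Greenplum vCPUs
    assert segment_vms * memory_per_vm_gib == 1920       # Total Greenplum Memory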

100 TiB Database

Use the following specifications for an expected Greenplum database size of 100 to 120 TiB.

Deployment Type                                | Mirrorless  | Mirrored
Number of Physical Hosts                       | 12          | 24
Total System vCPU                              | 1824        | 3648
Total System Memory                            | 12288 GB    | 24576 GB
Total System Raw Storage                       | 384 TiB     | 768 TiB
vCPUs / Memory per Greenplum Segment Host (VM) | 24 & 96 GiB | 24 & 96 GiB
Disk Size per Greenplum Segment Host (VM)      | 3122 GiB    | 3122 GiB
Total Number of Greenplum Segment Hosts (VMs)  | 58          | 116
Total Number of Greenplum Segments (3 per VM)  | 174         | 348
vCPUs / Memory per Greenplum Master Host (VM)  | 24 & 96 GiB | 24 & 96 GiB
vCPUs / Memory per Standby Master Host (VM)    | -           | 24 & 96 GiB
Disk Size for Master/Standby Host (VM)         | 3122 GiB    | 3122 GiB
Total Greenplum vCPUs                          | 1392        | 2784
Total Greenplum Memory                         | 5568 GB     | 11136 GB

1 PiB Database

Use the following specifications for an expected Greenplum database size of 1 PiB.

Deployment Type                                | Mirrorless  | Mirrored
Number of Physical Hosts                       | 98          | 196
Total System vCPU                              | 14896       | 29792
Total System Memory                            | 100352 GB   | 200704 GB
Total System Raw Storage                       | 3136 TiB    | 6272 TiB
vCPUs / Memory per Greenplum Segment Host (VM) | 24 & 96 GiB | 24 & 96 GiB
Disk Size per Greenplum Segment Host (VM)      | 3122 GiB    | 3122 GiB
Total Number of Greenplum Segment Hosts (VMs)  | 488         | 976
Total Number of Greenplum Segments (3 per VM)  | 1464        | 2928
vCPUs / Memory per Greenplum Master Host (VM)  | 24 & 96 GiB | 24 & 96 GiB
vCPUs / Memory per Standby Master Host (VM)    | -           | 24 & 96 GiB
Disk Size for Master/Standby Host (VM)         | 3122 GiB    | 3122 GiB
Total Greenplum vCPUs                          | 11712       | 23424
Total Greenplum Memory                         | 46848 GB    | 93696 GB

Two-Tier Architecture

After determining the number of segment hosts per physical host, you can horizontally scale by adding physical hosts until the number of Greenplum segments accommodates the expected data size.

Note To support fault tolerance, you must specify 4 physical hosts at minimum.

Deployment Examples

This section describes example VMware Greenplum deployments for database sizes of 10 TiB, 100 TiB, and 1 PiB. These deployment examples are based on a physical host with the following hardware specifications.

Description          | Configuration
Logical Cores        | 152
RAM                  | 1024 GB
Raw Storage Capacity | 32 TB

As this is a two-tier architecture, usable storage (the Datastore Volume) does not depend on the physical host. If high availability (HA) and RAID are already configured on the external storage array, the Datastore Volume should be accessible as shared storage across all of the physical hosts.

In the specifications below, 30% of the Datastore Volume is reserved for Greenplum operations and 30% of the Datastore Volume is reserved for vSphere operations.
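
The following Python sketch is one way to turn these reserves into a datastore estimate: it grows the expected data size first by the per-segment Greenplum reserve (from Planning for a Deployment) and then by the vSphere reserve on the Datastore Volume. This is an interpretation of the guidance above, and the result should be treated as a rough lower bound; the example tables below provision additional headroom.

    import math

    GREENPLUM_RESERVE = 0.30   # temporary and transaction files per segment
    VSPHERE_RESERVE = 0.30     # vSphere operations on the Datastore Volume

    def min_datastore_volume_tib(expected_data_tib: float) -> int:
        """Rough lower bound on the shared Datastore Volume size."""
        segment_storage_tib = expected_data_tib / (1 - GREENPLUM_RESERVE)
        return math.ceil(segment_storage_tib / (1 - VSPHERE_RESERVE))

    print(min_datastore_volume_tib(1024))   # ~2090 TiB; the 1 PiB example provisions 2200 TiB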

10 TiB Database

Use the following specifications for an expected Greenplum database size of 10 to 40 TiB.

Deployment Type                                | Mirrorless  | Mirrored
Number of Physical Hosts                       | 4           | 8
Total System vCPU                              | 608         | 1216
Total System Memory                            | 4096 GB     | 8192 GB
Total System Raw Storage                       | 90 TiB      | 180 TiB
vCPUs / Memory per Greenplum Segment Host (VM) | 24 & 96 GiB | 24 & 96 GiB
Disk Size per Greenplum Segment Host (VM)      | 3000 GiB    | 3000 GiB
Total Number of Greenplum Segment Hosts (VMs)  | 20          | 40
Total Number of Greenplum Segments (3 per VM)  | 60          | 120
vCPUs / Memory per Greenplum Master Host (VM)  | 24 & 96 GiB | 24 & 96 GiB
vCPUs / Memory per Standby Master Host (VM)    | -           | 24 & 96 GiB
Disk Size for Master/Standby Host (VM)         | 1000 GiB    | 1000 GiB
Total Greenplum vCPUs                          | 480         | 960
Total Greenplum Memory                         | 1920 GB     | 3840 GB

100 TiB Database

Use the following specifications for an expected Greenplum database size of 100 to 120 TiB.

Deployment Type                                | Mirrorless  | Mirrored
Number of Physical Hosts                       | 12          | 24
Total System vCPU                              | 1824        | 3648
Total System Memory                            | 12288 GB    | 24576 GB
Total System Raw Storage                       | 261 TiB     | 522 TiB
vCPUs / Memory per Greenplum Segment Host (VM) | 24 & 96 GiB | 24 & 96 GiB
Disk Size per Greenplum Segment Host (VM)      | 3153 GiB    | 3153 GiB
Total Number of Greenplum Segment Hosts (VMs)  | 58          | 116
Total Number of Greenplum Segments (3 per VM)  | 174         | 348
vCPUs / Memory per Greenplum Master Host (VM)  | 24 & 96 GiB | 24 & 96 GiB
vCPUs / Memory per Standby Master Host (VM)    | -           | 24 & 96 GiB
Disk Size for Master/Standby Host (VM)         | 1051 GiB    | 1051 GiB
Total Greenplum vCPUs                          | 1392        | 2784
Total Greenplum Memory                         | 5568 GB     | 11136 GB

1 PiB Database

Use the following specifications for an expected Greenplum database size of 1 PiB.

Deployment Type                                | Mirrorless  | Mirrored
Number of Physical Hosts                       | 98          | 196
Total System vCPU                              | 14896       | 29792
Total System Memory                            | 100352 GB   | 200704 GB
Total System Raw Storage                       | 2200 TiB    | 4400 TiB
vCPUs / Memory per Greenplum Segment Host (VM) | 24 & 96 GiB | 24 & 96 GiB
Disk Size per Greenplum Segment Host (VM)      | 3120 GiB    | 3120 GiB
Total Number of Greenplum Segment Hosts (VMs)  | 488         | 976
Total Number of Greenplum Segments (3 per VM)  | 1464        | 2928
vCPUs / Memory per Greenplum Master Host (VM)  | 24 & 96 GiB | 24 & 96 GiB
vCPUs / Memory per Standby Master Host (VM)    | -           | 24 & 96 GiB
Disk Size for Master/Standby Host (VM)         | 1040 GiB    | 1040 GiB
Total Greenplum vCPUs                          | 11712       | 23424
Total Greenplum Memory                         | 46848 GB    | 93696 GB

For more information and to make more precise calculations based on your requirements, see Greenplum Storage Sizer for Two-Tiered Architectures.

Sizing the Greenplum Segment Host (Virtual Machine)

You must correctly size the Greenplum segment hosts to handle the database workloads according to the cluster specifications.

VMware recommends that you size your Greenplum segment host virtual hardware as a multiple of the functional unit. Multiply the size of the functional unit by the number of Greenplum segments per segment host.
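
With the recommended 3 segments per host, this works out to the 24 vCPUs and 96 GiB of memory per segment host VM used throughout the deployment examples above; the examples also provision somewhat more disk per VM than the 3 x 1024 GiB minimum, which the functional unit permits. A minimal Python sketch (illustrative only):

    # Sketch: size a Greenplum segment host VM as a multiple of the functional unit.
    FUNCTIONAL_UNIT = {"vcpus": 8, "memory_gb": 32, "usable_storage_gib": 1024}

    def segment_host_size(segments_per_host: int = 3) -> dict:
        """Virtual hardware for a segment host running the given number of segments."""
        return {name: value * segments_per_host for name, value in FUNCTIONAL_UNIT.items()}

    print(segment_host_size(3))
    # {'vcpus': 24, 'memory_gb': 96, 'usable_storage_gib': 3072}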

Next Steps

After you determine the following:

  • The hardware model and version
  • The number of ESXi hosts (based on your Greenplum Database capacity)
  • The number of virtual machines and available resources required in your VMware vSphere environment

See Prerequisites to prepare for the installation.
