Managing Greenplum Database resources with resource groups.
Memory, CPU, and concurrent transaction management have a significant impact on performance in a Greenplum Database cluster. Resource groups are a newer resource management scheme that enforces memory, CPU, and concurrent transaction limits in Greenplum Database.
While it is not always possible to increase system memory, you can avoid many out-of-memory conditions by configuring resource groups to manage expected workloads.
The following operating system and Greenplum Database memory settings are significant when you manage Greenplum Database resources with resource groups:
vm.overcommit_memory
This Linux kernel parameter, set in /etc/sysctl.conf, identifies the method that the operating system uses to determine how much memory can be allocated to processes. vm.overcommit_memory must always be set to 2 for Greenplum Database systems.
vm.overcommit_ratio
This Linux kernel parameter, set in /etc/sysctl.conf, identifies the percentage of RAM that is used for application processes; the remainder is reserved for the operating system. Tune the setting as necessary: if your memory utilization is too low, increase the value; if your memory or swap usage is too high, decrease it.
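For illustration, a minimal /etc/sysctl.conf excerpt with both kernel parameters might look like the following; the ratio value of 95 is an example only, not a recommendation, and should be chosen to suit your RAM and swap configuration:
# /etc/sysctl.conf excerpt -- example values
vm.overcommit_memory = 2
vm.overcommit_ratio = 95
Run sysctl -p as root to apply the changes without a reboot.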
gp_resource_group_memory_limit
The percentage of system memory to allocate to Greenplum Database. The default value is 0.7 (70%).
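As a sketch, you might change this limit with the gpconfig utility; the value 0.8 below is illustrative, and this parameter generally requires a restart to take effect:
gpconfig -c gp_resource_group_memory_limit -v 0.8
gpstop -ar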
gp_resource_group_enable_recalculate_query_mem
By default, Greenplum Database calculates the maximum per-query memory allotment for all hosts using the memory configuration of, and the number of primary segments configured on, the master host.
Note: The default behavior may lead to out-of-memory issues and underutilization of resources when the hardware configuration of the master and segment hosts differs.
If the hardware configuration of your master and segment hosts differs, set the gp_resource_group_enable_recalculate_query_mem server configuration parameter to true; this prompts Greenplum Database to recalculate the maximum per-query memory allotment on each segment host based on the memory and the number of primary segments configured on that segment host.
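For example, a sketch of enabling the recalculation with gpconfig, then reloading the configuration with gpstop -u (consult your release's parameter reference in case a full restart is required):
gpconfig -c gp_resource_group_enable_recalculate_query_mem -v true
gpstop -u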
gp_workfile_limit_files_per_query
Set gp_workfile_limit_files_per_query to limit the maximum number of temporary spill files (workfiles) allowed per query. Spill files are created when a query requires more memory than it is allocated. When the limit is exceeded, the query is terminated. The default is zero, which allows an unlimited number of spill files and may fill up the file system.
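For example, to cap spill files at 100,000 per query cluster-wide (an illustrative value, not a recommendation):
gpconfig -c gp_workfile_limit_files_per_query -v 100000
gpstop -u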
gp_workfile_compression
If there are numerous spill files, set gp_workfile_compression to compress them. Compressing spill files may help to avoid overloading the disk subsystem with I/O operations.
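For example, a sketch of enabling spill file compression cluster-wide, in releases where gp_workfile_compression is a boolean parameter:
gpconfig -c gp_workfile_compression -v on
gpstop -u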
memory_spill_ratio
Set memory_spill_ratio to increase or decrease the amount of query operator memory Greenplum Database allots to a query. When memory_spill_ratio is larger than 0, it represents the percentage of resource group memory to allot to query operators. If concurrency is high, this memory amount may be small even when memory_spill_ratio is set to the maximum value of 100. When you set memory_spill_ratio to 0, Greenplum Database uses the statement_mem setting to determine the initial amount of query operator memory to allot.
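The ratio can also be set as a resource group attribute. For example, a hypothetical resource group named sales_rg could be given more operator memory per query (the value 20 is illustrative):
ALTER RESOURCE GROUP sales_rg SET MEMORY_SPILL_RATIO 20;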
statement_mem
When memory_spill_ratio is 0, Greenplum Database uses the statement_mem setting to determine the amount of memory to allocate to a query.
Other considerations:
Available memory for resource groups may be limited on systems that use low or no swap space and that use the default vm.overcommit_ratio and gp_resource_group_memory_limit settings. To ensure that Greenplum Database has a reasonable per-segment-host memory limit, you may be required to increase one or more of the following configuration settings; a worked example of the calculation follows this list:
The vm.overcommit_ratio setting.
The gp_resource_group_memory_limit setting.
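As a rough worked example, assume, as an approximation, that the memory available to resource groups on a segment host is (RAM * vm.overcommit_ratio / 100 + SWAP) * gp_resource_group_memory_limit. For a host with 256GB RAM, 64GB swap, vm.overcommit_ratio of 50, the default gp_resource_group_memory_limit of 0.7, and 8 primary segments:
(256GB * 50/100 + 64GB) * 0.7 = 134.4GB for resource groups
134.4GB / 8 primary segments = 16.8GB per segment
Raising vm.overcommit_ratio to 95 would increase the pool to roughly (256GB * 95/100 + 64GB) * 0.7 = 215GB, or about 26.9GB per segment.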
Greenplum Database resource groups provide a powerful mechanism for managing the workload of the cluster. Consider these general guidelines when you configure resource groups for your system:
Each transaction submitted by a user with SUPERUSER privileges runs under the default resource group named admin_group. Keep this in mind when scheduling and running Greenplum administration utilities.
Queries submitted by roles that have not been assigned a resource group run under the default resource group named default_group.
Use the CONCURRENCY resource group parameter to limit the number of active queries that members of a particular resource group can run concurrently.
Use the MEMORY_LIMIT and MEMORY_SPILL_RATIO parameters to control the maximum amount of memory that queries running in the resource group can consume.
Greenplum Database assigns memory that is not reserved by resource groups (100 minus the sum of all resource group MEMORY_LIMITs) to a global shared memory pool. This memory is available to all queries on a first-come, first-served basis.
Use the gp_toolkit views to examine resource group resource usage and to monitor how the groups are working.
A low statement_mem setting (for example, in the 10MB range) has been shown to increase the performance of queries with low memory requirements. Use the memory_spill_ratio and statement_mem server configuration parameters to override the setting on a per-query basis. For example:
SET memory_spill_ratio=0;
SET statement_mem='10 MB';
The default resource group for database transactions initiated by Greenplum Database SUPERUSERs is the group named admin_group. The default CONCURRENCY value for the admin_group resource group is 10.
Certain Greenplum Database administrative utilities may use more than one CONCURRENCY slot at runtime, such as gpbackup invoked with the --jobs option. If a utility requires more concurrent transactions than admin_group is configured for, consider temporarily increasing the group's MEMORY_LIMIT and CONCURRENCY values to meet the utility's requirement, and return these parameters to their original settings when the utility completes.
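A sketch of such a temporary change follows; the increased values are illustrative, and the restore values assume the defaults, so verify your group's actual settings in the gp_toolkit.gp_resgroup_config view first:
-- temporarily raise limits for the utility run (illustrative values)
ALTER RESOURCE GROUP admin_group SET CONCURRENCY 15;
ALTER RESOURCE GROUP admin_group SET MEMORY_LIMIT 30;
-- after the utility completes, restore the previous settings
ALTER RESOURCE GROUP admin_group SET CONCURRENCY 10;
ALTER RESOURCE GROUP admin_group SET MEMORY_LIMIT 10;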
Note: Memory allocation changes that you initiate with ALTER RESOURCE GROUP may not take effect immediately due to resource consumption by currently running queries. Be sure to alter resource group parameters in advance of your maintenance window.
Parent topic: Greenplum Database Best Practices