The gp_percentile_agg
module introduces improved VMware Greenplum Query Optimizer (GPORCA) performance for ordered-set aggregate functions including percentile_cont()
, percentile_disc()
, and median()
. These improvements particularly benefit MADlib, which internally invokes these functions.
GPORCA generates a more performant query plan when:
<fraction>
provided to the function is a const
and not an ARRAY
.GROUP BY
clause.The gp_percentile_agg
module is a Greenplum Database extension.
The gp_percentile_agg
module is installed when you install Greenplum Database. You must register the gp_percentile_agg
extension in each database where you want to use the module:
CREATE EXTENSION gp_percentile_agg;
Refer to Installing Additional Supplied Modules for more information.
If you upgraded from and used the gp_percentile_agg
module in Greenplum Database version 6.21.x, you must upgrade the module to obtain the bug fix and improvements introduced in Greenplum Database version 6.22.0.
To upgrade, drop and recreate the gp_percentile_agg
extension in each database in which you are using the module:
DROP EXTENSION gp_percentile_agg;
CREATE EXTENSION gp_percentile_agg;
To realize the GPORCA performance benefits when using ordered-set aggregate functions, in addition to registering the extension you must also enable the optimizer_enable_orderedagg server configuration parameter before you run the query. For example, to enable this parameter in a psql
session:
SET optimizer_enable_orderedagg = on;
When the extension is registered, optimizer_enable_orderedagg
is enabled, and you invoke the percentile_cont()
, percentile_disc()
, or median()
functions, GPORCA generates the more performant query plan.
Refer to Ordered-Set Aggregate Functions in the PostgreSQL documentation for more information about using ordered-set aggregates.