gpmetrics Schema Reference

Greenplum Command Center creates the gpmetrics schema in the Vmware Greenplum gpperfmon to save alert rules and logs, and historical metrics collected by the VMware Greenplum metrics collection system. The gpmetrics schema contains the following tables and user-defined functions:

Tables

gpcc_alert_rule — saves alert rules configured on the Command Center Admin> Alerts page.
gpcc_alert_history — records an event when an alert rule is triggered.
gpcc_database_history — saves summary query activity information.
gpcc_department — saves department information.
gpcc_disk_history — saves disk usage statistics for each VMware Greenplum host file system.
gpcc_export_log — saves the log notifications from every “EXPORT ALL” user action.
gpcc_gpss_job — saves information about data loading jobs.
gpcc_gpss_log — saves logging-related information about data loading jobs.
gpcc_index_info — saves information about table indexes, particularly their bloat size.
gpcc_pg_log_history — saves history from the VMware Greenplum pg_log log file.
gpcc_plannode_history — saves plan node execution statistics for completed queries.
gpcc_queries_history — saves execution metrics for completed VMware Greenplum queries.
gpcc_queries_now — saves real-time query metrics data.
gpcc_role_department — saves information about roles within a department.
gpcc_resgroup_history — saves the history of the resource consumption of each resource group on each segment.
gpcc_scan_history — saves history for Recommendations scans.
gpcc_schedule — saves schedule for Recommendations scans.
gpcc_system_history — saves system metrics sampled from VMware Greenplum segments hosts.
gpcc_table_info — saves current statistics and size information for tables.
gpcc_table_info_history — saves daily statistics and size information for tables.
gpcc_wlm_rule — saves workload management rules.
gpcc_wlm_log_history — saves the log history of workload management rule actions.

User-Defined Functions

gpcc_delete_department — deletes a department.
gpcc_queries_per_hour — returns a variety of details about query activity per hour.
gpcc_queries_per_user — returns, for each user, the number of queries whose runtime is longer than the input interval, per hour, in the specified time range.
gpcc_queries_per_user_max_and_total_spill_size — returns, for each user, the total spill_size and maximum spill_size per query per hour.
gpcc_queries_per_user_max_cpu — returns, for each user, the query with the maximum segment and coordinator cpu usage per hour, along with details about the query.
gpcc_queries_per_user_max_run_time — returns, for each user, the longest running query per hour, along with details about the query.
gpcc_queries_per_user_max_skew — returns, for each user, the query with the maximum amount of processing skew in the system (skew_cpu) per hour, along with details about the query.
gpcc_queries_per_user_rows_out — returns, for each user, the query with the maximum rows_out per hour, along with details about the query.
gpcc_queries_top_10_cpu_per_resgroup — returns by resource group, the top ten queries utilizing the most CPU.
gpcc_queries_top_10_memory_per_resgroup — returns by resource group, the top ten queries utilizing the most memory.
gpcc_recommendations — returns a list of tables in need of a vacuum-related action such as VACUUM, VACUUM FULL, ANALYZE, and VACUUM FREEZE, as well as tables in need of the REINDEX action.
gpcc_system_per_hour — returns a variety of system information.
gpcc_update_department — renames a department.
gpcc_reindex_check — reports whether a table’s B-tree indices needs reindexing.

If you set the schema search path to include the gpmetrics schema, you do not have to qualify table and user-defined function names with the gpmetrics schema name. To set the default search path for the gpperfmon database enter this SQL command.

=# ALTER DATABASE gpperfmon SET search_path TO public,gpmetrics;

You must exit the current session and start a new session for the new search path to take effect.

Alert Tables

Command Center uses the gpcc_alert_rule and gpcc_alert_log tables to store the alert rules you set up in the user interface and to log messages when the alert rules are triggered.

gpcc_alert_rule

Note: Deprecated in Command Center 6.4/4.12.

The gpcc_alert_rule table records the alert rules configured in the Command Center user interface. It has the columns shown in the following table.

Column	Type	Description
rule_id	integer	Unique id for the rule.
rule_type	integer	Reserved for future use.
rule_description	character varying(512)	Text of the rule.
rule_config	json	JSON string containing parameters for user-specified values.
ctime	timestamp(0) without time zone	Time the rule was created.
etime	timestamp(0) without time zone	Time the rule became inactive, or null if rule is active.

The gpcc_alert_rule table keeps a history of alert rule configurations. When a rule becomes active, a new row is inserted and the ctime timestamp column is set to the current time; the etime timestamp is null, indicating that the rule is still active. When a rule is either disabled or superceded by a new rule, the etime timestamp column is set to the current time. Thus, the set of currently active rules is all rows where the etime column is null. A row that has timestamps in both ctime and etime columns is an historical record of the period of time during which the rule was active.

The rule_id column, a unique integer, is the distribution key for the table and is used to identify a single alert rule configuration. This column can be joined with the rule_id column in the gpcc_alert_log table to identify the rule that triggered each recorded alert event.

The rule_description column contains a string that describes the event that matches the rule. It is the text displayed in the Command Center UI for the rule, with user-specified values inserted.

The rule_config column contains a JSON string with parameters for the values entered for each of the rule’s fields in the Command Center UI.

gpcc_alert_history

The gpcc_alert_history table (wasgpcc_alert_log before Command Center 6.4) has the following columns:

Column	Type	Description
id	integer	Unique ID for the alert.
rule_id	integer	The ID of the rule that triggered this alert.
transaction_time	timestamp(0) without time zone	Time the alert was raised.
content	json	Contains parameters specifying values that triggered the alert.
config	json	Stores alert history from emails or notifications.

Where:

The id column, a unique integer, is the distribution key for the table.
The transaction_time column is set to the current time when a row is created.
The rule_id column can be joined with the rule_id column in the gpcc_alert_rule table to access details of the rule that triggered the alert.
The content column contains a JSON string with parameters specifying details about the event that triggered the alert. The JSON parameters vary with the type of the alert.
The config column stores the alert history received from the notification center or from emails.

The gpcc_alert_history table is an append-only, column-oriented table, partitioned by month on the transaction_time column. Command Center creates new partitions as needed and removes partitions over 12 months old.

A row is added to the gpcc_alert_history table whenever an alert rule is matched.

VMware Greenplum Metrics History Tables

The gpmetrics query history saves information collected by the Greenplum Database metrics collection system and forwarded to Greenplum Command Center.

The distribution key for each table is a ctime timestamp column, which is the time when the row is added to the database. The tables are partitioned by year and month, except for gpcc_queries_history and gpcc_plannode_history which are partitioned by day for optimal performance. Greenplum Command Center creates new partitions automatically as needed.

The history tables use append-optimized, column-oriented storage.

Command Center only saves queries with runtimes greater than the value of the min_query_time configuration parameter, found in the $HOME/gpmetrics/gpcc.conf configuration file on the host executing Command Center. The default, 0, saves all queries in the history table. This parameter can be configured on the Command Center Admin> Settings page.

gpcc_database_history

The gpcc_database_history table saves summary query activity metrics collected by the VMware Greenplum metrics collector. This data can be used to review the VMware Greenplum query load over time.

Column	Type	Description
ctime	timestamp(0) without time zone	Time the record was created.
queries_total	integer	Total number of queries running and queued to run.
queries_running	integer	Number of queries currently running.
queries_queued	integer	Number of queries queued, but not yet running.
queries_blocked	integer	The number of queries started, but blocked by other transactions.
queries_finished	integer	The number of queries that completed since the previous sampling interval.

gpcc_disk_history

The gpcc_disk_history table saves historical disk usage statistics for each VMware Greenplum segment host file system.

Column	Type	Description
ctime	timestamp(0) without time zone	Time the row was created.
hostname	character varying(64)	Name of the segment host.
filesystem	text	Path to the segment’s data directory.
total_bytes	bigint	Total size of the file system storage in bytes.
bytes_used	bigint	Number of storage bytes in
bytes_available	bigint	Number of storage bytes available.

gpcc_export_log

The gpcc_export_log table saves the log notifications from every “EXPORT ALL” user action. Whenever the user exports search results from the History or the Table Browser pages, this table gets updated.

Column	Type	Modifiers
id	integer	not null default nextval(‘gpcc_export_log_id_seq’::regclass)
ctime	timestamp(0) with time zone	not null default now()
source	character varying(64)	not null
filename	character varying(64)	not null
role	name	not null
req_params	json	not null
etime	timestamp(0) with time zone
status	character varying(64)
fail_msg	text

gpcc_index_info

Command Center uses the gpcc_index_info table to store information about table indices’ size, particularly their bloat size. This information is used to report when reindexing is recommended for a table.

Column	Type	Description
dbid	OID	The id of the table’s database; this value is related to the `pg_database` OID.
database	varchar(64)	The table’s database.
schema	varchar(64)	The table’s schema.
table_oid	OID	The table’s OID.
relation_name	varchar(64)	The table’s name.
index_oid	OID	The OID of the table’s index.
index_name	varchar(64)	The name of the table’s index.
real_size	bigint	The actual size of the index.
extra_size	bigint	The estimated extra size not used or needed by the index. This extra size is composed of the fillfactor, bloat, and alignment padding spaces.
fillfactor	int	The fillfactor of the index.
index_bloat_space	bigint	The estimated size of the bloat without the extra space set aside for the fillfactor.
index_bloat_rate	bigint	The estimated percentage of the real size used by the index bloat space.
is_na	boolean	If `true`, then the index bloat recorded by that row is “not accurate” and that index will not be included in the `REINDEX` recommendation.
latest_scan_ts	timestamp with timezone	The time of the last scan.
last_reindexed	timestamp with timezone	The time of the last reindexing.

gpcc_pg_log_history

The gpcc_pg_log_history table stores pg_log errors and warnings from the Greenplum Server log files.

Column	Type	Description
logtime	timestamp without time zone	Timestamp for this log.
loguser	text	Name of the role executing the query.
logdatabase	text	The database accessed.
logpid	text	Process id.
logthread	text	Thread number.
loghost	text	Host name or IP address of the host.
logport	text	Port number.
logsessiontime	timestamp without time zone	Session timestamp.
logtransaction	integer	Transaction id.
logsession	text	Session id.
logcmdcount	text	Command count.
logsegment	text	Segment number.
logslice	text	Slice number.
logdistxact	text	Distributed transation id.
loglocalxact	text	Location transacton id.
logsubxact	text	Subtransaction id.
logseverity	text	Log severity.
logstate	text	SQL State code associated with this log message.
logmessage	text	Log or error message text.
logdetail	text	Detail message text associated with an error message.
loghint	text	Hint message text associated with an error message.
logquery	text	Internally-generated query text.
logquerypos	integer	Index into the internally-generated query text.
logcontext	text	Context in which this message gets generated.
logdebug	text	Query string with full detail for debugging.
logcursorpos	integer	Cursor index into the query string.
logfunction	text	Function in which this message is generated.
logfile	text	Log file in which this message is generated.
logline	integer	Line in the log file in which this message is generated.
logstack	text	Full text of the stack trace associated with this message.

gpcc_plannode_history

The gpcc_plannode_history table saves detailed metrics for each operation (node) in a completed query plan. Each row contains metrics for one operation that executed on one VMware Greenplum segment. This information allows reconstructing the plan and execution metrics for a completed query.

Plan node history is only saved for queries that execute for 10 seconds or more.

Column	Type	Description
ctime	timestamp(0) without time zone	Time the row was created.
tmid	integer	A time identifier for the query. All records associated with a query will have the same `tmid`.
ssid	integer	Session id for the database connection. All records associated with the query will have the same `ssid`.
ccnt	integer	Command number within the session. All records associated with the query will have the same `ccnt`.
segid	integer	Id (`dbid`) of the segment for this plan node.
procid	integer	The postgres process ID for this plan node.
sliceid	integer	DEPRECATED. I̶d̶ o̶f̶ t̶h̶e̶ s̶l̶i̶c̶e̶ t̶h̶e̶ o̶p̶e̶r̶a̶t̶i̶o̶n̶ b̶e̶l̶o̶n̶g̶s̶ t̶o̶.̶ O̶p̶e̶r̶a̶t̶i̶o̶n̶s̶ t̶h̶a̶t̶ b̶e̶l̶o̶n̶g̶ t̶o̶ t̶h̶e̶ s̶a̶m̶e̶ s̶l̶i̶c̶e̶ e̶x̶e̶c̶u̶t̶e̶ i̶n̶ p̶a̶r̶a̶l̶l̶e̶l̶.̶
nodeid	integer	The query plan node ID for this operation.
parent_nodeid	integer	The parent query plan node ID from the query plan.
node_type	character varying(64)	Name of the operation type.
tinit	timestamp(6) without time zone	Time the operation was initialized.
tstart	timestamp(6) without time zone	Time the operation started.
tfinish	timestamp(6) without time zone	Time the operation finished.
status	character varying(16)	Status of the operation: `initialize`, `executing`, or `finished`.
planrows	double precision	The number of output rows estimated for the operation.
planwidth	integer	Width of output rows estimated for the operation.
start_cost	double precision	Number of page reads expected to produce first output row.
total_cost	double precision	Number of page reads expected to produce all output rows.
tuple_count	bigint
ntuples	bigint
nloops	bigint
first_tuple	timestamp(6) without time zone	Time the operation produced the first output row.
rel_oid	oid	Object ID of the output rows produced by the operation.
relation_name	character varying(64)	Name of the table this operation processed, if any.
index_name	character varying(64)	Name of the index used by this operation, if any.
alias_name	character varying(64)	Alias for the relation declared in the SQL command.
node_seq	integer	Node sequence
condition	text	Condition expression used for a filter or join operation.

gpcc_queries_history

The gpcc_queries_history table saves metrics for completed queries.

Column	Type	Description
access_tables_info	integer	The table OID of the query accessed.
ctime	timestamp(0) without time zone	Time the row was created.
tmid	integer	A time identifier for the query. All records associated with the query will have the same `tmid`.
ssid	integer	Session id for the database connection. All records associated with the query will have the same `ssid`.
ccnt	integer	Command number within this session. All records associated with the query will have the same `ccnt`.
username	character varying(64)	Role name that issued the query.
db	character varying(64)	Name of the database queried.
cost	double precision	Estimated cost to execute query, computed by the legacy planner or GPORCA.
tsubmit	timestamp(6) without time zone	Time the query was submitted.
tstart	timestamp(6) without time zone	Time the query was started.
tfinish	timestamp(6) without time zone	Time the query finished.
status	character varying(16)	Status of the query – `abort`, `error`, or `done`.
rows_out	bigint	Number of rows returned by the query.
error_msg	text	Error message, if the query failed.
plan_gen	character varying(16)	`PLANNER` if query plan was generated by the legacy planner; `OPTIMIZER` if the plan was generated by GPORCA, the Greenplum query optimizer.
query_hash	character varying(64)	Hash code generated from the text of the query.
query_text	text	Complete text of the query. Some queries may be reformatted before storing in the history database.
application_name	character varying(64)	Name of the client application that established the database connection.
rsqname	character varying(64)	If the `gp_resource_manager` configuration parameter is `queue`, the name of the resource queue managing the query.
rsgname	character varying(64)	If the `gp_resource_manager` configuration parameter is `group`, the name of the resource group managing the query.
cpu_master	bigint	Total CPU usage for this query on the VMware Greenplum coordinator instance.
cpu_segs	bigint	Total CPU usage for this query across all segments, measured in seconds. This is the sum of the CPU usage values taken from all active primary segments in the database array.
cpu_master_percent	double precision	Average CPU percent usage on the coordinator host during execution of this query.
cpu_segs_percent	double precision	Average CPU percent usage on the segment hosts during the execution of this query.
skew_cpu	double precision	Displays the amount of processing skew in the system for this query. Processing/CPU skew occurs when one segment performs a disproportionate amount of processing for a query. The skew is calculated from total CPU seconds used on all segments during the execution of the query.
skew_rows	double precision	Displays the amount of row skew in the system. Row skew occurs when one segment produces a disproportionate number of rows for a query.
memory	bigint	Total size of memory, in kilobytes, used by all segments to execute this query.
disk_read_bytes	bigint	Number of bytes read from disk.
disk_write_bytes	bigint	Number of bytes written to disk.
spill_size	bigint	Total size, in bytes, of spill files used by all segments to execute this query.
rqpriority	character varying(16)	Priority setting for the resource queue managing this query. Blank if resource group management is enabled.
query_tag	text	A key-value pair describing a query.
slices_metrics	JSON	The cpu/memory/disk metrics for a slice.
peak_memory	bigint	Maximum memory usage across all segments during the execution of the query, measured in KB.
node_sliceid	json	A map of node ID to slice ID.
lock_seconds	seconds	The number of seconds the query has been blocked.

gpcc_resgroup_history

The gpcc_resgroup_history table saves the history of the resource consumption of each resource group on each segment.

Column	Type	Description
ctime	timestamp(0) without time zone	Time the row was created.
rsgname	name	Resource group name.
host_name	text	The host name of the segment.
groupid	oid	Object ID of the resource group.
concurrency_limit	integer	The concurrency (`CONCURRENCY`) value specified for the resource group.
num_running	integer	The number of transactions currently executing in the resource group.
num_queueing	integer	The number of currently queued transactions for the resource group.
cpu_max_percent	integer	The CPU limit (`CPU_MAX_PERCENT` value) configured for the resource group, or -1.
cpu_weight	integer	The scheduling priority (`CPU_WEIGHT` value) configured for the resource group.
memory_limit	integer	The memory limit (`MEMORY_LIMIT` value) configured for the resource group, in MB.
cpu_usage	numeric	The percentage of CPU resources used on the segment host.
memory_usage	numeric	The memory used on the segment host, in MB.

gpcc_scan_history

Column	Type	Description
ctime	timestamp(0) without time zone	Time the row was created.
scan_id	integer	Scan id for this scan. See `schedule_id` column in the `gpcc_schedule`.
status	integer	Scan completion status. 0=Running, 1=Success, 2=Failed, 3=Cancelled, 4=Expired.
tables_scanned	bigint	The number of tables that were scanned.
start_dt	timestamp(0) without time zone	Time the scan began.
end_dt	timestamp(0) without time zone	Time the scan ended.

gpcc_schedule

The gpcc_schedule table contains details for scheduled events, such as the table scan for Recommendations. Each row describes a schedule in a JSON value. The ctime and etime columns together specify the period of time during which the schedule is enabled.

Column	Type	Description
schedule_id	integer	Id number for this schedule.
description	character varying(512)	Description of schedule.
schedule_config	json	JSON string with schedule details.
ctime	timestamp(0) without time zone	Time the schedule was added and enabled.
etime	timestamp(0) without time zone	Time the schedule was disabled.

gpcc_system_history

The gpcc_system_history table saves historical system metrics for each VMware Greenplum host, including the coordinator, standby coordinator, and segment hosts. The metrics include information about memory, CPU, disk, and network utilitization.

Column	Type	Description
cpu_iowait	double precision	The percentage of CPU used waiting on IO requests.
ctime	timestamp(0) without time zone	Time the row was created.
hostname	character varying(64)	Segment or coordinator hostname associated with these system metrics.
mem_total	bigint	Total system memory in Bytes for this host.
mem_used	bigint	System memory used, in Bytes, for this host.
mem_cached	bigint	Actual memory used, in Bytes, for this host (not including the memory reserved for cache and buffers).
mem_buffered	bigint	Free actual memory, in Bytes, for this host (not including the memory reserved for cache and buffers).
swap_total	bigint	Total swap space in Bytes for this host.
swap_used	bigint	Swap space used, in Bytes, for this host.
swap_page_in	bigint	Number of swap pages in.
swap_page_out	bigint	Number of swap pages out.
cpu_user	double precision	Percentage of time CPU processes execute in user mode.
cpu_sys	double precision	Percentage of time CPU processes execute in system (kernel) mode.
cpu_idle	double precision	Percentage idle CPU.
load0	double precision	CPU one-minute load average.
load1	double precision	CPU five-minute load average.
load2	double precision	CPU fifteen-minute load average.
quantum	seconds	Interval between metrics collections.
disk_ro_rate	bigint	Disk read operations per second.
disk_wo_rate	bigint	Disk write operations per second.
disk_rb_rate	bigint	Bytes per second for disk read operations.
disk_wb_rate	bigint	Bytes per second for disk write operations.
min_free_kbytes	kilobyte	The minimum amount of free memory that the kernel’s memory manager keeps on a host machine.
net_rp_rate	bigint	Packets per second on the system network for read operations.
net_wp_rate	bigint	Packets per second on the system network for write operations.
net_rb_rate	bigint	Bytes per second on the system network for read operations.
net_wb_rate	bigint	Bytes per second on the system network for write operations.

gpcc_table_info

The gpcc_table_info table stores a current snapshot of statistics for tables. There is one row for each table and partition.

Column	Type	Description
ctime	timestamp without time zone	Time the row was created.
dbid	oid	Object ID of the database.
relid	oid	Object ID of the table.
paroid	oid	Object ID of a partition.
reltablespace	oid	Object ID of the table’s tablespace.
seq_scan	bigint	Number of sequential scans initiated on this table.
idx_scan	bigint	Number of index scans initiated on this table.
n_tup_ins	bigint	Number of rows inserted.
n_tup_del	bigint	Number of rows deleted.
n_tup_upd	bigint	Number of rows updated (includes hot updated rows).
n_tup_hot_upd	bigint	Number of rows HOT updated (no separate index update required).
last_seq_scan	timestamp with time zone	Time of the last sequential scan on this table. An `ANALYZE` on a table can cause a sequential scan and update the count in the `seq_scan` column. However, the time of that scan is not saved in this column.
last_idx_scan	timestamp with time zone	Time of the last index scan on this table.
last_ins	timestamp with time zone	Time of the last insert on this table.
last_del	timestamp with time zone	Time of the last delete on this table.
last_upd	timestamp with time zone	Time of the last update on this table.
last_analyze	timestamp with time zone	Time of the last analyze on this table.
last_vacuum	timestamp with time zone	Time of the last vacuum of this table (excludes vacuum full).
last_vacuum_full	timestamp with time zone	Time of the last vacuum full or last vacuum full freeze of this table.
size	bigint	Combined size of the table’s files on all segments. Note that for partitioned tables, the sizes of individual partitions are not included.
row_cnt	real	Number of rows in the table.
children	integer	Number of partitions, including middle-level and child partitions.
schema	name	Name of the schema this table belongs to.
table_name	name	Name of the table.
owner	name	Name of the database role that owns this table.
relstorage	character(1)	Storage mode of this table. a=append-optimized, c=column-oriented, h =heap, v = virtual, x= external table.
relkind	character(1)	The type of object: r = heap or append-optimized table.
bloat	real	Calculated bloat for the table when last scanned.
scan_size	bigint	Size of table when last scanned.
unused	real	Unused.
skew	real	Calculated skew for the table when last scanned.
last_scan_rowcnt	real	Number of rows in the table when last scanned.
last_scan_ts	timestamp with time zone	Time the table was last scanned for Recommendations.
accuracy	real	Accuracy ratio calculated from query history when the table was last scanned.
last_accuracy_ts	timestamp with time zone	Time the accuracy ratio was last set.
distributed_by	text	Distribution policy for the table.
age_percent	float	Tracks information for calculating when a table should have vacuum freeze run on it. It stores what percentage of the `xid_warn_limit` GUC the table’s age is. For example, if a table’s age is 70% of `xid_warn_limit`, its `age_percent` is stored as 0.7.
last_vacuum_freeze	timestamp with time zone	Records the last time that one of vacuum full, vacuum freeze or vacuum full freeze was performed on the table.
last_size_ts	timestamp with time zone	Time when the current table’s size column was last updated.

gpcc_table_info_history

The gpcc_table_info_history table stores a daily snapshot of statistics about tables. Command Center saves statistics from the gpcc_table_info table at 3:00 a.m. each morning. There is one row for each table per day.

Column	Type	Description
ctime	timestamp(0) without time zone	Time the record was created.
dbid	oid	Object ID of the database.
relid	oid	Object ID of the table.
paroid	oid	Object ID of a partition.
reltablespace	oid	Object ID of the table’s tablespace.
seq_scan	bigint	Number of sequential scans initiated on this table.
idx_scan	bigint	Number of index scans initiated on this table.
n_tup_ins	bigint	Number of rows inserted.
n_tup_del	bigint	Number of rows deleted.
n_tup_upd	bigint	Number of rows updated (includes hot updated rows).
n_tup_hot_upd	bigint	Number of rows HOT updated (no separate index update required).
last_seq_scan	timestamp with time zone	Time of the last sequential scan on this table.
last_idx_scan	timestamp with time zone	Time of the last index scan on this table.
last_ins	timestamp with time zone	Time of the last insert on this table.
last_del	timestamp with time zone	Time of the last delete on this table.
last_upd	timestamp with time zone	Time of the last update on this table.
last_analyze	timestamp with time zone	Time of the last analyze on this table.
last_vacuum	timestamp with time zone	Time of the last vacuum of this table (excludes vacuum full).
last_vacuum_full	timestamp with time zone	Time of the last vacuum full of this table.
size	bigint	Combined size of the table’s files on all segments. Note that for partitioned tables, the sizes of individual partitions are not included.
row_cnt	real	Number of rows in the table.
children	integer	Number of partitions, including middle-level and child partitions.
schema	name	Name of the schema this table belongs to.
table_name	name	Name of the table.
owner	name	Name of the database role that owns this table.
relstorage	character(1)	Storage mode of this table. a=append-optimized, c=column-oriented, h = heap, v = virtual, x= external table.
relkind	character(1)	The type of object: r = heap or append-optimized table.
bloat	real	Calculated bloat for the table when last scanned.
scan_size	bigint	Size of the table when last scanned.
unused	real
skew	real	Calculated skew for the table when last scanned.
last_scan_rowcnt	real	Number of rows in the table when last scanned.
last_scan_ts	timestamp with time zone	Time the table was last scanned for Recommendations.
accuracy	real	Accuracy ratio calculated from query history when the table was last scanned.
last_accuracy_ts	timestamp with time zone	Time the accuracy ratio was last set.
distributed_by	text	Distribution policy for the table.
age_percent	float	Tracks information for calculating when a table should have `VACUUM FREEZE` run on it. It stores what percentage of the `xid_warn_limit` GUC the table’s age is. For example, if a table’s age is 70% of `xid_warn_limit`, its `age_percent` is stored as 0.7
last_vacuum_freeze	timestamp with time zone	Records the last time that one of VACUUM FULL, VACUUM FREEZE or VACUUM FULL FREEZE was performed on the table.
last_size_ts	timestamp with time zone	Time when the current table’s size column was last updated.

Real-Time Monitoring Tables

gpcc_queries_now

The gpcc_queries_now table saves real-time query metrics data.

Column	Type	Description
tmid	integer	A part of the query ID.
ssid	integer	Session ID of the query.
ccnt	integer	A part of the query ID.
username	character varying(64)	User name.
db	character varying(64)	Database.
status	character varying(16)	Query status.
rsgname	character varying(64)	Resource group name.
rsqname	character varying(64)	Resource queue name.
rsqpriority	character varying(16)	Resource queue priority.
tsubmit	timestamp(0) without time zone	Submit time.
tstart	timestamp(0) without time zone	Start time.
cpu_time	double precision	CPU time accumulated (unit: seconds).
cpu_master	double precision	CPU time on coordinator.
cpu_segment	double precision	CPU time on all segemnts.
cpu_master_percent	double precision	CPU percent on coordinator.
cpu_segment_percent	double precision	CPU percent on all segments.
spill_size	bigint	Spill file size (bytes)
memory	bigint	Memory (KB)
disk_read_rate	double precision	Current disk read rate (KB/s).
disk_write_rate	double precision	Current disk write rate (KB/s).
disk_read_bytes	bigint	Current disk read bytes (bytes).
disk_write_bytes	bigint	Current disk write bytes (bytes).
skew_cpu	double precision	CPU skew.
plan_gen	character varying(16)	Plan generator for the query.
cost	double precision	Query cost.
query_text	text	Query text.

Workload Management Tables

gpcc_wlm_rule

The gpcc_wlm_rule table stores the definition of workload management rules (both assignment rules and workload rules).

Column	Type	Description
serial_number	integer	Serial number of the rule.
rule_id	integer	Unique ID of the rule shown in the Command Center interface.
rsgname	name	Resource group name used to assign rule.
role	name	Database role name used to assign rule.
query_tag	text	Query tag string used to assign rule.
dest_rsg	name	Destination resource group name for moving a query.
cpu_time	integer	Maximum CPU time consumed by the query.
running_time	integer	Maximum running time for the query.
disk_io_mb	integer	Maximum total disk I/O used by the query.
planner_cost	float8	Maximum Postgres Planner cost for the query.
orca_cost	float8	Maximum GPORCA optimizer ost for the query.
slice_num	integer	Maximum number of slices assigned for executing the query.
action	integer	Action to performed when conditions are met.
active	boolean	Whether the rule is active or inactive.
ctime	timestamp without time zone	Time the rule was created.
etime	timestamp without time zone	Time that the rule was deleted.
idle session	json	JSON string containing parameters for idle session kill rules.
spill_file_mb	int	Spill file size, in MB.
cpuskew_percent	int	If the CPU skew percent of a query is equal to or higher than `cpuskew_percent` during `cpuskew_duration_sec` seconds, Command Center executes the rule.
cpuskew_duration_sec	int	If the CPU skew percent of a query is equal to or higher than `cpuskew_percent` during `cpuskew_duration_sec` seconds, Command Center executes the rule.
order_id	NOT NULL	The order id of the rule.

gpcc_wlm_log_history

The gpcc_wlm_log_history table stores the log history of workload rule actions that are triggered. Note that log history for assignment rules is not collected.

Column	Type	Description
ctime	timestamp without time zone	Time the log entry was created.
tstart	timestamp without time zone	Time the operation started.
tfinish	timestamp without time zone	Time the operation finished.
rule_serial_number	integer	Serial number of the rule.
rule_id	integer	Unique ID of the rule shown in the Command Center interface.
tmid	integer	A time identifier for the query. All records associated with a query will have the same `tmid`.
ssid	integer	Session id for the database connection. All records associated with the query will have the same `ssid`.
ccnt	integer	Command number within the session. All records associated with the query will have the same `ccnt`.
action	integer	Workload rule action that was initiated.
rsgname	name	Resource group name used to assign rule.
role	name	Database role name used to assign rule.
status	text	Indication of whether the action succeeded or failed.
fail_msg	text	Message associated with a failed action.

Data Loading Tables

Command Center uses the gpcc_gpss_job and gpcc_gpss_log tables to store information related to data loading.

gpcc_gpss_job

Column	Type	Description
job_id	varchar(32)	The job ID.
job_name	text	The job name.
job_type	character varying(20)	The job type. Currently, there is just one job type: s3.
creator	character varying(64)	The creator of the job.
yaml_content	text	YAML description of the job.
is_active	boolean	Indicates whether or not the job has been removed. Set to `true` when the job is submitted or resubmitted. `false` means the job has been removed.

gpcc_gpss_log

Column	Type	Description
ctime	timestamp without time zone.	The time when Command Center received the log.
gpss_time	timestamp(0)	Greenplum Streaming Server time returned from Greenplum Streaming Server gRPC call.
job_id	varchar(20)	The id of the data loading job.
job_name_at_the_time	text	The name of the job at the time Command Center received the log message.
log_type	gpmetrics.gpss_log_type	Specifies whether the log is received from GPSS gRPC streaming calls (marked as ‘event’) or from GPCC API calls (marked as ‘operation’).
ctx	integer	Context of event(job status changed to: UNSPECIFIED=0/STOPPED=1/RUNNING=2/ERROR=3/REMOVED=4/SUCCESS=5/SUBMITTED=6), or context of operation(start=0/stop=1/remove=2/submit=3)
operation_status	gpmetrics.gpss_operation_status	The status of operations (‘success’/‘fail’ for log_type=operation, ‘not_applicable’ for other types).
msg	text	Any messages returned from Greenplum Streaming Server.
log_id	integer	The id of the log. Increments automatically. When sorting logs, Command Center uses this value to break ties between logs with the same `ctime`, `log_type`, and `job_name_at_the_time`.

Other Tables

Command Center uses the gpcc_department and gpcc_role_department tables to store information about departments and roles.

gpcc_department

Column	Type	Description
dept_id	integer	Department ID.
dept_name	character varying(64)	Department name. Limited to 64 characters, can only contain [A-Z][a-z][0-9][_].
dept_status	smallint	1 or 0, where 1 indicates active and 0 indicates deleted.

### gpcc_role_department

Column	Type	Description
rolname	name	Role name.
dept_id	integer	Department ID.

User-Defined Functions

The gpmetrics schema includes a number of user-defined functions that provide users with information about queries and about the hosts on which the queries run. You can find the source code for all of these functions in the $GPCC_HOME/sqls/ directory.

In addition, the public schema includes a user-defined function to return recommendations about whether a table should be reindexed. See this section for details.

gpcc_delete_department

The gpcc_delete_department user-defined function deletes a department name from the gpperfmon database. It takes one parameter: the name of the department you want to delete. You may only delete department names that have no roles associated with them.

gpcc_queries_per_hour

The gpcc_queries_per_hour user-defined function returns the average and maximum number of total queries, running queries, queued queries, blocked queries, and finished queries per hour.

Here are some examples:

select * from gpmetrics.gpcc_queries_per_hour();
select * from gpmetrics.gpcc_queries_per_hour('2021-06-07')
select * from gpmetrics.gpcc_queries_per_hour('2021-06-09 00:00:00', '2021-06-09 13:20:05')

The first query returns results for the interval between 00:00:00 on the current date and the current time.

The second query returns results for the interval between 00:00:00 on ‘2021-06-07’ and the current time.

The third query returns results for the interval between 00:00:00 on 2021-06-09 and 13:20:05 on 2021-06-09.

gpcc_queries_per_user

The gpcc_queries_per_user user-defined function returns, for each user, the number of queries whose runtime is longer than the input interval, per hour, in the specified time range. The function takes an interval, a start time, and an end time. The default start time is 00:00:00 today. The default end time is the current time. The interval can not be omitted. It accepts PostgreSQL’s INTERVAL value.

Here are some examples:

select * from gpmetrics.gpcc_queries_per_user('5min','2018-05-01','2018-06-01');
select * from gpmetrics.gpcc_queries_per_user('5min');
select * from gpmetrics.gpcc_queries_per_user('5min','2018-05-01');

The first query returns, by user, all queries running longer than 5 minutes for the interval between 00:00:00 on 05/01/2018 and 23:59:59 on 06/01/2018.

The second query returns, by user, all queries running longer than 5 minutes in the interval between 00:00:00 today and the current time.

The third query returns, by user, all queries longer than 5 minutes in the interval between 00:00:00 on 2018-05-01 and the current time.

gpcc_queries_per_user_max_and_total_spill_size

The gpcc_queries_per_user__max_and_total_spill_size user-defined function returns, for each user, the total spill_size and maximum spill_size per query per hour. The function takes a start time and an end time. The default start time is 00:00:00 today. The default end time is the current time.

Here are some examples:

select * from gpmetrics.gpcc_queries_per_user_max_and_total_spill_size();
select * from gpmetrics.gpcc_queries_per_user_max_and_total_spill_size('2018-05-20','2018-05-21');
select * from gpmetrics.gpcc_queries_per_user_max_and_total_spill_size('2018-05-20');

The first query returns, by user, the total spill_size and maximum spill_size per query per hour for the interval between 00:00:00 on the current date and the current time.

The second query returns, by user, the total spill_size and maximum spill_size per query per hour for the interval between 00:00:00 on 05/20/2018 and 23:59:59 on 05/21/2018.

The third query returns, by user, the total spill_size and maximum spill_size per query per hour for the interval between 00:00:00 on ‘2018-05-20’ and the current time.

gpcc_queries_per_user_max_cpu

The gpcc_queries_per_user_max_cpu user-defined function returns, for each user, the query with the maximum segment and coordinator cpu usage per hour, along with details about the query. The function takes a start time and an end time. The default start time is 00:00:00 today. The default end time is the current time.

Here are some examples:

select * from gpmetrics.gpcc_queries_per_user_max_cpu();
select * from gpmetrics.gpcc_queries_per_user_max_cpu('2018-05-20','2018-05-21');
select * from gpmetrics.gpcc_queries_per_user_max_cpu('2018-05-20');

The first query returns, by user, the query with the maximum segment and coordinator cpu usage per hour for the interval between 00:00:00 on the current date and the current time.

The second query returns, by user, the query with the maximum segment and coordinator cpu usage per hour for the interval between 00:00:00 on 05/20/2018 and 23:59:59 on 05/21/2018.

The third query returns, by user, the query with the maximum segment and coordinator cpu usage per hour for the interval between 00:00:00 on ‘2018-05-20’ and the current time.

gpcc_queries_per_user_max_run_time

The gpcc_queries_per_user_max_run_time user-defined function returns, for each user, the longest running query per hour, along with details about the query. The function takes a start time and an end time. The default start time is 00:00:00 today. The default end time is the current time.

Here are some examples:

select * from gpmetrics.gpcc_queries_per_user_max_run_time();
select * from gpmetrics.gpcc_queries_per_user_max_run_time('2018-05-20','2018-05-21');
select * from gpmetrics.gpcc_queries_per_user_max_run_time('2018-05-20');

The first query returns, by user, the longest running query per hour for the interval between 00:00:00 on the current date and the current time.

The second query returns, by user, the longest running query per hour for the interval between 00:00:00 on 05/20/2018 and 23:59:59 on 05/21/2018.

The third query returns, by user, the longest running query per hour for the interval between 00:00:00 on ‘2018-05-20’ and the current time.

gpcc_queries_per_user_max_skew

The gpcc_queries_per_user_max_skew user-defined function returns, for each user, the query with the maximum amount of processing skew in the system (skew_cpu) per hour, along with details about the query. The function takes a start time and an end time. The default start time is 00:00:00 today. The default end time is the current time.

Here are some examples:

select * from gpmetrics.gpcc_queries_per_user_max_skew();
select * from gpmetrics.gpcc_queries_per_user_max_skew('2018-05-20','2018-05-21');
select * from gpmetrics.gpcc_queries_per_user_max_skew('2018-05-20');

The first query returns, by user, the query with the maximum amount of processing skew in the system for the interval between 00:00:00 on the current date and the current time.

The second query returns, by user, the query with the maximum amount of processing skew in the system for the interval between 00:00:00 on 05/20/2018 and 23:59:59 on 05/21/2018.

The third query returns, by user, the query with the maximum amount of processing skew in the system for the interval between 00:00:00 on ‘2018-05-20’ and the current time.

gpcc_queries_per_user_rows_out

The gpcc_queries_per_user_rows_out user-defined function returns, for each user, the query with the maximum rows_out per hour, along with details about the query. The function takes a start time and an end time. The default start time is 00:00:00 today. The default end time is the current time.

Here are some examples:

select * from gpmetrics.gpcc_queries_per_user_rows_out();
select * from gpmetrics.gpcc_queries_per_user_rows_out('2018-05-20','2018-05-21');
select * from gpmetrics.gpcc_queries_per_user_rows_out('2018-05-20');

The first query returns, by user, the query with the maximum rows_out per hour for the interval between 00:00:00 on the current date and the current time.

The second query returns, by user, the query with the maximum rows_out per hour for the interval between 00:00:00 05/20/2018 and 23:59:59 05/21/2018.

The third query returns, by user, the query with the maximum rows_out per hour for the interval between 00:00:00 on ‘2018-05-20’ and the current time.

gpcc_queries_top_10_cpu_per_resgroup

The gpcc_queries_top_10_cpu_per_resgroup user-defined function retrieves, by resource group, the top 10 queries utilizing the most CPU over the last 1 hour, 1 day, 1 week, and 1 month.

Here are some examples:

select * from gpcc_queries_top_10_cpu_per_resgroup();
select * from gpcc_queries_top_10_cpu_per_resgroup('1 day');
select * from gpcc_queries_top_10_cpu_per_resgroup('1 week');
select * from gpcc_queries_top_10_cpu_per_resgroup('1 month');

gpcc_queries_top_10_memory_per_resgroup

The gpcc_queries_top_10_memory_per_resgroup user-defined function retrieves, by resource group, the top 10 queries utilizing the most memory over the last 1 hour, 1 day, 1 week, and 1 month.

Here are some examples:

select * from gpcc_queries_top_10_memory_per_resgroup();
select * from gpcc_queries_top_10_memory_per_resgroup('1 day');
select * from gpcc_queries_top_10_memory_per_resgroup('1 week');
select * from gpcc_queries_top_10_memory_per_resgroup('1 month');

gpcc_recommendations

The gpcc_recommendations user-defined function returns a list of tables in need of a vacuum-related action such as VACUUM, VACUUM FULL, ANALYZE, SKEW, and VACUUM FREEZE, as well as a list of tables whose B-tree indices are in need of a REINDEX action. You may call this function directly from the command line, from within an SQL program, or using a psql utility.

Here are some examples:

gpmetrics.gpcc_recommendations('vacuum-full');
gpmetrics.gpcc_recommendations('vacuum-freeze');
gpmetrics.gpcc_recommendations('analyze');
gpmetrics.gpcc_recommendations('skew');
gpmetrics.gpcc_recommendations('reindex');

If you set custom values for the accuracy_threshold, bloat_unused_threshold, or bloat_dead_threshold properties in the gpcc.conf file, you should define any values you’ve set via the PGOPTIONS option when calling this user defined function, as in this example: PGOPTIONS="-c accuracy.threshold=0.2 -c bloat.vacuum.full.threshold=0.3 -c bloat.threshold=0.6" psql -d gpperfmon -c "select * from gpmetrics.gpcc_recommendations('vacuum');

Note
gpcc_recommendations(reindex) only reports tables that need reindexing for tables in the pg_catalog schema.

gpcc_system_per_hour

The gpcc_system_per_hour user-defined function returns a variety of system information, including: the average cpu_user and cpu_sys for all hosts, aggregated by hour; the average load0, load1, and load2 per hour for all hosts, aggregated by hour; and the average disk write and disk read bytes for all hosts, aggregated by hour. The function takes a start time and and time. The default start time is 00:00:00 today. The default end time is the current time.

This example returns the system information summarized in the previous paragraph for the interval between 00:00:00 08/30/2021 and 23:59:59 08/31/2021.

gpperfmon=# select * from gpmetrics.gpcc_system_per_hour('2021-08-30','2021-08-31');

gpcc_update_department

The gpcc_update_department user-defined function renames a department name. It takes two parameters: from_name – which specifies the old name – and to_name – which specifies the new name. Once it has run, all the roles associated with the old name will now be associated with the new name.

gpcc_reindex_check

Note
This user-defined function is located in the public schema of your database, rather than in the gpmetrics schema.

The gpcc_reindex_check user-defined function reports whether the B-tree indices of the table whose name and schema are passed in should be reindexed.

In order to use this function, you must first create it in your own database, using the following commands.

cd $GPCC_HOME/sqls/aux_udf/
psql -f gpcc_reindex_check.sql <dbname>

The above commands create the function in the public schema in the database specified in .

Afterwards, run the ANALYZE command on tables as necessary to generate statistics. You can then determine if reindexing is needed on a specific table by executing the function:

psql> select gpcc_reindex_check('schema_name','table_name');