Greenplum Command Center creates the gpmetrics
schema in the Vmware Greenplum gpperfmon to save alert rules and logs, and historical metrics collected by the VMware Greenplum metrics collection system. The gpmetrics
schema contains the following tables and user-defined functions:
Tables
gpcc_alert_rule
— saves alert rules configured on the Command Center Admin> Alerts page.gpcc_alert_history
— records an event when an alert rule is triggered.gpcc_database_history
— saves summary query activity information.gpcc_department
— saves department information.gpcc_disk_history
— saves disk usage statistics for each VMware Greenplum host file system.gpcc_export_log
— saves the log notifications from every “EXPORT ALL” user action.gpcc_gpss_job
— saves information about data loading jobs.gpcc_gpss_log
— saves logging-related information about data loading jobs.gpcc_index_info
— saves information about table indexes, particularly their bloat size.gpcc_pg_log_history
— saves history from the VMware Greenplum pg_log
log file.gpcc_plannode_history
— saves plan node execution statistics for completed queries.gpcc_queries_history
— saves execution metrics for completed VMware Greenplum queries.gpcc_queries_now
— saves real-time query metrics data.gpcc_role_department
— saves information about roles within a department.gpcc_resgroup_history
— saves the history of the resource consumption of each resource group on each segment.gpcc_scan_history
— saves history for Recommendations scans.gpcc_schedule
— saves schedule for Recommendations scans.gpcc_system_history
— saves system metrics sampled from VMware Greenplum segments hosts.gpcc_table_info
— saves current statistics and size information for tables.gpcc_table_info_history
— saves daily statistics and size information for tables.gpcc_wlm_rule
— saves workload management rules.gpcc_wlm_log_history
— saves the log history of workload management rule actions.User-Defined Functions
gpcc_delete_department
— deletes a department.gpcc_queries_per_hour
— returns a variety of details about query activity per hour.gpcc_queries_per_user
— returns, for each user, the number of queries whose runtime is longer than the input interval, per hour, in the specified time range.gpcc_queries_per_user_max_and_total_spill_size
— returns, for each user, the total spill_size
and maximum spill_size
per query per hour.gpcc_queries_per_user_max_cpu
— returns, for each user, the query with the maximum segment and coordinator cpu usage per hour, along with details about the query.gpcc_queries_per_user_max_run_time
— returns, for each user, the longest running query per hour, along with details about the query.gpcc_queries_per_user_max_skew
— returns, for each user, the query with the maximum amount of processing skew in the system (skew_cpu
) per hour, along with details about the query.gpcc_queries_per_user_rows_out
— returns, for each user, the query with the maximum rows_out
per hour, along with details about the query.gpcc_queries_top_10_cpu_per_resgroup
— returns by resource group, the top ten queries utilizing the most CPU.gpcc_queries_top_10_memory_per_resgroup
— returns by resource group, the top ten queries utilizing the most memory.gpcc_recommendations
— returns a list of tables in need of a vacuum-related action such as VACUUM
, VACUUM FULL
, ANALYZE
, and VACUUM FREEZE
, as well as tables in need of the REINDEX
action.gpcc_system_per_hour
— returns a variety of system information.gpcc_update_department
— renames a department.gpcc_reindex_check
— reports whether a table’s B-tree indices needs reindexing.If you set the schema search path to include the gpmetrics
schema, you do not have to qualify table and user-defined function names with the gpmetrics
schema name. To set the default search path for the gpperfmon database enter this SQL command.
=# ALTER DATABASE gpperfmon SET search_path TO public,gpmetrics;
You must exit the current session and start a new session for the new search path to take effect.
Command Center uses the gpcc_alert_rule
and gpcc_alert_log
tables to store the alert rules you set up in the user interface and to log messages when the alert rules are triggered.
Note: Deprecated in Command Center 6.4/4.12.
The gpcc_alert_rule
table records the alert rules configured in the Command Center user interface. It has the columns shown in the following table.
Column | Type | Description |
---|---|---|
rule_id | integer | Unique id for the rule. |
rule_type | integer | Reserved for future use. |
rule_description | character varying(512) | Text of the rule. |
rule_config | json | JSON string containing parameters for user-specified values. |
ctime | timestamp(0) without time zone | Time the rule was created. |
etime | timestamp(0) without time zone | Time the rule became inactive, or null if rule is active. |
The gpcc_alert_rule
table keeps a history of alert rule configurations. When a rule becomes active, a new row is inserted and the ctime
timestamp column is set to the current time; the etime
timestamp is null, indicating that the rule is still active. When a rule is either disabled or superceded by a new rule, the etime
timestamp column is set to the current time. Thus, the set of currently active rules is all rows where the etime
column is null. A row that has timestamps in both ctime
and etime
columns is an historical record of the period of time during which the rule was active.
The rule_id
column, a unique integer, is the distribution key for the table and is used to identify a single alert rule configuration. This column can be joined with the rule_id
column in the gpcc_alert_log
table to identify the rule that triggered each recorded alert event.
The rule_description
column contains a string that describes the event that matches the rule. It is the text displayed in the Command Center UI for the rule, with user-specified values inserted.
The rule_config
column contains a JSON string with parameters for the values entered for each of the rule’s fields in the Command Center UI.
The gpcc_alert_history
table (wasgpcc_alert_log
before Command Center 6.4) has the following columns:
Column | Type | Description |
---|---|---|
id | integer | Unique ID for the alert. |
rule_id | integer | The ID of the rule that triggered this alert. |
transaction_time | timestamp(0) without time zone | Time the alert was raised. |
content | json | Contains parameters specifying values that triggered the alert. |
config | json | Stores alert history from emails or notifications. |
Where:
id
column, a unique integer, is the distribution key for the table.transaction_time
column is set to the current time when a row is created.rule_id
column can be joined with the rule_id
column in the gpcc_alert_rule
table to access details of the rule that triggered the alert.content
column contains a JSON string with parameters specifying details about the event that triggered the alert. The JSON parameters vary with the type of the alert.config
column stores the alert history received from the notification center or from emails.The gpcc_alert_history
table is an append-only, column-oriented table, partitioned by month on the transaction_time
column. Command Center creates new partitions as needed and removes partitions over 12 months old.
A row is added to the gpcc_alert_history
table whenever an alert rule is matched.
The gpmetrics
query history saves information collected by the Greenplum Database metrics collection system and forwarded to Greenplum Command Center.
The distribution key for each table is a ctime
timestamp column, which is the time when the row is added to the database. The tables are partitioned by year and month, except for gpcc_queries_history
and gpcc_plannode_history
which are partitioned by day for optimal performance. Greenplum Command Center creates new partitions automatically as needed.
The history tables use append-optimized, column-oriented storage.
Command Center only saves queries with runtimes greater than the value of the min_query_time
configuration parameter, found in the $HOME/gpmetrics/gpcc.conf
configuration file on the host executing Command Center. The default, 0, saves all queries in the history table. This parameter can be configured on the Command Center Admin> Settings page.
The gpcc_database_history
table saves summary query activity metrics collected by the VMware Greenplum metrics collector. This data can be used to review the VMware Greenplum query load over time.
Column | Type | Description |
---|---|---|
ctime | timestamp(0) without time zone | Time the record was created. |
queries_total | integer | Total number of queries running and queued to run. |
queries_running | integer | Number of queries currently running. |
queries_queued | integer | Number of queries queued, but not yet running. |
queries_blocked | integer | The number of queries started, but blocked by other transactions. |
queries_finished | integer | The number of queries that completed since the previous sampling interval. |
The gpcc_disk_history
table saves historical disk usage statistics for each VMware Greenplum segment host file system.
Column | Type | Description |
---|---|---|
ctime | timestamp(0) without time zone | Time the row was created. |
hostname | character varying(64) | Name of the segment host. |
filesystem | text | Path to the segment’s data directory. |
total_bytes | bigint | Total size of the file system storage in bytes. |
bytes_used | bigint | Number of storage bytes in |
bytes_available | bigint | Number of storage bytes available. |
The gpcc_export_log
table saves the log notifications from every “EXPORT ALL” user action. Whenever the user exports search results from the History or the Table Browser pages, this table gets updated.
Column | Type | Modifiers |
---|---|---|
id | integer | not null default nextval(‘gpcc_export_log_id_seq’::regclass) |
ctime | timestamp(0) with time zone | not null default now() |
source | character varying(64) | not null |
filename | character varying(64) | not null |
role | name | not null |
req_params | json | not null |
etime | timestamp(0) with time zone | |
status | character varying(64) | |
fail_msg | text |
Command Center uses the gpcc_index_info
table to store information about table indices’ size, particularly their bloat size. This information is used to report when reindexing is recommended for a table.
Column | Type | Description |
---|---|---|
dbid | OID | The id of the table’s database; this value is related to the pg_database OID. |
database | varchar(64) | The table’s database. |
schema | varchar(64) | The table’s schema. |
table_oid | OID | The table’s OID. |
relation_name | varchar(64) | The table’s name. |
index_oid | OID | The OID of the table’s index. |
index_name | varchar(64) | The name of the table’s index. |
real_size | bigint | The actual size of the index. |
extra_size | bigint | The estimated extra size not used or needed by the index. This extra size is composed of the fillfactor, bloat, and alignment padding spaces. |
fillfactor | int | The fillfactor of the index. |
index_bloat_space | bigint | The estimated size of the bloat without the extra space set aside for the fillfactor. |
index_bloat_rate | bigint | The estimated percentage of the real size used by the index bloat space. |
is_na | boolean | If true , then the index bloat recorded by that row is “not accurate” and that index will not be included in the REINDEX recommendation. |
latest_scan_ts | timestamp with timezone | The time of the last scan. |
last_reindexed | timestamp with timezone | The time of the last reindexing. |
The gpcc_pg_log_history
table stores pg_log
errors and warnings from the Greenplum Server log files.
Column | Type | Description |
---|---|---|
logtime | timestamp without time zone | Timestamp for this log. |
loguser | text | Name of the role executing the query. |
logdatabase | text | The database accessed. |
logpid | text | Process id. |
logthread | text | Thread number. |
loghost | text | Host name or IP address of the host. |
logport | text | Port number. |
logsessiontime | timestamp without time zone | Session timestamp. |
logtransaction | integer | Transaction id. |
logsession | text | Session id. |
logcmdcount | text | Command count. |
logsegment | text | Segment number. |
logslice | text | Slice number. |
logdistxact | text | Distributed transation id. |
loglocalxact | text | Location transacton id. |
logsubxact | text | Subtransaction id. |
logseverity | text | Log severity. |
logstate | text | SQL State code associated with this log message. |
logmessage | text | Log or error message text. |
logdetail | text | Detail message text associated with an error message. |
loghint | text | Hint message text associated with an error message. |
logquery | text | Internally-generated query text. |
logquerypos | integer | Index into the internally-generated query text. |
logcontext | text | Context in which this message gets generated. |
logdebug | text | Query string with full detail for debugging. |
logcursorpos | integer | Cursor index into the query string. |
logfunction | text | Function in which this message is generated. |
logfile | text | Log file in which this message is generated. |
logline | integer | Line in the log file in which this message is generated. |
logstack | text | Full text of the stack trace associated with this message. |
The gpcc_plannode_history
table saves detailed metrics for each operation (node) in a completed query plan. Each row contains metrics for one operation that executed on one VMware Greenplum segment. This information allows reconstructing the plan and execution metrics for a completed query.
Plan node history is only saved for queries that execute for 10 seconds or more.
Column | Type | Description |
---|---|---|
ctime | timestamp(0) without time zone | Time the row was created. |
tmid | integer | A time identifier for the query. All records associated with a query will have the same tmid . |
ssid | integer | Session id for the database connection. All records associated with the query will have the same ssid . |
ccnt | integer | Command number within the session. All records associated with the query will have the same ccnt . |
segid | integer | Id (dbid ) of the segment for this plan node. |
procid | integer | The postgres process ID for this plan node. |
sliceid | integer | DEPRECATED. I̶d̶ o̶f̶ t̶h̶e̶ s̶l̶i̶c̶e̶ t̶h̶e̶ o̶p̶e̶r̶a̶t̶i̶o̶n̶ b̶e̶l̶o̶n̶g̶s̶ t̶o̶.̶ O̶p̶e̶r̶a̶t̶i̶o̶n̶s̶ t̶h̶a̶t̶ b̶e̶l̶o̶n̶g̶ t̶o̶ t̶h̶e̶ s̶a̶m̶e̶ s̶l̶i̶c̶e̶ e̶x̶e̶c̶u̶t̶e̶ i̶n̶ p̶a̶r̶a̶l̶l̶e̶l̶.̶ |
nodeid | integer | The query plan node ID for this operation. |
parent_nodeid | integer | The parent query plan node ID from the query plan. |
node_type | character varying(64) | Name of the operation type. |
tinit | timestamp(6) without time zone | Time the operation was initialized. |
tstart | timestamp(6) without time zone | Time the operation started. |
tfinish | timestamp(6) without time zone | Time the operation finished. |
status | character varying(16) | Status of the operation: initialize , executing , or finished . |
planrows | double precision | The number of output rows estimated for the operation. |
planwidth | integer | Width of output rows estimated for the operation. |
start_cost | double precision | Number of page reads expected to produce first output row. |
total_cost | double precision | Number of page reads expected to produce all output rows. |
tuple_count | bigint | |
ntuples | bigint | |
nloops | bigint | |
first_tuple | timestamp(6) without time zone | Time the operation produced the first output row. |
rel_oid | oid | Object ID of the output rows produced by the operation. |
relation_name | character varying(64) | Name of the table this operation processed, if any. |
index_name | character varying(64) | Name of the index used by this operation, if any. |
alias_name | character varying(64) | Alias for the relation declared in the SQL command. |
node_seq | integer | Node sequence |
condition | text | Condition expression used for a filter or join operation. |
The gpcc_queries_history
table saves metrics for completed queries.
Column | Type | Description |
---|---|---|
access_tables_info | integer | The table OID of the query accessed. |
ctime | timestamp(0) without time zone | Time the row was created. |
tmid | integer | A time identifier for the query. All records associated with the query will have the same tmid . |
ssid | integer | Session id for the database connection. All records associated with the query will have the same ssid . |
ccnt | integer | Command number within this session. All records associated with the query will have the same ccnt . |
username | character varying(64) | Role name that issued the query. |
db | character varying(64) | Name of the database queried. |
cost | double precision | Estimated cost to execute query, computed by the legacy planner or GPORCA. |
tsubmit | timestamp(6) without time zone | Time the query was submitted. |
tstart | timestamp(6) without time zone | Time the query was started. |
tfinish | timestamp(6) without time zone | Time the query finished. |
status | character varying(16) | Status of the query – abort , error , or done . |
rows_out | bigint | Number of rows returned by the query. |
error_msg | text | Error message, if the query failed. |
plan_gen | character varying(16) | PLANNER if query plan was generated by the legacy planner; OPTIMIZER if the plan was generated by GPORCA, the Greenplum query optimizer. |
query_hash | character varying(64) | Hash code generated from the text of the query. |
query_text | text | Complete text of the query. Some queries may be reformatted before storing in the history database. |
application_name | character varying(64) | Name of the client application that established the database connection. |
rsqname | character varying(64) | If the gp_resource_manager configuration parameter is queue , the name of the resource queue managing the query. |
rsgname | character varying(64) | If the gp_resource_manager configuration parameter is group , the name of the resource group managing the query. |
cpu_master | bigint | Total CPU usage for this query on the VMware Greenplum coordinator instance. |
cpu_segs | bigint | Total CPU usage for this query across all segments, measured in seconds. This is the sum of the CPU usage values taken from all active primary segments in the database array. |
cpu_master_percent | double precision | Average CPU percent usage on the coordinator host during execution of this query. |
cpu_segs_percent | double precision | Average CPU percent usage on the segment hosts during the execution of this query. |
skew_cpu | double precision | Displays the amount of processing skew in the system for this query. Processing/CPU skew occurs when one segment performs a disproportionate amount of processing for a query. The skew is calculated from total CPU seconds used on all segments during the execution of the query. |
skew_rows | double precision | Displays the amount of row skew in the system. Row skew occurs when one segment produces a disproportionate number of rows for a query. |
memory | bigint | Total size of memory, in kilobytes, used by all segments to execute this query. |
disk_read_bytes | bigint | Number of bytes read from disk. |
disk_write_bytes | bigint | Number of bytes written to disk. |
spill_size | bigint | Total size, in bytes, of spill files used by all segments to execute this query. |
rqpriority | character varying(16) | Priority setting for the resource queue managing this query. Blank if resource group management is enabled. |
query_tag | text | A key-value pair describing a query. |
slices_metrics | JSON | The cpu/memory/disk metrics for a slice. |
peak_memory | bigint | Maximum memory usage across all segments during the execution of the query, measured in KB. |
node_sliceid | json | A map of node ID to slice ID. |
lock_seconds | seconds | The number of seconds the query has been blocked. |
The gpcc_resgroup_history
table saves the history of the resource consumption of each resource group on each segment.
Column | Type | Description |
---|---|---|
ctime | timestamp(0) without time zone | Time the row was created. |
rsgname | name | Resource group name. |
host_name | text | The host name of the segment. |
groupid | oid | Object ID of the resource group. |
concurrency_limit | integer | The concurrency (CONCURRENCY ) value specified for the resource group. |
num_running | integer | The number of transactions currently executing in the resource group. |
num_queueing | integer | The number of currently queued transactions for the resource group. |
cpu_max_percent | integer | The CPU limit (CPU_MAX_PERCENT value) configured for the resource group, or -1. |
cpu_weight | integer | The scheduling priority (CPU_WEIGHT value) configured for the resource group. |
memory_limit | integer | The memory limit (MEMORY_LIMIT value) configured for the resource group, in MB. |
cpu_usage | numeric | The percentage of CPU resources used on the segment host. |
memory_usage | numeric | The memory used on the segment host, in MB. |
Column | Type | Description |
---|---|---|
ctime | timestamp(0) without time zone | Time the row was created. |
scan_id | integer | Scan id for this scan. See schedule_id column in the gpcc_schedule . |
status | integer | Scan completion status. 0=Running, 1=Success, 2=Failed, 3=Cancelled, 4=Expired. |
tables_scanned | bigint | The number of tables that were scanned. |
start_dt | timestamp(0) without time zone | Time the scan began. |
end_dt | timestamp(0) without time zone | Time the scan ended. |
The gpcc_schedule
table contains details for scheduled events, such as the table scan for Recommendations. Each row describes a schedule in a JSON value. The ctime
and etime
columns together specify the period of time during which the schedule is enabled.
Column | Type | Description |
---|---|---|
schedule_id | integer | Id number for this schedule. |
description | character varying(512) | Description of schedule. |
schedule_config | json | JSON string with schedule details. |
ctime | timestamp(0) without time zone | Time the schedule was added and enabled. |
etime | timestamp(0) without time zone | Time the schedule was disabled. |
The gpcc_system_history
table saves historical system metrics for each VMware Greenplum host, including the coordinator, standby coordinator, and segment hosts. The metrics include information about memory, CPU, disk, and network utilitization.
Column | Type | Description |
---|---|---|
cpu_iowait | double precision | The percentage of CPU used waiting on IO requests. |
ctime | timestamp(0) without time zone | Time the row was created. |
hostname | character varying(64) | Segment or coordinator hostname associated with these system metrics. |
mem_total | bigint | Total system memory in Bytes for this host. |
mem_used | bigint | System memory used, in Bytes, for this host. |
mem_cached | bigint | Actual memory used, in Bytes, for this host (not including the memory reserved for cache and buffers). |
mem_buffered | bigint | Free actual memory, in Bytes, for this host (not including the memory reserved for cache and buffers). |
swap_total | bigint | Total swap space in Bytes for this host. |
swap_used | bigint | Swap space used, in Bytes, for this host. |
swap_page_in | bigint | Number of swap pages in. |
swap_page_out | bigint | Number of swap pages out. |
cpu_user | double precision | Percentage of time CPU processes execute in user mode. |
cpu_sys | double precision | Percentage of time CPU processes execute in system (kernel) mode. |
cpu_idle | double precision | Percentage idle CPU. |
load0 | double precision | CPU one-minute load average. |
load1 | double precision | CPU five-minute load average. |
load2 | double precision | CPU fifteen-minute load average. |
quantum | seconds | Interval between metrics collections. |
disk_ro_rate | bigint | Disk read operations per second. |
disk_wo_rate | bigint | Disk write operations per second. |
disk_rb_rate | bigint | Bytes per second for disk read operations. |
disk_wb_rate | bigint | Bytes per second for disk write operations. |
min_free_kbytes | kilobyte | The minimum amount of free memory that the kernel’s memory manager keeps on a host machine. |
net_rp_rate | bigint | Packets per second on the system network for read operations. |
net_wp_rate | bigint | Packets per second on the system network for write operations. |
net_rb_rate | bigint | Bytes per second on the system network for read operations. |
net_wb_rate | bigint | Bytes per second on the system network for write operations. |
The gpcc_table_info
table stores a current snapshot of statistics for tables. There is one row for each table and partition.
Column | Type | Description |
---|---|---|
ctime | timestamp without time zone | Time the row was created. |
dbid | oid | Object ID of the database. |
relid | oid | Object ID of the table. |
paroid | oid | Object ID of a partition. |
reltablespace | oid | Object ID of the table’s tablespace. |
seq_scan | bigint | Number of sequential scans initiated on this table. |
idx_scan | bigint | Number of index scans initiated on this table. |
n_tup_ins | bigint | Number of rows inserted. |
n_tup_del | bigint | Number of rows deleted. |
n_tup_upd | bigint | Number of rows updated (includes hot updated rows). |
n_tup_hot_upd | bigint | Number of rows HOT updated (no separate index update required). |
last_seq_scan | timestamp with time zone | Time of the last sequential scan on this table. An ANALYZE on a table can cause a sequential scan and update the count in the seq_scan column. However, the time of that scan is not saved in this column. |
last_idx_scan | timestamp with time zone | Time of the last index scan on this table. |
last_ins | timestamp with time zone | Time of the last insert on this table. |
last_del | timestamp with time zone | Time of the last delete on this table. |
last_upd | timestamp with time zone | Time of the last update on this table. |
last_analyze | timestamp with time zone | Time of the last analyze on this table. |
last_vacuum | timestamp with time zone | Time of the last vacuum of this table (excludes vacuum full). |
last_vacuum_full | timestamp with time zone | Time of the last vacuum full or last vacuum full freeze of this table. |
size | bigint | Combined size of the table’s files on all segments. Note that for partitioned tables, the sizes of individual partitions are not included. |
row_cnt | real | Number of rows in the table. |
children | integer | Number of partitions, including middle-level and child partitions. |
schema | name | Name of the schema this table belongs to. |
table_name | name | Name of the table. |
owner | name | Name of the database role that owns this table. |
relstorage | character(1) | Storage mode of this table. a=append-optimized, c=column-oriented, h =heap, v = virtual, x= external table. |
relkind | character(1) | The type of object: r = heap or append-optimized table. |
bloat | real | Calculated bloat for the table when last scanned. |
scan_size | bigint | Size of table when last scanned. |
unused | real | Unused. |
skew | real | Calculated skew for the table when last scanned. |
last_scan_rowcnt | real | Number of rows in the table when last scanned. |
last_scan_ts | timestamp with time zone | Time the table was last scanned for Recommendations. |
accuracy | real | Accuracy ratio calculated from query history when the table was last scanned. |
last_accuracy_ts | timestamp with time zone | Time the accuracy ratio was last set. |
distributed_by | text | Distribution policy for the table. |
age_percent | float | Tracks information for calculating when a table should have vacuum freeze run on it. It stores what percentage of the xid_warn_limit GUC the table’s age is. For example, if a table’s age is 70% of xid_warn_limit , its age_percent is stored as 0.7. |
last_vacuum_freeze | timestamp with time zone | Records the last time that one of vacuum full, vacuum freeze or vacuum full freeze was performed on the table. |
last_size_ts | timestamp with time zone | Time when the current table’s size column was last updated. |
The gpcc_table_info_history
table stores a daily snapshot of statistics about tables. Command Center saves statistics from the gpcc_table_info
table at 3:00 a.m. each morning. There is one row for each table per day.
Column | Type | Description |
---|---|---|
ctime | timestamp(0) without time zone | Time the record was created. |
dbid | oid | Object ID of the database. |
relid | oid | Object ID of the table. |
paroid | oid | Object ID of a partition. |
reltablespace | oid | Object ID of the table’s tablespace. |
seq_scan | bigint | Number of sequential scans initiated on this table. |
idx_scan | bigint | Number of index scans initiated on this table. |
n_tup_ins | bigint | Number of rows inserted. |
n_tup_del | bigint | Number of rows deleted. |
n_tup_upd | bigint | Number of rows updated (includes hot updated rows). |
n_tup_hot_upd | bigint | Number of rows HOT updated (no separate index update required). |
last_seq_scan | timestamp with time zone | Time of the last sequential scan on this table. |
last_idx_scan | timestamp with time zone | Time of the last index scan on this table. |
last_ins | timestamp with time zone | Time of the last insert on this table. |
last_del | timestamp with time zone | Time of the last delete on this table. |
last_upd | timestamp with time zone | Time of the last update on this table. |
last_analyze | timestamp with time zone | Time of the last analyze on this table. |
last_vacuum | timestamp with time zone | Time of the last vacuum of this table (excludes vacuum full). |
last_vacuum_full | timestamp with time zone | Time of the last vacuum full of this table. |
size | bigint | Combined size of the table’s files on all segments. Note that for partitioned tables, the sizes of individual partitions are not included. |
row_cnt | real | Number of rows in the table. |
children | integer | Number of partitions, including middle-level and child partitions. |
schema | name | Name of the schema this table belongs to. |
table_name | name | Name of the table. |
owner | name | Name of the database role that owns this table. |
relstorage | character(1) | Storage mode of this table. a=append-optimized, c=column-oriented, h = heap, v = virtual, x= external table. |
relkind | character(1) | The type of object: r = heap or append-optimized table. |
bloat | real | Calculated bloat for the table when last scanned. |
scan_size | bigint | Size of the table when last scanned. |
unused | real | |
skew | real | Calculated skew for the table when last scanned. |
last_scan_rowcnt | real | Number of rows in the table when last scanned. |
last_scan_ts | timestamp with time zone | Time the table was last scanned for Recommendations. |
accuracy | real | Accuracy ratio calculated from query history when the table was last scanned. |
last_accuracy_ts | timestamp with time zone | Time the accuracy ratio was last set. |
distributed_by | text | Distribution policy for the table. |
age_percent | float | Tracks information for calculating when a table should have VACUUM FREEZE run on it. It stores what percentage of the xid_warn_limit GUC the table’s age is. For example, if a table’s age is 70% of xid_warn_limit , its age_percent is stored as 0.7 |
last_vacuum_freeze | timestamp with time zone | Records the last time that one of VACUUM FULL, VACUUM FREEZE or VACUUM FULL FREEZE was performed on the table. |
last_size_ts | timestamp with time zone | Time when the current table’s size column was last updated. |
The gpcc_queries_now
table saves real-time query metrics data.
Column | Type | Description |
---|---|---|
tmid | integer | A part of the query ID. |
ssid | integer | Session ID of the query. |
ccnt | integer | A part of the query ID. |
username | character varying(64) | User name. |
db | character varying(64) | Database. |
status | character varying(16) | Query status. |
rsgname | character varying(64) | Resource group name. |
rsqname | character varying(64) | Resource queue name. |
rsqpriority | character varying(16) | Resource queue priority. |
tsubmit | timestamp(0) without time zone | Submit time. |
tstart | timestamp(0) without time zone | Start time. |
cpu_time | double precision | CPU time accumulated (unit: seconds). |
cpu_master | double precision | CPU time on coordinator. |
cpu_segment | double precision | CPU time on all segemnts. |
cpu_master_percent | double precision | CPU percent on coordinator. |
cpu_segment_percent | double precision | CPU percent on all segments. |
spill_size | bigint | Spill file size (bytes) |
memory | bigint | Memory (KB) |
disk_read_rate | double precision | Current disk read rate (KB/s). |
disk_write_rate | double precision | Current disk write rate (KB/s). |
disk_read_bytes | bigint | Current disk read bytes (bytes). |
disk_write_bytes | bigint | Current disk write bytes (bytes). |
skew_cpu | double precision | CPU skew. |
plan_gen | character varying(16) | Plan generator for the query. |
cost | double precision | Query cost. |
query_text | text | Query text. |
The gpcc_wlm_rule
table stores the definition of workload management rules (both assignment rules and workload rules).
Column | Type | Description |
---|---|---|
serial_number | integer | Serial number of the rule. |
rule_id | integer | Unique ID of the rule shown in the Command Center interface. |
rsgname | name | Resource group name used to assign rule. |
role | name | Database role name used to assign rule. |
query_tag | text | Query tag string used to assign rule. |
dest_rsg | name | Destination resource group name for moving a query. |
cpu_time | integer | Maximum CPU time consumed by the query. |
running_time | integer | Maximum running time for the query. |
disk_io_mb | integer | Maximum total disk I/O used by the query. |
planner_cost | float8 | Maximum Postgres Planner cost for the query. |
orca_cost | float8 | Maximum GPORCA optimizer ost for the query. |
slice_num | integer | Maximum number of slices assigned for executing the query. |
action | integer | Action to performed when conditions are met. |
active | boolean | Whether the rule is active or inactive. |
ctime | timestamp without time zone | Time the rule was created. |
etime | timestamp without time zone | Time that the rule was deleted. |
idle session | json | JSON string containing parameters for idle session kill rules. |
spill_file_mb | int | Spill file size, in MB. |
cpuskew_percent | int | If the CPU skew percent of a query is equal to or higher than cpuskew_percent during cpuskew_duration_sec seconds, Command Center executes the rule. |
cpuskew_duration_sec | int | If the CPU skew percent of a query is equal to or higher than cpuskew_percent during cpuskew_duration_sec seconds, Command Center executes the rule. |
order_id | NOT NULL | The order id of the rule. |
The gpcc_wlm_log_history
table stores the log history of workload rule actions that are triggered. Note that log history for assignment rules is not collected.
Column | Type | Description |
---|---|---|
ctime | timestamp without time zone | Time the log entry was created. |
tstart | timestamp without time zone | Time the operation started. |
tfinish | timestamp without time zone | Time the operation finished. |
rule_serial_number | integer | Serial number of the rule. |
rule_id | integer | Unique ID of the rule shown in the Command Center interface. |
tmid | integer | A time identifier for the query. All records associated with a query will have the same tmid . |
ssid | integer | Session id for the database connection. All records associated with the query will have the same ssid . |
ccnt | integer | Command number within the session. All records associated with the query will have the same ccnt . |
action | integer | Workload rule action that was initiated. |
rsgname | name | Resource group name used to assign rule. |
role | name | Database role name used to assign rule. |
status | text | Indication of whether the action succeeded or failed. |
fail_msg | text | Message associated with a failed action. |
Command Center uses the gpcc_gpss_job
and gpcc_gpss_log
tables to store information related to data loading.
Column | Type | Description |
---|---|---|
job_id | varchar(32) | The job ID. |
job_name | text | The job name. |
job_type | character varying(20) | The job type. Currently, there is just one job type: s3. |
creator | character varying(64) | The creator of the job. |
yaml_content | text | YAML description of the job. |
is_active | boolean | Indicates whether or not the job has been removed. Set to true when the job is submitted or resubmitted. false means the job has been removed. |
Column | Type | Description |
---|---|---|
ctime | timestamp without time zone. | The time when Command Center received the log. |
gpss_time | timestamp(0) | Greenplum Streaming Server time returned from Greenplum Streaming Server gRPC call. |
job_id | varchar(20) | The id of the data loading job. |
job_name_at_the_time | text | The name of the job at the time Command Center received the log message. |
log_type | gpmetrics.gpss_log_type | Specifies whether the log is received from GPSS gRPC streaming calls (marked as ‘event’) or from GPCC API calls (marked as ‘operation’). |
ctx | integer | Context of event(job status changed to: UNSPECIFIED=0/STOPPED=1/RUNNING=2/ERROR=3/REMOVED=4/SUCCESS=5/SUBMITTED=6), or context of operation(start=0/stop=1/remove=2/submit=3) |
operation_status | gpmetrics.gpss_operation_status | The status of operations (‘success’/‘fail’ for log_type=operation, ‘not_applicable’ for other types). |
msg | text | Any messages returned from Greenplum Streaming Server. |
log_id | integer | The id of the log. Increments automatically. When sorting logs, Command Center uses this value to break ties between logs with the same ctime , log_type , and job_name_at_the_time . |
Command Center uses the gpcc_department
and gpcc_role_department
tables to store information about departments and roles.
Column | Type | Description |
---|---|---|
dept_id | integer | Department ID. |
dept_name | character varying(64) | Department name. Limited to 64 characters, can only contain [A-Z][a-z][0-9][_]. |
dept_status | smallint | 1 or 0, where 1 indicates active and 0 indicates deleted. |
Column | Type | Description |
---|---|---|
rolname | name | Role name. |
dept_id | integer | Department ID. |
The gpmetrics
schema includes a number of user-defined functions that provide users with information about queries and about the hosts on which the queries run. You can find the source code for all of these functions in the $GPCC_HOME/sqls/
directory.
In addition, the public
schema includes a user-defined function to return recommendations about whether a table should be reindexed. See this section for details.
The gpcc_delete_department
user-defined function deletes a department name from the gpperfmon
database. It takes one parameter: the name of the department you want to delete. You may only delete department names that have no roles associated with them.
The gpcc_queries_per_hour
user-defined function returns the average and maximum number of total queries, running queries, queued queries, blocked queries, and finished queries per hour.
Here are some examples:
select * from gpmetrics.gpcc_queries_per_hour();
select * from gpmetrics.gpcc_queries_per_hour('2021-06-07')
select * from gpmetrics.gpcc_queries_per_hour('2021-06-09 00:00:00', '2021-06-09 13:20:05')
The first query returns results for the interval between 00:00:00 on the current date and the current time.
The second query returns results for the interval between 00:00:00 on ‘2021-06-07’ and the current time.
The third query returns results for the interval between 00:00:00 on 2021-06-09 and 13:20:05 on 2021-06-09.
The gpcc_queries_per_user
user-defined function returns, for each user, the number of queries whose runtime is longer than the input interval, per hour, in the specified time range. The function takes an interval, a start time, and an end time. The default start time is 00:00:00 today. The default end time is the current time. The interval can not be omitted. It accepts PostgreSQL’s INTERVAL
value.
Here are some examples:
select * from gpmetrics.gpcc_queries_per_user('5min','2018-05-01','2018-06-01');
select * from gpmetrics.gpcc_queries_per_user('5min');
select * from gpmetrics.gpcc_queries_per_user('5min','2018-05-01');
The first query returns, by user, all queries running longer than 5 minutes for the interval between 00:00:00 on 05/01/2018 and 23:59:59 on 06/01/2018.
The second query returns, by user, all queries running longer than 5 minutes in the interval between 00:00:00 today and the current time.
The third query returns, by user, all queries longer than 5 minutes in the interval between 00:00:00 on 2018-05-01 and the current time.
The gpcc_queries_per_user__max_and_total_spill_size
user-defined function returns, for each user, the total spill_size
and maximum spill_size
per query per hour. The function takes a start time and an end time. The default start time is 00:00:00 today. The default end time is the current time.
Here are some examples:
select * from gpmetrics.gpcc_queries_per_user_max_and_total_spill_size();
select * from gpmetrics.gpcc_queries_per_user_max_and_total_spill_size('2018-05-20','2018-05-21');
select * from gpmetrics.gpcc_queries_per_user_max_and_total_spill_size('2018-05-20');
The first query returns, by user, the total spill_size
and maximum spill_size
per query per hour for the interval between 00:00:00 on the current date and the current time.
The second query returns, by user, the total spill_size
and maximum spill_size
per query per hour for the interval between 00:00:00 on 05/20/2018 and 23:59:59 on 05/21/2018.
The third query returns, by user, the total spill_size
and maximum spill_size
per query per hour for the interval between 00:00:00 on ‘2018-05-20’ and the current time.
The gpcc_queries_per_user_max_cpu
user-defined function returns, for each user, the query with the maximum segment and coordinator cpu usage per hour, along with details about the query. The function takes a start time and an end time. The default start time is 00:00:00 today. The default end time is the current time.
Here are some examples:
select * from gpmetrics.gpcc_queries_per_user_max_cpu();
select * from gpmetrics.gpcc_queries_per_user_max_cpu('2018-05-20','2018-05-21');
select * from gpmetrics.gpcc_queries_per_user_max_cpu('2018-05-20');
The first query returns, by user, the query with the maximum segment and coordinator cpu usage per hour for the interval between 00:00:00 on the current date and the current time.
The second query returns, by user, the query with the maximum segment and coordinator cpu usage per hour for the interval between 00:00:00 on 05/20/2018 and 23:59:59 on 05/21/2018.
The third query returns, by user, the query with the maximum segment and coordinator cpu usage per hour for the interval between 00:00:00 on ‘2018-05-20’ and the current time.
The gpcc_queries_per_user_max_run_time
user-defined function returns, for each user, the longest running query per hour, along with details about the query. The function takes a start time and an end time. The default start time is 00:00:00 today. The default end time is the current time.
Here are some examples:
select * from gpmetrics.gpcc_queries_per_user_max_run_time();
select * from gpmetrics.gpcc_queries_per_user_max_run_time('2018-05-20','2018-05-21');
select * from gpmetrics.gpcc_queries_per_user_max_run_time('2018-05-20');
The first query returns, by user, the longest running query per hour for the interval between 00:00:00 on the current date and the current time.
The second query returns, by user, the longest running query per hour for the interval between 00:00:00 on 05/20/2018 and 23:59:59 on 05/21/2018.
The third query returns, by user, the longest running query per hour for the interval between 00:00:00 on ‘2018-05-20’ and the current time.
The gpcc_queries_per_user_max_skew
user-defined function returns, for each user, the query with the maximum amount of processing skew in the system (skew_cpu
) per hour, along with details about the query. The function takes a start time and an end time. The default start time is 00:00:00 today. The default end time is the current time.
Here are some examples:
select * from gpmetrics.gpcc_queries_per_user_max_skew();
select * from gpmetrics.gpcc_queries_per_user_max_skew('2018-05-20','2018-05-21');
select * from gpmetrics.gpcc_queries_per_user_max_skew('2018-05-20');
The first query returns, by user, the query with the maximum amount of processing skew in the system for the interval between 00:00:00 on the current date and the current time.
The second query returns, by user, the query with the maximum amount of processing skew in the system for the interval between 00:00:00 on 05/20/2018 and 23:59:59 on 05/21/2018.
The third query returns, by user, the query with the maximum amount of processing skew in the system for the interval between 00:00:00 on ‘2018-05-20’ and the current time.
The gpcc_queries_per_user_rows_out
user-defined function returns, for each user, the query with the maximum rows_out
per hour, along with details about the query. The function takes a start time and an end time. The default start time is 00:00:00 today. The default end time is the current time.
Here are some examples:
select * from gpmetrics.gpcc_queries_per_user_rows_out();
select * from gpmetrics.gpcc_queries_per_user_rows_out('2018-05-20','2018-05-21');
select * from gpmetrics.gpcc_queries_per_user_rows_out('2018-05-20');
The first query returns, by user, the query with the maximum rows_out
per hour for the interval between 00:00:00 on the current date and the current time.
The second query returns, by user, the query with the maximum rows_out
per hour for the interval between 00:00:00 05/20/2018 and 23:59:59 05/21/2018.
The third query returns, by user, the query with the maximum rows_out
per hour for the interval between 00:00:00 on ‘2018-05-20’ and the current time.
The gpcc_queries_top_10_cpu_per_resgroup
user-defined function retrieves, by resource group, the top 10 queries utilizing the most CPU over the last 1 hour, 1 day, 1 week, and 1 month.
Here are some examples:
select * from gpcc_queries_top_10_cpu_per_resgroup();
select * from gpcc_queries_top_10_cpu_per_resgroup('1 day');
select * from gpcc_queries_top_10_cpu_per_resgroup('1 week');
select * from gpcc_queries_top_10_cpu_per_resgroup('1 month');
The gpcc_queries_top_10_memory_per_resgroup
user-defined function retrieves, by resource group, the top 10 queries utilizing the most memory over the last 1 hour, 1 day, 1 week, and 1 month.
Here are some examples:
select * from gpcc_queries_top_10_memory_per_resgroup();
select * from gpcc_queries_top_10_memory_per_resgroup('1 day');
select * from gpcc_queries_top_10_memory_per_resgroup('1 week');
select * from gpcc_queries_top_10_memory_per_resgroup('1 month');
The gpcc_recommendations
user-defined function returns a list of tables in need of a vacuum-related action such as VACUUM
, VACUUM FULL
, ANALYZE
, SKEW
, and VACUUM FREEZE
, as well as a list of tables whose B-tree indices are in need of a REINDEX
action. You may call this function directly from the command line, from within an SQL program, or using a psql
utility.
Here are some examples:
gpmetrics.gpcc_recommendations('vacuum-full');
gpmetrics.gpcc_recommendations('vacuum-freeze');
gpmetrics.gpcc_recommendations('analyze');
gpmetrics.gpcc_recommendations('skew');
gpmetrics.gpcc_recommendations('reindex');
If you set custom values for the accuracy_threshold
, bloat_unused_threshold
, or bloat_dead_threshold
properties in the gpcc.conf
file, you should define any values you’ve set via the PGOPTIONS
option when calling this user defined function, as in this example: PGOPTIONS="-c accuracy.threshold=0.2 -c bloat.vacuum.full.threshold=0.3 -c bloat.threshold=0.6" psql -d gpperfmon -c "select * from gpmetrics.gpcc_recommendations('vacuum');
Note
gpcc_recommendations(reindex)
only reports tables that need reindexing for tables in thepg_catalog
schema.
The gpcc_system_per_hour
user-defined function returns a variety of system information, including: the average cpu_user
and cpu_sys
for all hosts, aggregated by hour; the average load0
, load1
, and load2
per hour for all hosts, aggregated by hour; and the average disk write and disk read bytes for all hosts, aggregated by hour. The function takes a start time and and time. The default start time is 00:00:00 today. The default end time is the current time.
This example returns the system information summarized in the previous paragraph for the interval between 00:00:00 08/30/2021 and 23:59:59 08/31/2021.
gpperfmon=# select * from gpmetrics.gpcc_system_per_hour('2021-08-30','2021-08-31');
The gpcc_update_department
user-defined function renames a department name. It takes two parameters: from_name
– which specifies the old name – and to_name
– which specifies the new name. Once it has run, all the roles associated with the old name will now be associated with the new name.
NoteThis user-defined function is located in the
public
schema of your database, rather than in thegpmetrics
schema.
The gpcc_reindex_check
user-defined function reports whether the B-tree indices of the table whose name and schema are passed in should be reindexed.
In order to use this function, you must first create it in your own database, using the following commands.
cd $GPCC_HOME/sqls/aux_udf/
psql -f gpcc_reindex_check.sql <dbname>
The above commands create the function in the public
schema in the database specified in
Afterwards, run the ANALYZE
command on tables as necessary to generate statistics. You can then determine if reindexing is needed on a specific table by executing the function:
psql> select gpcc_reindex_check('schema_name','table_name');