The gpperfmon database is a dedicated database where data collection agents on Greenplum segment hosts save query and system statistics. The optional Greenplum Command Center management tool depends upon the gpperfmon database for query history.
The gpperfmon database is created using the gpperfmon_install command-line utility. The utility creates the database and the gpmon database role and enables the data collection agents on the master and segment hosts. See the gpperfmon_install reference in the Greenplum Database Utility Guide for information about using the utility and configuring the data collection agents.
The gpperfmon database consists of three sets of tables that capture query and system status information at different stages.
_now tables store current system metrics such as active queries.
_tail tables are used to stage data before it is saved to the _history tables. The _tail tables are for internal use only and should not be queried by users.
_history tables store historical metrics.
The data for the _now and _tail tables is stored as text files on the master host file system and is accessed in the gpperfmon database via external tables. The history tables are regular heap database tables in the gpperfmon database. History is saved only for queries that run for a minimum number of seconds, 20 by default. You can change this threshold by setting the min_query_time parameter in the $MASTER_DATA_DIRECTORY/gpperfmon/conf/gpperfmon.conf configuration file. Setting the value to 0 saves history for all queries.
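The threshold rule above can be sketched in a few lines of Python. This is an illustration only: the `[GPMMON]` section name, the helper names, and the "greater than or equal to" comparison are assumptions, not details taken from the gpperfmon source.

```python
import configparser

DEFAULT_MIN_QUERY_TIME = 20  # seconds, the documented default


def read_min_query_time(conf_text: str) -> int:
    """Return min_query_time from gpperfmon.conf-style text (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # Assumption: the settings live under a [GPMMON] section.
    return parser.getint("GPMMON", "min_query_time", fallback=DEFAULT_MIN_QUERY_TIME)


def is_saved_to_history(run_seconds: float, min_query_time: int) -> bool:
    """A threshold of 0 keeps every query; otherwise keep queries that ran long enough."""
    return run_seconds >= min_query_time


sample_conf = "[GPMMON]\nmin_query_time = 5\n"
threshold = read_min_query_time(sample_conf)
print(threshold, is_saved_to_history(3, threshold), is_saved_to_history(12, threshold))
```

With the sample text, a 3-second query falls below the 5-second threshold and is skipped, while a 12-second query is kept.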
gpperfmon does not support SQL ALTER commands. ALTER queries are not recorded in the gpperfmon query history tables.
The history tables are partitioned by month. See History Table Partition Retention for information about removing old partitions.
The database contains the following categories of tables:
The gpperfmon database also contains the following views:
History Table Partition Retention
The history tables in the gpperfmon database are partitioned by month. Partitions are automatically added in two-month increments as needed.
The partition_age parameter in the $MASTER_DATA_DIRECTORY/gpperfmon/conf/gpperfmon.conf file can be set to the maximum number of monthly partitions to keep. Partitions older than the specified value are removed automatically when new partitions are added.
The default value for partition_age is 0, which means that administrators must manually remove unneeded partitions.
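As a rough sketch of the retention rule, the following Python picks the monthly partitions that fall outside a partition_age window. The function name and the way partitions are counted are assumptions. How gpperfmon actually measures partition age (for example, whether partitions created ahead of time for upcoming months count toward the limit) is not stated here.

```python
from typing import List, Tuple

Month = Tuple[int, int]  # (year, month)


def partitions_to_drop(partitions: List[Month], partition_age: int) -> List[Month]:
    """Return the monthly partitions that fall outside the retention window."""
    if partition_age <= 0:
        # 0 is the default: nothing is removed automatically.
        return []
    ordered = sorted(partitions)
    # Keep the newest `partition_age` partitions and drop everything older.
    return ordered[:-partition_age]


existing = [(2023, 11), (2023, 12), (2024, 1), (2024, 2), (2024, 3)]
print(partitions_to_drop(existing, 3))
print(partitions_to_drop(existing, 0))
```

With a partition_age of 3, the two oldest months are selected for removal. With the default of 0, the list is empty and cleanup is left to the administrator.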
Alert Log Processing and Log Rotation
When the gp_enable_gpperfmon server configuration parameter is set to true, the Greenplum Database syslogger writes alert messages to a .csv file in the $MASTER_DATA_DIRECTORY/gpperfmon/logs directory.
The level of messages written to the log can be set to none, warning, error, fatal, or panic by setting the gpperfmon_log_alert_level server configuration parameter in postgresql.conf. The default message level is warning.
The directory where the log is written can be changed by setting the log_location configuration variable in the $MASTER_DATA_DIRECTORY/gpperfmon/conf/gpperfmon.conf configuration file.
The syslogger rotates the alert log every 24 hours or when the current log file reaches or exceeds 1MB.
A rotated log file can exceed 1MB if a single error message contains a large SQL statement or a large stack trace. Also, the syslogger processes error messages in chunks, with a separate chunk for each logging process. The size of a chunk is OS-dependent; on Red Hat Enterprise Linux, for example, it is 4096 bytes. If many Greenplum Database sessions generate error messages at the same time, the log file can grow significantly before its size is checked and log rotation is triggered.
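The rotation rule and the overshoot described above can be illustrated with a small Python simulation. It is a sketch, not the syslogger's implementation: the function names are hypothetical, 1MB is assumed to mean 1024 * 1024 bytes, and the size check is assumed to happen only after a complete message has been written.

```python
ROTATION_AGE_SECONDS = 24 * 60 * 60  # rotate every 24 hours
ROTATION_SIZE_BYTES = 1024 * 1024    # rotate at 1MB (assumed to mean 1 MiB)


def should_rotate(log_age_seconds: float, log_size_bytes: int) -> bool:
    """Rotate when either the age limit or the size limit has been reached."""
    return (log_age_seconds >= ROTATION_AGE_SECONDS
            or log_size_bytes >= ROTATION_SIZE_BYTES)


def size_at_rotation(message_sizes):
    """Write messages one by one; return the file size when rotation triggers."""
    size = 0
    for message_size in message_sizes:
        size += message_size  # the whole message is written before the check
        if should_rotate(0, size):
            break
    return size


# The third message carries a large SQL statement and pushes the file past 1MB.
print(size_at_rotation([400_000, 400_000, 900_000]))
```

Because the check runs after the write, the rotated file ends up at 1,700,000 bytes, well over the nominal limit.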
When Greenplum Database starts up with gpperfmon support enabled, it forks a gpmmon agent process. gpmmon then starts a gpsmon agent process on the master host and every segment host in the Greenplum Database cluster. The Greenplum Database postmaster process monitors the gpmmon process and restarts it if needed, and the gpmmon process monitors and restarts gpsmon processes as needed.
The gpmmon process runs in a loop and, at configurable intervals, retrieves data accumulated by the gpsmon processes, adds it to the data files for the _now and _tail external database tables, and then loads it into the _history regular heap database tables.
The log_alert tables in the gpperfmon database follow a different process, since alert messages are delivered by the Greenplum Database system logger instead of through gpsmon. See Alert Log Processing and Log Rotation for more information.
Two configuration parameters in the $MASTER_DATA_DIRECTORY/gpperfmon/conf/gpperfmon.conf configuration file control how often gpmmon activities are triggered:
The quantum parameter is how frequently, in seconds, gpmmon requests data from the gpsmon agents on the segment hosts and adds retrieved data to the _now and _tail external table data files. Valid values for the quantum parameter are 10, 15, 20, 30, and 60. The default is 15.
The harvest_interval parameter is how frequently, in seconds, data in the _tail tables is moved to the _history tables. harvest_interval must be at least 30. The default is 120.
See the gpperfmon_install management utility reference in the Greenplum Database Utility Guide for the complete list of gpperfmon configuration parameters.
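The constraints on the two interval parameters can be checked before editing gpperfmon.conf. The following Python is a minimal sketch. The function name and error messages are hypothetical, and it encodes only the limits stated above.

```python
VALID_QUANTUM = (10, 15, 20, 30, 60)  # allowed quantum values, in seconds
MIN_HARVEST_INTERVAL = 30             # smallest allowed harvest_interval, in seconds


def validate_intervals(quantum: int = 15, harvest_interval: int = 120) -> list:
    """Return a list of problems; an empty list means both values are acceptable."""
    problems = []
    if quantum not in VALID_QUANTUM:
        problems.append(f"quantum must be one of {VALID_QUANTUM}, got {quantum}")
    if harvest_interval < MIN_HARVEST_INTERVAL:
        problems.append(
            f"harvest_interval must be at least {MIN_HARVEST_INTERVAL}, got {harvest_interval}"
        )
    return problems


print(validate_intervals())        # documented defaults
print(validate_intervals(25, 10))  # both values out of range
```

The defaults produce no problems, while a quantum of 25 and a harvest_interval of 10 are both reported.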
The following steps describe the flow of data from Greenplum Database into the gpperfmon database when gpperfmon support is enabled.
The gp_gpperfmon_send_interval server configuration variable determines how frequently the database sends status messages, which travel as UDP packets. The default is every second.
The gpsmon process on each host receives the UDP packets, consolidates and summarizes the data they contain, and adds additional host metrics, such as CPU and memory usage.
The gpsmon processes continue to accumulate data until they receive a dump command from gpmmon.
The gpsmon processes respond to a dump command by sending their accumulated status data and log alerts to a listening gpmmon event handler thread.
The gpmmon event handler saves the metrics to .txt files in the $MASTER_DATA_DIRECTORY/gpperfmon/data directory on the master host.
At each quantum interval (15 seconds by default), gpmmon performs the following steps:
Sends a dump command to the gpsmon processes.
Gathers and converts the .txt files saved in the $MASTER_DATA_DIRECTORY/gpperfmon/data directory into .dat external data files for the _now and _tail external tables in the gpperfmon database.
For example, disk space metrics are added to the diskspace_now.dat and _diskspace_tail.dat delimited text files. These text files are accessed via the diskspace_now and _diskspace_tail tables in the gpperfmon database.
At each harvest_interval (120 seconds by default), gpmmon performs the following steps for each _tail file:
Renames the _tail file to a _stage file.
Creates a new _tail file.
Appends data from the _stage file into the _tail file.
Runs a SQL command to insert the data from the _tail external table into the corresponding _history table.
For example, the contents of the _database_tail external table are inserted into the database_history regular (heap) table.
Deletes the _tail file after its contents have been loaded into the database table.
Gathers all of the gpdb-alert-*.csv files in the $MASTER_DATA_DIRECTORY/gpperfmon/logs directory (except the most recent, which the syslogger has open and is writing to) into a single file, alert_log_stage.
Loads the alert_log_stage file into the log_alert_history table in the gpperfmon database.
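The alert log gathering step can be sketched in Python as follows. This is an illustration under stated assumptions, not gpmmon's code: the helper name and the dated file names in the demonstration are hypothetical, the newest file is assumed to be the one that sorts last by name, and the sketch leaves the source files in place because their fate is not described here.

```python
import glob
import os
import tempfile


def gather_alert_logs(log_dir: str, stage_path: str) -> list:
    """Append all but the newest gpdb-alert-*.csv file to the staging file."""
    files = sorted(glob.glob(os.path.join(log_dir, "gpdb-alert-*.csv")))
    # Assumption: names sort chronologically, so the last file is the one
    # the syslogger still has open and must be skipped.
    closed_files = files[:-1]
    with open(stage_path, "a", encoding="utf-8") as stage:
        for path in closed_files:
            with open(path, encoding="utf-8") as source:
                stage.write(source.read())
    return closed_files


# Demonstration with throwaway files in a temporary directory.
names = ("gpdb-alert-2024-01-01.csv", "gpdb-alert-2024-01-02.csv", "gpdb-alert-2024-01-03.csv")
with tempfile.TemporaryDirectory() as log_dir:
    for name in names:
        with open(os.path.join(log_dir, name), "w", encoding="utf-8") as handle:
            handle.write(name + "\n")
    stage_path = os.path.join(log_dir, "alert_log_stage")
    gathered = [os.path.basename(p) for p in gather_alert_logs(log_dir, stage_path)]
    with open(stage_path, encoding="utf-8") as handle:
        staged_lines = handle.read().splitlines()

print(gathered)
print(staged_lines)
```

Only the two older files reach the staging file. The newest one is left alone until the syslogger rotates it.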
The following topics describe the contents of the tables in the gpperfmon database.