The gpperfmon
database is a dedicated database where data collection agents on Greenplum segment hosts save query and system statistics. The optional Greenplum Command Center management tool depends upon the gpperfmon
database for query history.
The gpperfmon
database is created using the gpperfmon_install
command-line utility. The utility creates the database and the gpmon
database role and enables the data collection agents on the master and segment hosts. See the gpperfmon_install
reference in the Greenplum Database Utility Guide for information about using the utility and configuring the data collection agents.
The gpperfmon
database consists of three sets of tables that capture query and system status information at different stages.
_now
tables store current system metrics such as active queries._tail
tables are used to stage data before it is saved to the _history
tables. The _tail
tables are for internal use only and not to be queried by users._history
tables store historical metrics.The data for _now
and _tail
tables are stored as text files on the master host file system, and are accessed in the gpperfmon
database via external tables. The history
tables are regular heap database tables in the gpperfmon
database. History is saved only for queries that run for a minimum number of seconds, 20 by default. You can set this threshold to another value by setting the min_query_time
parameter in the $MASTER_DATA_DIRECTORY/gpperfmon/conf/gpperfmon.conf
configuration file. Setting the value to 0 saves history for all queries.
Note: gpperfmon
does not support SQL ALTER
commands. ALTER
queries are not recorded in the gpperfmon
query history tables.
The history
tables are partitioned by month. See History Table Partition Retention for information about removing old partitions.
The database contains the following categories of tables:
pg_log
.The gpperfmon
database also contains the following views:
system_history
and segment_history
tables.The history
tables in the gpperfmon
database are partitioned by month. Partitions are automatically added in two month increments as needed.
The partition_age
parameter in the $MASTER_DATA_DIRECTORY/gpperfmon/conf/gpperfmon.conf
file can be set to the maximum number of monthly partitions to keep. Partitions older than the specified value are removed automatically when new partitions are added.
The default value for partition_age
is 0
, which means that administrators must manually remove unneeded partitions.
When the gp_gperfmon_enable
server configuration parameter is set to true, the Greenplum Database syslogger writes alert messages to a .csv
file in the $MASTER_DATA_DIRECTORY/gpperfmon/logs
directory.
The level of messages written to the log can be set to none
, warning
, error
, fatal
, or panic
by setting the gpperfmon_log_alert_level
server configuration parameter in postgresql.conf
. The default message level is warning
.
The directory where the log is written can be changed by setting the log_location
configuration variable in the $MASTER_DATA_DIRECTORY/gpperfmon/conf/gpperfmon.conf
configuration file.
The syslogger rotates the alert log every 24 hours or when the current log file reaches or exceeds 1MB.
A rotated log file can exceed 1MB if a single error message contains a large SQL statement or a large stack trace. Also, the syslogger processes error messages in chunks, with a separate chunk for each logging process. The size of a chunk is OS-dependent; on Red Hat Enterprise Linux, for example, it is 4096 bytes. If many Greenplum Database sessions generate error messages at the same time, the log file can grow significantly before its size is checked and log rotation is triggered.
When Greenplum Database starts up with gpperfmon support enabled, it forks a gpmmon
agent process. gpmmon
then starts a gpsmon
agent process on the master host and every segment host in the Greenplum Database cluster. The Greenplum Database postmaster process monitors the gpmmon
process and restarts it if needed, and the gpmmon
process monitors and restarts gpsmon
processes as needed.
The gpmmon
process runs in a loop and at configurable intervals retrieves data accumulated by the gpsmon
processes, adds it to the data files for the _now
and _tail
external database tables, and then into the _history
regular heap database tables.
Note: The log_alert
tables in the gpperfmon
database follow a different process, since alert messages are delivered by the Greenplum Database system logger instead of through gpsmon
. See Alert Log Processing and Log Rotation for more information.
Two configuration parameters in the $MASTER_DATA_DIRECTORY/gpperfmon/conf/gpperfmon.conf
configuration file control how often gpmmon
activities are triggered:
quantum
parameter is how frequently, in seconds, gpmmon
requests data from the gpsmon
agents on the segment hosts and adds retrieved data to the _now
and _tail
external table data files. Valid values for the quantum
parameter are 10, 15, 20, 30, and 60. The default is 15.harvest_interval
parameter is how frequently, in seconds, data in the _tail
tables is moved to the _history
tables. The harvest_interval
must be at least 30. The default is 120.See the gpperfmon_install
management utility reference in the Greenplum Database Utility Guide for the complete list of gpperfmon configuration parameters.
The following steps describe the flow of data from Greenplum Database into the gpperfmon
database when gpperfmon support is enabled.
gp_gpperfmon_send_interval
server configuration variable determines how frequently the database sends these messages. The default is every second.gpsmon
process on each host receives the UDP packets, consolidates and summarizes the data they contain, and adds additional host metrics, such as CPU and memory usage.gpsmon
processes continue to accumulate data until they receive a dump command from gpmmon
.gpsmon
processes respond to a dump command by sending their accumulated status data and log alerts to a listening gpmmon
event handler thread.gpmmon
event handler saves the metrics to .txt
files in the $MASTER_DATA_DIRECTORY/gpperfmon/data
directory on the master host.At each quantum
interval (15 seconds by default), gpmmon
performs the following steps:
Sends a dump command to the gpsmon
processes.
Gathers and converts the .txt
files saved in the $MASTER_DATA_DIRECTORY/gpperfmon/data
directory into .dat
external data files for the _now
and _tail
external tables in the gpperfmon
database.
For example, disk space metrics are added to the diskspace_now.dat
and _diskspace_tail.dat
delimited text files. These text files are accessed via the diskspace_now
and _diskspace_tail
tables in the gpperfmon
database.
At each harvest_interval
(120 seconds by default), gpmmon
performs the following steps for each _tail
file:
Renames the _tail
file to a _stage
file.
Creates a new _tail
file.
Appends data from the _stage
file into the _tail
file.
Runs a SQL command to insert the data from the _tail
external table into the corresponding _history
table.
For example, the contents of the _database_tail
external table is inserted into the database_history
regular (heap) table.
Deletes the _tail
file after its contents have been loaded into the database table.
Gathers all of the gpdb-alert-*.csv
files in the $MASTER_DATA_DIRECTORY/gpperfmon/logs
directory (except the most recent, which the syslogger has open and is writing to) into a single file, alert_log_stage
.
Loads the alert_log_stage
file into the log_alert_history
table in the gpperfmon
database.
Truncates the alert_log_stage
file.
The following topics describe the contents of the tables in the gpperfmon
database.
Parent topic: Greenplum Database Reference Guide