VMware Greenplum Text administration includes security considerations, monitoring Solr index statistics, managing and monitoring ZooKeeper, and troubleshooting.
VMware Greenplum Text deploys Apache ZooKeeper and Apache Solr nodes on hosts in your VMware Greenplum network. Each node is a JVM server process listening for requests from other nodes. Use the gptext-state config
command to list the host and port for each ZooKeeper and Solr node and the memory configuration for Solr nodes.
$ gptext-state configs
20181112:12:38:26:018080 gptext-state:mdw:gpadmin-[INFO]:-Execute GPText state ...
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:-Check zookeeper cluster state ...
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:-Cluster Configurations.
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:----------------------------------------------------------
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:-JVM Min | Max Xms1024M | Xmx2048M
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:-Node information
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:----------------------------------
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:- Host Node Name Port Solr Dir
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:- sdw1 sdw1_solr:18983 18983 /data/gptext/solr0
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:- sdw1 sdw1_solr:18984 18984 /data/gptext/solr1
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:- sdw2 sdw2_solr:18983 18983 /data/gptext/solr0
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:- sdw2 sdw2_solr:18984 18984 /data/gptext/solr1
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:-Zookeeper information
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:----------------------------------
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:- Host Port Zookeeper Dir
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:- mdw 2189 /data/zoo/zoo0
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:- sdw2 2189 /data/zoo/zoo0
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:- sdw1 2189 /data/zoo/zoo0
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:-Done.
You don't need these details to use the VMware Greenplum Text functions and utilities, but the information can be useful for monitoring and troubleshooting the cluster. For example, you can access the Solr Admin UI by browsing to the URL http://<hostname>:<port>
on any Solr node. See Using the Solr Administration Interface for information about the Solr Admin UI.
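As a rough illustration, the node table printed by gptext-state can be turned into Solr Admin UI URLs with a short script. The sketch below assumes the log layout shown in the sample output above (host, node name, port, and Solr directory after the [INFO]:- prefix). The function name solr_admin_urls is hypothetical and is not part of VMware Greenplum Text.

```python
import re

# A few lines copied from the sample gptext-state output above.
SAMPLE = """\
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:- Host Node Name Port Solr Dir
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:- sdw1 sdw1_solr:18983 18983 /data/gptext/solr0
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:- sdw2 sdw2_solr:18984 18984 /data/gptext/solr1
20181112:12:38:27:018080 gptext-state:mdw:gpadmin-[INFO]:- mdw 2189 /data/zoo/zoo0
"""

# Assumed row shape for a Solr node: host, <name>_solr:<port>, port, directory.
NODE_ROW = re.compile(r"\[INFO\]:-\s*(\S+)\s+\S+_solr:\d+\s+(\d+)\s+\S+\s*$")


def solr_admin_urls(output):
    """Return one http://<hostname>:<port> URL per Solr node row."""
    urls = []
    for line in output.splitlines():
        match = NODE_ROW.search(line)
        if match:
            host, port = match.groups()
            urls.append(f"http://{host}:{port}")
    return urls


print(solr_admin_urls(SAMPLE))
```

Header lines and ZooKeeper rows do not match the pattern, so only Solr nodes produce URLs.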
Configuration parameters used with VMware Greenplum Text are built into VMware Greenplum Text with default values. You set new values for the parameters in a VMware Greenplum session using the SET
command, the same way you set VMware Greenplum session parameters. When you enter the SET
command, VMware Greenplum Text updates the value in ZooKeeper so that the change persists between database sessions.
Note: The custom_variable_classes
configuration parameter is removed in VMware Greenplum 6. You can set custom variables in a database session without error, so this step is not needed for VMware Greenplum 6.
With VMware Greenplum 4.x and 5.x, a one-time VMware Greenplum configuration change is needed so that VMware Greenplum allows you to set and display VMware Greenplum Text configuration parameters. Until you have performed this step, any attempt to set a VMware Greenplum Text parameter results in an "Unrecognized configuration parameter" error. You must declare a custom variable class for VMware Greenplum Text.
As the gpadmin
user, enter the following commands in a shell:
$ gpconfig -c custom_variable_classes -v 'gptext'
$ gpstop -u
Once this step is completed, you can view and set VMware Greenplum Text configuration parameters in psql.
To view VMware Greenplum Text configuration parameters, you first need to fetch them from ZooKeeper into your VMware Greenplum session by executing the gptext.version()
UDF.
=# SELECT gptext.version();
version
------------------------------------------------------
Greenplum Text Analytics 3.2.0
(1 row)
Then you can use the SHOW
command to display values of the parameters, for example:
=# SHOW gptext.idx_num_shards;
gptext.idx_num_shards
-----------------------
0
(1 row)
See VMware Greenplum Text Configuration Parameters for a complete list of configuration parameters.
VMware Greenplum Text uses the current values of the configuration parameters when you create a new index, so changing a configuration parameter affects new indexes, but does not affect existing indexes.
Change the values of VMware Greenplum Text configuration variables using the SET
command in a session with a database that contains the VMware Greenplum Text schema. The following example sets values for three configuration parameters in a psql
session:
=# set gptext.idx_buffer_size=10485760;
SET
=# set gptext.idx_delim='|';
SET
=# set gptext.extension_factor=5;
SET
You can view the new value of a configuration parameter that you have set using the SHOW
command:
=# show gptext.idx_delim;
gptext.idx_delim
------------------
|
(1 row)
VMware Greenplum Text security is based on VMware Greenplum security. Your privileges to execute VMware Greenplum Text functions depend on your privileges for the database table that is the source for the index. For example, if you have SELECT privileges for a table in VMware Greenplum, then you have SELECT privileges for an index generated from that table.
Executing VMware Greenplum Text functions requires one of OWNER, SELECT, INSERT, UPDATE, or DELETE privileges, depending on the function. The OWNER is the person who created the table and has all privileges. See the VMware Greenplum Administrator Guide for information about setting privileges.
The gptext-auth
utility enables and deactivates user password authentication for a single user account for the SolrCloud cluster web user interface (UI).
To avoid disruption, enable SolrCloud web authentication during the VMware Greenplum Text installation phase, by editing the gptext_install_config
file. See Install the VMware Greenplum Text Binary Distribution.
Note: Enabling authentication on a running cluster, changing the password, or deactivating authentication triggers a VMware Greenplum Text cluster reboot.
The following options are available:
Enable password authentication
$ gptext-auth enable-password --username <username> --password <password>
or input the password on the terminal, similar to:
$ gptext-auth enable-password --username <username>
Please input password:
The command asks for user input (y or n) before continuing. --username
is optional and if not provided, the default user account is solr
.
NOTE: Enabling authentication triggers a restart of the VMware Greenplum Text cluster.
Deactivate password authentication
$ gptext-auth disable-password
NOTE: Deactivating authentication triggers a restart of the VMware Greenplum Text cluster.
Change password
$ gptext-auth change-password --old-password <oldpassword> --new-password <newpassword>
or
$ gptext-auth change-password
Please input old password:
Please input new password:
NOTE: Changing the password triggers a restart of the VMware Greenplum Text cluster.
See the gptext-auth reference page for more information about the command options.
Apache ZooKeeper enables coordination between the Apache Solr and VMware Greenplum Text distributed processes through a shared namespace that resembles a file system. In ZooKeeper, a node (called a znode) can contain data, like a file, and can have child znodes, like a directory. ZooKeeper replicates data between multiple instances deployed as a cluster to provide a highly available, fault-tolerant service. Both Solr and VMware Greenplum Text store configuration files and share status by writing data to ZooKeeper znodes. VMware Greenplum Text stores information in the /gptext
znode. The configuration files for a VMware Greenplum Text index are in the /gptext/configs/<index-name>
znode.
The number of ZooKeeper instances in the cluster determines how many ZooKeeper node failures the cluster can tolerate and still remain active. The service remains available as long as a majority of all the nodes in the cluster are running and able to communicate with each other. To tolerate a failure of n nodes, the cluster must have 2n+1 nodes. A cluster of five nodes, for example, can tolerate two failed nodes.
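The 2n+1 rule can be checked with a little arithmetic. The following sketch is only an illustration of the majority requirement. The function names are hypothetical and nothing here calls ZooKeeper.

```python
def tolerated_failures(ensemble_size):
    """Number of nodes that can fail while a majority of the cluster survives."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return (ensemble_size - 1) // 2


def required_ensemble_size(failures):
    """Smallest cluster that tolerates the given number of failed nodes (2n+1)."""
    if failures < 0:
        raise ValueError("failures must not be negative")
    return 2 * failures + 1


for size in (3, 4, 5):
    print(f"{size} nodes tolerate {tolerated_failures(size)} failure(s)")
```

A four-node cluster tolerates one failure, the same as a three-node cluster, which is why ZooKeeper clusters are normally sized with an odd number of nodes.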
ZooKeeper is very fast for read requests because it stores data in memory. If ZooKeeper begins to swap memory to disk, Solr and VMware Greenplum Text performance degrades and operations can fail, so it is critical to allocate sufficient memory to the ZooKeeper Java processes. To avoid ZooKeeper instances competing with VMware Greenplum segments for memory, you should deploy the ZooKeeper instances and VMware Greenplum segments on different hosts. The ZooKeeper and VMware Greenplum hosts must be on the same network and accessible with passwordless SSH by the gpadmin user. You can use the VMware Greenplum gpssh-exkeys
utility to share SSH keys between ZooKeeper and VMware Greenplum hosts.
You must start the ZooKeeper cluster before you start VMware Greenplum Text. When you start VMware Greenplum Text, the Solr nodes each load the replicas for indexes they manage. With large numbers of indexes, shards, and replicas, starting up the cluster can generate a very high, atypical load on ZooKeeper. It can take a long time to get all indexes loaded and some ZooKeeper requests may time out waiting for responses. Using the gptext-start --slow_start
option starts Solr nodes one at a time, providing a more ordered start-up and limiting the number of concurrent ZooKeeper requests.
The VMware Greenplum Text command-line utility zkManager
can be used to monitor the ZooKeeper cluster. If the ZooKeeper cluster is bound to VMware Greenplum Text, you can also start and stop the cluster using zkManager
.
Use the zkManager
utility from the command line to check the ZooKeeper cluster status. The utility lists the hosts, ports, latency, and follower/leader mode for each ZooKeeper instance. If a node is down, its mode is listed as Down.
To check the ZooKeeper cluster status, run the zkManager state
command.
$ zkManager state
20171016:12:59:47:026338 zkManager:gpdb:gpadmin-[INFO]:-Execute zookeeper state process.
20171016:12:59:47:026338 zkManager:gpdb:gpadmin-[INFO]:-Check zookeeper cluster state ...
20171016:12:59:47:026338 zkManager:gpdb:gpadmin-[INFO]:- Host port Latency min/avg/max Mode
20171016:12:59:47:026338 zkManager:gpdb:gpadmin-[INFO]:- gpdb 2189 0/0/22 follower
20171016:12:59:47:026338 zkManager:gpdb:gpadmin-[INFO]:- gpdb 2190 0/0/29 leader
20171016:12:59:47:026338 zkManager:gpdb:gpadmin-[INFO]:- gpdb 2188 0/0/27 follower
20171016:12:59:47:026338 zkManager:gpdb:gpadmin-[INFO]:-Done.
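If you want to check this output from a monitoring script, a sketch along the following lines may help. It assumes the column layout of the sample output above and assumes that a down node is still printed as a row whose last field is Down. The function names are hypothetical.

```python
# Node rows copied from the sample zkManager state output above.
SAMPLE = """\
20171016:12:59:47:026338 zkManager:gpdb:gpadmin-[INFO]:- Host port Latency min/avg/max Mode
20171016:12:59:47:026338 zkManager:gpdb:gpadmin-[INFO]:- gpdb 2189 0/0/22 follower
20171016:12:59:47:026338 zkManager:gpdb:gpadmin-[INFO]:- gpdb 2190 0/0/29 leader
20171016:12:59:47:026338 zkManager:gpdb:gpadmin-[INFO]:- gpdb 2188 0/0/27 follower
"""


def parse_zk_state(output):
    """Return (host, port, mode) for each row whose second field is a port number."""
    nodes = []
    for line in output.splitlines():
        _, sep, rest = line.partition("[INFO]:-")
        fields = rest.split()
        if sep and len(fields) >= 3 and fields[1].isdigit():
            nodes.append((fields[0], int(fields[1]), fields[-1]))
    return nodes


def summarize(nodes):
    """Count leaders and collect nodes reported as Down (assumed spelling)."""
    leaders = sum(1 for node in nodes if node[2].lower() == "leader")
    down = [node for node in nodes if node[2].lower() == "down"]
    return {"leaders": leaders, "down": down}


print(summarize(parse_zk_state(SAMPLE)))
```

A healthy cluster should report exactly one leader and an empty list of down nodes.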
In a database session, you can use the gptext.zookeeper_hosts()
function to list the ZooKeeper hosts.
=# SELECT * FROM gptext.zookeeper_hosts();
host | port
--------+------
gpdb51 | 2188
gpdb51 | 2189
gpdb51 | 2190
(3 rows)
If the ZooKeeper cluster was installed by the VMware Greenplum Text installer, the zkManager
utility can start or stop the ZooKeeper cluster. To start the cluster, run the zkManager start
command.
$ zkManager start
20171016:16:14:46:017845 zkManager:gpdb:gpadmin-[INFO]:-Execute zookeeper start process
20171016:16:14:46:017845 zkManager:gpdb:gpadmin-[INFO]:------------------------------------------------
20171016:16:14:46:017845 zkManager:gpdb:gpadmin-[INFO]:-Starting Zookeeper:
20171016:16:14:46:017845 zkManager:gpdb:gpadmin-[INFO]:------------------------------------------------
20171016:16:14:46:017845 zkManager:gpdb:gpadmin-[INFO]:- Host Zookeeper Dir
20171016:16:14:46:017845 zkManager:gpdb:gpadmin-[INFO]:- gpdb /data/master/zoo0
20171016:16:14:46:017845 zkManager:gpdb:gpadmin-[INFO]:- gpdb /data/master/zoo1
20171016:16:14:46:017845 zkManager:gpdb:gpadmin-[INFO]:- gpdb /data/master/zoo2
20171016:16:14:48:017845 zkManager:gpdb:gpadmin-[INFO]:-Check zookeeper cluster state ...
20171016:16:14:53:017845 zkManager:gpdb:gpadmin-[INFO]:-Done.
To stop ZooKeeper, run the zkManager stop
command.
$ zkManager stop
20171016:16:14:08:016499 zkManager:gpdb:gpadmin-[INFO]:-Execute zookeeper stop process.
20171016:16:14:08:016499 zkManager:gpdb:gpadmin-[INFO]:------------------------------------------------
20171016:16:14:08:016499 zkManager:gpdb:gpadmin-[INFO]:-Stop Zookeeper:
20171016:16:14:08:016499 zkManager:gpdb:gpadmin-[INFO]:------------------------------------------------
20171016:16:14:08:016499 zkManager:gpdb:gpadmin-[INFO]:- Host Zookeeper Dir
20171016:16:14:08:016499 zkManager:gpdb:gpadmin-[INFO]:- gpdb /data/master/zoo0
20171016:16:14:08:016499 zkManager:gpdb:gpadmin-[INFO]:- gpdb /data/master/zoo1
20171016:16:14:08:016499 zkManager:gpdb:gpadmin-[INFO]:- gpdb /data/master/zoo2
20171016:16:14:09:016499 zkManager:gpdb:gpadmin-[INFO]:-Done.
See the zkManager reference for more information.
You can check the status of the SolrCloud cluster and indexes by running the gptext-state
utility from the command line.
To check the state of the VMware Greenplum Text nodes and each index, run the gptext-state
utility with the -D
(--details
) option. Example:
$ gptext-state -D
20180615:16:09:24:031986 gptext-state:mdw:gpadmin-[INFO]:-Execute GPText state ...
20180615:16:09:25:031986 gptext-state:mdw:gpadmin-[INFO]:-Check zookeeper cluster state ...
20180615:16:09:25:031986 gptext-state:mdw:gpadmin-[INFO]:-Check GPText cluster status...
20180615:16:09:25:031986 gptext-state:mdw:gpadmin-[INFO]:-Current GPText Version: 3.0.0
20180615:16:09:25:031986 gptext-state:mdw:gpadmin-[INFO]:-All nodes are up and running.
20180615:16:09:26:031986 gptext-state:mdw:gpadmin-[INFO]:------------------------------------------------
20180615:16:09:26:031986 gptext-state:mdw:gpadmin-[INFO]:-Index state details.
20180615:16:09:26:031986 gptext-state:mdw:gpadmin-[INFO]:------------------------------------------------
20180615:16:09:26:031986 gptext-state:mdw:gpadmin-[INFO]:- database index name state
20180615:16:09:26:031986 gptext-state:mdw:gpadmin-[INFO]:- demo demo.twitter.message Green
20180615:16:09:26:031986 gptext-state:mdw:gpadmin-[INFO]:- demo demo.wikipedia.articles Green
20180615:16:09:26:031986 gptext-state:mdw:gpadmin-[INFO]:-Done.
This command reports the status of the VMware Greenplum Text nodes and status of each VMware Greenplum Text index.
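The index state table in the gptext-state -D output can also be scanned by a script. This is a minimal sketch that assumes the three-column layout of the sample output (database, index name, state) and assumes that the non-green states are printed as Yellow and Red. The function name is hypothetical.

```python
# Index rows copied from the sample gptext-state -D output above.
SAMPLE = """\
20180615:16:09:26:031986 gptext-state:mdw:gpadmin-[INFO]:- database index name state
20180615:16:09:26:031986 gptext-state:mdw:gpadmin-[INFO]:- demo demo.twitter.message Green
20180615:16:09:26:031986 gptext-state:mdw:gpadmin-[INFO]:- demo demo.wikipedia.articles Green
20180615:16:09:26:031986 gptext-state:mdw:gpadmin-[INFO]:-Done.
"""

# Assumed set of state names; only Green appears in the sample output.
KNOWN_STATES = {"green", "yellow", "red"}


def unhealthy_indexes(output):
    """Return (database, index_name, state) for every index that is not Green."""
    problems = []
    for line in output.splitlines():
        _, sep, rest = line.partition("[INFO]:-")
        fields = rest.split()
        if sep and len(fields) == 3 and fields[2].lower() in KNOWN_STATES:
            database, index_name, state = fields
            if state.lower() != "green":
                problems.append((database, index_name, state))
    return problems


print(unhealthy_indexes(SAMPLE))
```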
Run gptext-state list
to view just the indexes.
The gptext-state healthcheck
command checks the VMware Greenplum Text configuration files, the index status, required disk space, user privileges, and index and database consistency. By default, the required disk space check passes if there is at least 20% disk free. You can set a different disk free threshold using the --disk_free
option. For example:
[gpadmin@gpdb-sandbox ~]$ gptext-state healthcheck --disk_free=25
20160629:15:45:24:669652 gptext-state:gpdb-sandbox:gpadmin-[INFO]:-Execute healthcheck on GPText cluster!
20160629:15:45:24:669652 gptext-state:gpdb-sandbox:gpadmin-[INFO]:-Check GPText config files ...
20160629:15:45:24:669652 gptext-state:gpdb-sandbox:gpadmin-[INFO]:-GOOD
20160629:15:45:24:669652 gptext-state:gpdb-sandbox:gpadmin-[INFO]:-Check GPText index status ...
20160629:15:45:25:669652 gptext-state:gpdb-sandbox:gpadmin-[INFO]:-GOOD
20160629:15:45:25:669652 gptext-state:gpdb-sandbox:gpadmin-[INFO]:-Checking for required disk space...
20160629:15:45:25:669652 gptext-state:gpdb-sandbox:gpadmin-[INFO]:-GOOD
20160629:15:45:25:669652 gptext-state:gpdb-sandbox:gpadmin-[INFO]:-Checking for required user privileges...
20160629:15:45:25:669652 gptext-state:gpdb-sandbox:gpadmin-[INFO]:-GOOD
20160629:15:45:25:669652 gptext-state:gpdb-sandbox:gpadmin-[INFO]:-Checking for indexes and database consistency...
20160629:15:45:27:669652 gptext-state:gpdb-sandbox:gpadmin-[INFO]:-GOOD
20160629:15:45:27:669652 gptext-state:gpdb-sandbox:gpadmin-[INFO]:-Done.
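To see what a disk free threshold means in practice, the following sketch computes the free percentage of a file system and compares it with a threshold. It is an illustration only and is not how gptext-state implements its check. The function names are hypothetical.

```python
import shutil


def disk_free_percent(path):
    """Percentage of the file system containing path that is free."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.free / usage.total


def passes_disk_check(path, threshold=20.0):
    """True when at least threshold percent of the disk is free."""
    return disk_free_percent(path) >= threshold


print(f"{disk_free_percent('.'):.1f}% free, passes 25% check: {passes_disk_check('.', 25.0)}")
```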
See the gptext-state
utility reference for additional options.
From VMware Greenplum Text 3.6.0, you can start and stop individual SolrCloud nodes or a group of nodes.
To stop a SolrCloud node, run the gptext-stop
command:
$ gptext-stop --nodes "mdw:18983_solr, sdw1:18983_solr"
Where:
-n|--nodes
is a comma-separated list of nodes to stop. The node name is specified in the format <host>:<port>_solr
.
The gptext-stop
command is interactive and requires y
or n
user input to continue, similar to:
$ gptext-stop -n "test-server3:18983_solr, test-server3:18984_solr"
20210120:03:34:36:010966 gptext-stop:test-server:gpadmin-[INFO]:-Execute GPText cluster stop.
20210120:03:34:36:010966 gptext-stop:test-server:gpadmin-[INFO]:-Check zookeeper cluster state ...
20210120:03:34:37:010966 gptext-stop:test-server:gpadmin-[WARNING]:-Stop some of the Solr nodes might make some indices turns into yellow/red state. Replica recovery is expected after the nodes are up, please make sure there is no new data indexing during the nodes restart.
Solr nodes will be stopped. Do you want to continue ? (y/n): y
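When the node list is generated by a script, the <host>:<port>_solr format is easy to get wrong. The helper below is a small sketch that builds the value for the --nodes option from host and port pairs. The function names are hypothetical, and the script only prints the command line rather than running it.

```python
def solr_node_name(host, port):
    """Format one node name as <host>:<port>_solr."""
    return f"{host}:{port}_solr"


def nodes_argument(nodes):
    """Join (host, port) pairs into a comma-separated --nodes value."""
    return ", ".join(solr_node_name(host, port) for host, port in nodes)


arg = nodes_argument([("mdw", 18983), ("sdw1", 18983)])
print(f'gptext-stop --nodes "{arg}"')
```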
To start a SolrCloud node, run the gptext-start
command:
$ gptext-start --nodes "mdw:18983_solr, sdw1:18983_solr"
Where:
-n|--nodes
is a comma-separated list of nodes to start. The node name is specified in the format <host>:<port>_solr
.
Use the gptext-recover
utility to recover down VMware Greenplum Text nodes, for example after a failed VMware Greenplum segment host is recovered.
With no arguments, the gptext-recover
utility discovers down VMware Greenplum Text nodes and restarts them.
With the -f
(or --force
) option, if a VMware Greenplum Text node cannot be restarted and no shards are down, the node is deleted and created again on the same host. Missing replicas are added and the failed node and failed replicas are removed. If the index is in a red state, gptext-recover -f
will print a message and exit.
The -H
(--new_hosts
) option allows recreating down VMware Greenplum Text nodes on new hosts that replace failed hosts. The down VMware Greenplum Text nodes are deleted and recreated on the new hosts. The argument to the -H
option is a comma-separated list of the new hosts that are to replace the failed hosts. The number of new hosts must match the number of failed hosts. If shards are down, it advises reindexing. If only some replicas are down, it recreates the replicas on the new hosts and updates gptext.conf
.
The -r
option recovers replicas, but does not attempt to recover any down nodes.
Note: Before recovering VMware Greenplum Text nodes on newly added hosts, ensure that the following VMware Greenplum Text prerequisites have been installed on the host:
lsof
utility
You can view Solr index statistics by running the gptext-state
utility from the command line.
To list all VMware Greenplum Text indexes, enter the following command at the command line:
gptext-state list
A command line that retrieves all statistics for an index:
gptext-state --index demo.wikipedia.articles
A command line that retrieves the number of documents in an index:
gptext-state --index demo.wikipedia.articles --stats_columns=num_docs
A command line that retrieves num_docs
, index size
, and the date and time last_modified
:
gptext-state --index demo.wikipedia.articles --stats_columns num_docs,size,last_modified
With the gptext-backup
management utility, you can back up a VMware Greenplum Text index so that, if needed, you can quickly recover from a failure. The backup can be restored to the same VMware Greenplum Text system or to another system with the same number of VMware Greenplum segments.
The gptext-backup
management utility backs up an index and its configuration files to either a shared file system, which must be mounted on and writable by each host in the VMware Greenplum cluster, or to local storage on the VMware Greenplum master and segment hosts.
To back up on a shared file system, use the -p
(--path
) command-line option to specify the location of a directory on the mounted file system and the -n
(--name
) option to provide a name for the backup. Specify the index to back up with the -i
(--index
) option.
$ gptext-backup -i <index-name> -p <path> -n <backup-name>
The gptext-backup
utility then checks that:
-n
option does not already exist in the directory specified with the -p
option
The utility creates the new directory and then saves one copy of each index shard to that directory, along with the index's configuration files from ZooKeeper.
To save the configuration files only, with no data, add the -c
(--backup_conf
) command-line option.
To restore an index from a shared file system, use the gptext-restore
management utility. The VMware Greenplum Text system you restore to must be on a VMware Greenplum cluster with the same number of segments. The database and schema for the index must be present.
The -i
(--index
) option specifies the name of the VMware Greenplum Text index that will be restored. If the index exists, you must first drop it with the gptext.drop_index()
user-defined function.
The -p
(--path
) option specifies the location of the directory containing the backup files—the directory that gptext-backup
created on the shared file system.
$ gptext-restore -i <index-name> -p <path>
You can add the -c
option to restore only the configuration files to ZooKeeper and create an empty VMware Greenplum Text index, without restoring any saved index data.
To back up to local storage on the VMware Greenplum cluster, add the local
keyword to the gptext-backup
command-line.
A local VMware Greenplum Text backup has a unique name constructed by appending a timestamp to the index name. You do not use the -n
option with local backups.
$ gptext-backup local -i <index-name>
On the master host, in the master data directory by default, the backup utility saves a JSON file with backup metadata and a directory containing the index's configuration files from ZooKeeper.
The utility backs up each index shard on the VMware Greenplum segment host with the VMware Greenplum Text node that manages the shard's lead replica. By default, the shard backup files are saved in a segment data directory.
The gptext-backup
command output reports the locations of all backup files.
You can add the -p
(--path
) option to the gptext-backup
command to specify a local directory where the backup will be saved. The directory must be present on every VMware Greenplum host and must be writeable by the gpadmin user.
$ gptext-backup local -i <index-name> -p <path>
The backup files will be saved in the specified directory on each host instead of in the VMware Greenplum master and segment data directories.
To restore a backup saved to local storage, add the local
keyword to the gptext-restore
command-line and specify the path to the backup directory on the master host.
$ gptext-restore local -p <path>
The <path>
is the full path to the directory the gptext-backup
command created on the master host, including the timestamp, for example $MASTER_DATA_DIRECTORY/demo.twitter.message_2018-05-08T15:32:21.397779
.
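Because a local backup directory name is the index name followed by a timestamp, a script can recover both parts from the path. The sketch below assumes the name format of the example above, with an underscore separating the index name from an ISO 8601 timestamp. The function name is hypothetical.

```python
import os
from datetime import datetime


def parse_local_backup_dir(path):
    """Split a local backup directory name into (index_name, timestamp)."""
    name = os.path.basename(path.rstrip("/"))
    # The timestamp contains no underscore, so split on the last one.
    index_name, _, stamp = name.rpartition("_")
    return index_name, datetime.fromisoformat(stamp)


index_name, taken_at = parse_local_backup_dir(
    "$MASTER_DATA_DIRECTORY/demo.twitter.message_2018-05-08T15:32:21.397779"
)
print(index_name, taken_at.isoformat())
```

Splitting on the last underscore keeps index names that themselves contain underscores intact.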
See the gptext-backup reference for syntax and examples for running gptext-backup
. See the gptext-restore reference for syntax and examples for running gptext-restore
.
The gptext-expand
management utility adds VMware Greenplum Text nodes to the cluster. There are two ways to add nodes:
gpexpand
management utility to expand the VMware Greenplum system.
To add nodes to existing segment hosts, run the gptext-expand
utility with a command like the following:
gptext-expand -e -p /data1/nodes,/data2/nodes
This example adds two VMware Greenplum Text nodes to each host.
The -e
(--existing
) option specifies that nodes are to be added to existing hosts.
The -p
(--expand_paths
) option provides a list of directories where the new nodes' data directories are to be created. These should be the same directories that contain the VMware Greenplum segment data directories and existing VMware Greenplum Text data directories. The number of directories in the list is the number of new nodes that are added.
A directory can be repeated in the directory list multiple times to increase the number of new VMware Greenplum Text nodes to create. For example, if there is currently one VMware Greenplum Text node per host in the /data1/nodes
directory, you could add three nodes with a command like the following:
gptext-expand -e -p /data1/nodes,/data2/nodes,/data2/nodes
This adds one node to the /data1/nodes
directory and two nodes to the /data2/nodes
directory so there are two VMware Greenplum Text nodes in each directory.
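The node count that results from a given directory list can be worked out before running the command. The sketch below tallies nodes per directory for the example above. It only does the arithmetic and does not call gptext-expand. The function name is hypothetical.

```python
from collections import Counter


def nodes_per_directory(existing_dirs, expand_paths):
    """Count nodes per directory on each host after an expansion.

    existing_dirs lists one entry per existing node on a host;
    expand_paths is the comma-separated value given to the -p option.
    """
    counts = Counter(existing_dirs)
    counts.update(expand_paths.split(","))
    return dict(counts)


print(nodes_per_directory(["/data1/nodes"], "/data1/nodes,/data2/nodes,/data2/nodes"))
```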
Adding VMware Greenplum Text nodes affects new indexes, but not existing indexes. Replicas for new indexes will be distributed across all of the nodes, including both old nodes and the newly created nodes. Replicas for indexes that existed before running gptext-expand
are not automatically moved. You can use the gptext-rebalance
command to relocate replicas to new nodes.
Check that the following VMware Greenplum Text prerequisites are installed on each new host added to the VMware Greenplum cluster:
lsof
utility
New hosts must be reachable by all hosts in the VMware Greenplum Text cluster, including existing hosts and the new hosts you are adding.
After expanding the VMware Greenplum cluster with the gpexpand
management utility, call gptext-expand
with the -H
(--new_hosts
) option and a list of the new hosts on which to install VMware Greenplum Text:
gptext-expand -H newhost1,newhost2
The gptext-expand
utility installs VMware Greenplum Text binaries on the new hosts and then creates new VMware Greenplum Text nodes on the new hosts.
Newly created indexes will automatically be distributed among the new nodes. You can use the gptext-rebalance
command to relocate replicas to new nodes.
When expanding the VMware Greenplum Text cluster with new nodes, rebalance the replicas to the new nodes, and rebalance the replica leaders.
Use gptext-rebalance index
to rebalance the replicas for a specific index across all VMware Greenplum Text nodes.
$ gptext-rebalance index -i demo.public.test
See the gptext-rebalance
reference for more details about the options and the rebalance rules.
When some SolrCloud cluster nodes have more replica leaders than other nodes, use the gptext-rebalance leader
command to balance the leaders across the nodes.
To verify the state of the leaders in an index called demo.public.test
, use a SQL command like:
SELECT index_name, core, node_name, is_leader
FROM gptext.index_status()
WHERE index_name='demo.public.test';
The output is similar to:
index_name | core | node_name | is_leader
-------------------+-------------------------------------+--------------------+-----------
demo.public.test | demo.public.test_shard0_replica_n1 | gpadmin:18983_solr | t
demo.public.test | demo.public.test_shard0_replica_n2 | gpadmin:18984_solr | f
demo.public.test | demo.public.test_shard1_replica_n4 | gpadmin:18984_solr | f
demo.public.test | demo.public.test_shard1_replica_n7 | gpadmin:18983_solr | t
In this example, node 18983_solr
holds both replica leaders and node 18984_solr
holds none. Rebalance the leaders across the nodes using:
$ gptext-rebalance leader -i demo.public.test
The leaders are spread across the nodes similar to:
index_name | core | node_name | is_leader
-------------------+-------------------------------------+--------------------+-----------
demo.public.test | demo.public.test_shard0_replica_n1 | gpadmin:18983_solr | f
demo.public.test | demo.public.test_shard0_replica_n2 | gpadmin:18984_solr | t
demo.public.test | demo.public.test_shard1_replica_n4 | gpadmin:18984_solr | f
demo.public.test | demo.public.test_shard1_replica_n7 | gpadmin:18983_solr | t
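The leader distribution can be summarized by counting the rows where is_leader is true for each node. The following sketch uses the two result sets shown above as plain tuples of core, node name, and leader flag. The function name is hypothetical, and in practice the rows would come from the gptext.index_status() query.

```python
from collections import Counter

# (core, node_name, is_leader) rows from the output before rebalancing.
BEFORE = [
    ("demo.public.test_shard0_replica_n1", "gpadmin:18983_solr", True),
    ("demo.public.test_shard0_replica_n2", "gpadmin:18984_solr", False),
    ("demo.public.test_shard1_replica_n4", "gpadmin:18984_solr", False),
    ("demo.public.test_shard1_replica_n7", "gpadmin:18983_solr", True),
]

# Rows from the output after gptext-rebalance leader.
AFTER = [
    ("demo.public.test_shard0_replica_n1", "gpadmin:18983_solr", False),
    ("demo.public.test_shard0_replica_n2", "gpadmin:18984_solr", True),
    ("demo.public.test_shard1_replica_n4", "gpadmin:18984_solr", False),
    ("demo.public.test_shard1_replica_n7", "gpadmin:18983_solr", True),
]


def leaders_per_node(rows):
    """Count replica leaders on each node, including nodes with none."""
    counts = Counter({node: 0 for _, node, _ in rows})
    for _, node, is_leader in rows:
        if is_leader:
            counts[node] += 1
    return dict(counts)


print("before:", leaders_per_node(BEFORE))
print("after: ", leaders_per_node(AFTER))
```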
VMware Greenplum Text errors are of the following types:
gptext
errors
Most of the Solr errors are self-explanatory.
gptext
errors are caused by misuse of a function or utility. They provide a message that tells you when you have used an incorrect function or argument.
You can examine the VMware Greenplum and Solr logs for more information if errors occur. VMware Greenplum logs reside in:
segment-directory/pg_log
Solr logs reside in:
<GPDB path>/solr/logs
Use the gptext-state
utility to determine if any primary or mirror segments are down. See gptext-state
in the VMware Greenplum Text Management Utilities Reference.