The LOCATION clause of the CREATE EXTERNAL TABLE command for HDFS files differs slightly for Hadoop HA (High Availability) clusters, Hadoop clusters without HA, and MapR clusters.
In a Hadoop HA cluster, the LOCATION clause references the logical nameservices ID (the dfs.nameservices property in the hdfs-site.xml configuration file). The hdfs-site.xml file with the nameservices configuration must be installed on the Greenplum master and on each segment host.
For example, if dfs.nameservices is set to a given nameservices ID, the LOCATION clause takes this format:
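As a minimal sketch of the HA form, assume dfs.nameservices is set to the hypothetical value mycluster. The table name, columns, and file path below are also illustrative and do not come from this page. The nameservices ID appears where a host and port would otherwise go:

```sql
-- Hypothetical setup: hdfs-site.xml contains dfs.nameservices = mycluster.
-- Table name, columns, and path are illustrative only.
CREATE EXTERNAL TABLE sales_ha (id int, amount numeric)
LOCATION ('gphdfs://mycluster/path/filename.txt')
FORMAT 'TEXT' (DELIMITER ',');
```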
A cluster without HA specifies the hostname and port of the name node in the LOCATION clause:
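A sketch of the non-HA form, under stated assumptions: the host name hdfs-host.example.com is a placeholder, and port 8020 is only a commonly used name node port, so substitute the values from your own cluster. The table definition is illustrative.

```sql
-- Assumed values: hdfs-host.example.com and 8020 are placeholders
-- for the name node host and port of your cluster.
CREATE EXTERNAL TABLE sales_nn (id int, amount numeric)
LOCATION ('gphdfs://hdfs-host.example.com:8020/path/filename.txt')
FORMAT 'TEXT' (DELIMITER ',');
```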
If you are using MapR clusters, you specify the cluster and the file:
To specify the default cluster, which is the first entry in the MapR configuration file /opt/mapr/conf/mapr-clusters.conf, specify the location of your table with this syntax:
The file_path is the path to the file.
To specify another MapR cluster listed in the configuration file, specify the file with this syntax:
The cluster_name is the name of the cluster specified in the configuration file and file_path is the path to the file.
For information about MapR clusters, see the MapR documentation.
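The two MapR forms can be sketched as follows. This assumes the gphdfs convention of an empty authority (three slashes) for the default cluster and a /mapr/cluster_name prefix for a named cluster; verify both against the gphdfs reference for your Greenplum version. The cluster name my.cluster.com, the table names, and the paths are hypothetical.

```sql
-- Assumed form for the default cluster (first entry in mapr-clusters.conf):
-- no host in the URI, only the file path.
CREATE EXTERNAL TABLE events_default (id int, payload text)
LOCATION ('gphdfs:///data/events/part-0001.txt')
FORMAT 'TEXT' (DELIMITER ',');

-- Assumed form for another cluster listed in mapr-clusters.conf;
-- my.cluster.com stands in for cluster_name.
CREATE EXTERNAL TABLE events_named (id int, payload text)
LOCATION ('gphdfs:///mapr/my.cluster.com/data/events/part-0001.txt')
FORMAT 'TEXT' (DELIMITER ',');
```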
Restrictions for HDFS files are as follows.
You can specify one path for a readable external table with gphdfs. Wildcard characters are allowed. If you specify a directory, the default is all files in the directory.
You can specify only a directory for writable external tables.
The URI of the LOCATION clause cannot contain any of these four characters:
CREATE EXTERNAL TABLE returns an error if the URI contains any of these characters.
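The path restrictions above might look like this in practice. This is a sketch only: the host, port, directories, and table definitions are placeholders. The readable table uses a single path with a wildcard, and the writable table points at a directory rather than a file.

```sql
-- Readable table: one path, wildcard allowed (placeholder host and path).
CREATE EXTERNAL TABLE logs_in (id int, message text)
LOCATION ('gphdfs://hdfs-host.example.com:8020/data/logs/*.txt')
FORMAT 'TEXT' (DELIMITER ',');

-- Writable table: the location must be a directory.
CREATE WRITABLE EXTERNAL TABLE logs_out (id int, message text)
LOCATION ('gphdfs://hdfs-host.example.com:8020/data/logs_out')
FORMAT 'TEXT' (DELIMITER ',');
```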
Format restrictions are as follows.
Only the gphdfs_import formatter is allowed for readable external tables with a custom format.
Only the gphdfs_export formatter is allowed for writable external tables with a custom format.
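A short sketch of how the two formatters pair with table direction when the format is custom. The host, port, paths, and table definitions are placeholders, not values from this page.

```sql
-- Readable table with a custom format uses the gphdfs_import formatter.
CREATE EXTERNAL TABLE orders_in (id int, total numeric)
LOCATION ('gphdfs://hdfs-host.example.com:8020/data/orders')
FORMAT 'custom' (formatter='gphdfs_import');

-- Writable table with a custom format uses the gphdfs_export formatter.
CREATE WRITABLE EXTERNAL TABLE orders_out (id int, total numeric)
LOCATION ('gphdfs://hdfs-host.example.com:8020/data/orders_out')
FORMAT 'custom' (formatter='gphdfs_export');
```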
Compression options for Hadoop Writable External Tables use the form of a URI query and begin with a question mark. Specify multiple compression options with an ampersand (&).
|Compression Option||Values||Default Value|
|codec||Codec class name||
||integer between 1 and 9||
The level controls the trade-off between speed and compression. Valid values are 1 to 9, where 1 is the fastest and 9 is the most compressed.
Place compression options in the query portion of the URI.
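As an illustration of the query form, the sketch below appends two options to a writable table's URI. Only codec appears in the table above. The compress option name is an assumption drawn from the gphdfs option set and should be checked against the full reference. The host, port, path, and table definition are placeholders, and GzipCodec is used as one example of a Hadoop codec class name.

```sql
-- Options start after '?' and are joined with '&'.
-- compress=true is an assumed option name; codec takes a codec class name.
CREATE WRITABLE EXTERNAL TABLE metrics_out (id int, value numeric)
LOCATION ('gphdfs://hdfs-host.example.com:8020/data/metrics_out?compress=true&codec=org.apache.hadoop.io.compress.GzipCodec')
FORMAT 'TEXT' (DELIMITER ',');
```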