Defines a new operator.
CREATE OPERATOR <name> (
PROCEDURE = <funcname>
[, LEFTARG = <lefttype>] [, RIGHTARG = <righttype>]
[, COMMUTATOR = <com_op>] [, NEGATOR = <neg_op>]
[, RESTRICT = <res_proc>] [, JOIN = <join_proc>]
[, HASHES] [, MERGES] )
CREATE OPERATOR
defines a new operator. The user who defines an operator becomes its owner.
The operator name is a sequence of up to NAMEDATALEN
-1 (63 by default) characters from the following list: + - * / < > = ~ ! @ # % ^ & |
?`
There are a few restrictions on your choice of name:
--
and /*
cannot appear anywhere in an operator name, since they will be taken as the start of a comment.+
or -
, unless the name also contains at least one of these characters: ~ ! @ # % ^ & |
?`For example, @-
is an allowed operator name, but *-
is not. This restriction allows Greenplum Database to parse SQL-compliant commands without requiring spaces between tokens.
The operator !=
is mapped to <>
on input, so these two names are always equivalent.
At least one of LEFTARG
and RIGHTARG
must be defined. For binary operators, both must be defined. For right unary operators, only LEFTARG
should be defined, while for left unary operators only RIGHTARG
should be defined.
The funcname procedure must have been previously defined using CREATE FUNCTION
, must be IMMUTABLE
, and must be defined to accept the correct number of arguments (either one or two) of the indicated types.
The other clauses specify optional operator optimization clauses. These clauses should be provided whenever appropriate to speed up queries that use the operator. But if you provide them, you must be sure that they are correct. Incorrect use of an optimization clause can result in server process crashes, subtly wrong output, or other unexpected results. You can always leave out an optimization clause if you are not sure about it.
IMMUTABLE
function).
COMMUTATOR
clause names an operator that is the commutator of the operator being defined. We say that operator A is the commutator of operator B if (x A y) equals (y B x) for all possible input values x, y. Notice that B is also the commutator of A. For example, operators
<
and
>
for a particular data type are usually each others commutators, and operator + is usually commutative with itself. But operator
-
is usually not commutative with anything. The left operand type of a commutable operator is the same as the right operand type of its commutator, and vice versa. So the name of the commutator operator is all that needs to be provided in the
COMMUTATOR
clause.
NEGATOR
clause names an operator that is the negator of the operator being defined. We say that operator A is the negator of operator B if both return Boolean results and (x A y) equals NOT (x B y) for all possible inputs x, y. Notice that B is also the negator of A. For example,
<
and
>=
are a negator pair for most data types. An operator's negator must have the same left and/or right operand types as the operator to be defined, so only the operator name need be given in the
NEGATOR
clause.
The optional RESTRICT
names a restriction selectivity estimation function for the operator. Note that this is a function name, not an operator name. RESTRICT
clauses only make sense for binary operators that return boolean
. The idea behind a restriction selectivity estimator is to guess what fraction of the rows in a table will satisfy a WHERE
-clause condition of the form:
column OP constant
WHERE
clauses that have this form.
You can usually just use one of the following system standard estimator functions for many of your own operators:
eqsel
for =
neqsel
for <>
scalarltsel
for < or ⇐
scalargtsel
for > or >=
The optional JOIN
clause names a join selectivity estimation function for the operator. Note that this is a function name, not an operator name. JOIN
clauses only make sense for binary operators that return boolean
. The idea behind a join selectivity estimator is to guess what fraction of the rows in a pair of tables will satisfy a WHERE
-clause condition of the form
table1.column1 OP table2.column2
You can usually just use one of the following system standard join selectivity estimator functions for many of your own operators:
eqjoinsel
for =
neqjoinsel
for <>
scalarltjoinsel
for < or ⇐
scalargtjoinsel
for > or >=
areajoinsel
for 2D area-based comparisons
positionjoinsel
for 2D position-based comparisons
contjoinsel
for 2D containment-based comparisons
The optional HASHES
clause tells the system that it is permissible to use the hash join method for a join based on this operator. HASHES
only makes sense for a binary operator that returns boolean
. The hash join operator can only return true for pairs of left and right values that hash to the same hash code. If two values are put in different hash buckets, the join will never compare them, implicitly assuming that the result of the join operator must be false. Because of this, it never makes sense to specify HASHES
for operators that do not represent equality.
In most cases, it is only practical to support hashing for operators that take the same data type on both sides. However, you can design compatible hash functions for two or more data types, which are functions that will generate the same hash codes for "equal" values, even if the values are differently represented.
To be marked HASHES
, the join operator must appear in a hash index operator class. Attempts to use the operator in hash joins will fail at run time if no such operator class exists. The system needs the operator class to find the data-type-specific hash function for the operator's input data type. You must also supply a suitable hash function before you can create the operator class. Exercise care when preparing a hash function, as there are machine-dependent ways in which it could fail to function correctly. For example, on machines that meet the IEEE floating-point standard, negative zero and positive zero are different values (different bit patterns) but are defined to compare as equal. If a float value could contain a negative zero, define it to generate the same hash value as positive zero.
A hash-joinable operator must have a commutator (itself, if the two operand data types are the same, or a related equality operator if they are different) that appears in the same operator family. Otherwise, planner errors can occur when the operator is used. For better optimization, a hash operator family that supports multiple data types should provide equality operators for every combination of the data types.
Note: The function underlying a hash-joinable operator must be marked immutable or stable; an operator marked as volatile will not be used. If a hash-joinable operator has an underlying function that is marked strict, the function must also be complete, returning true or false, and not null, for any two non-null inputs.
The MERGES
clause, if present, tells the system that it is permissible to use the merge-join method for a join based on this operator. MERGES
only makes sense for a binary operator that returns boolean
, and in practice the operator must represent equality for some data type or pair of data types.
Merge join is based on the idea of sorting the left- and right-hand tables into order and then scanning them in parallel. This means both data types must be capable of being fully ordered, and the join operator must be one that can only succeed for pairs of values that fall at equivalent places in the sort order. In practice, this means that the join operator must behave like an equality operator. However, you can merge-join two distinct data types so long as they are logically compatible. For example, the smallint-versus-integer
equality operator is merge-joinable. Only sorting operators that bring both data types into a logically compatible sequence are needed.
To be marked MERGES
, the join operator must appear as an equality member of a btree index operator family. This is not enforced when you create the operator, because the referencing operator family does not exist until later. However, the operator will not actually be used for merge joins unless a matching operator family can be found. The MERGE
flag thus acts as a suggestion to the planner to look for a matching operator family.
A merge-joinable operator must have a commutator that appears in the same operator family. This would be itself, if the two operand data types are the same, or a related equality operator if the data types are different. Without an appropriate commutator, planner errors can occur when the operator is used. Also, although not strictly required, a btree operator family that supports multiple data types should be able to provide equality operators for every combination of the data types; this allows better optimization.
Note: SORT1, SORT2, LTCMP, and GTCMP were formerly used to specify the names of sort operators associated with a merge-joinable operator. Information about associated operators is now found by looking at B-tree operator families; specifying any of these operators will be ignored, except that it will implicitly set MERGES to true.
Any functions used to implement the operator must be defined as IMMUTABLE
.
Here is an example of creating an operator for adding two complex numbers, assuming we have already created the definition of type complex
. First define the function that does the work, then define the operator:
CREATE FUNCTION complex_add(complex, complex)
RETURNS complex
AS 'filename', 'complex_add'
LANGUAGE C IMMUTABLE STRICT;
CREATE OPERATOR + (
leftarg = complex,
rightarg = complex,
procedure = complex_add,
commutator = +
);
To use this operator in a query:
SELECT (a + b) AS c FROM test_complex;
CREATE OPERATOR
is a Greenplum Database language extension. The SQL standard does not provide for user-defined operators.
CREATE FUNCTION, CREATE TYPE, ALTER OPERATOR, DROP OPERATOR
Parent topic: SQL Command Reference