Defines a new operator.
CREATE OPERATOR <name> ( PROCEDURE = <funcname> [, LEFTARG = <lefttype>] [, RIGHTARG = <righttype>] [, COMMUTATOR = <com_op>] [, NEGATOR = <neg_op>] [, RESTRICT = <res_proc>] [, JOIN = <join_proc>] [, HASHES] [, MERGES] )
CREATE OPERATOR defines a new operator. The user who defines an operator becomes its owner.
The operator name is a sequence of up to
NAMEDATALEN-1 (63 by default) characters from the following list:
+ - * / < > = ~ ! @ # % ^ & | ?`
There are a few restrictions on your choice of name:
/*cannot appear anywhere in an operator name, since they will be taken as the start of a comment.
-, unless the name also contains at least one of these characters:
~ ! @ # % ^ & |?`
@- is an allowed operator name, but
*- is not. This restriction allows Greenplum Database to parse SQL-compliant commands without requiring spaces between tokens.
!= is mapped to
<> on input, so these two names are always equivalent.
At least one of
RIGHTARG must be defined. For binary operators, both must be defined. For right unary operators, only
LEFTARG should be defined, while for left unary operators only
RIGHTARG should be defined.
The funcname procedure must have been previously defined using
CREATE FUNCTION, must be
IMMUTABLE, and must be defined to accept the correct number of arguments (either one or two) of the indicated types.
The other clauses specify optional operator optimization clauses. These clauses should be provided whenever appropriate to speed up queries that use the operator. But if you provide them, you must be sure that they are correct. Incorrect use of an optimization clause can result in server process crashes, subtly wrong output, or other unexpected results. You can always leave out an optimization clause if you are not sure about it.
COMMUTATORclause names an operator that is the commutator of the operator being defined. We say that operator A is the commutator of operator B if (x A y) equals (y B x) for all possible input values x, y. Notice that B is also the commutator of A. For example, operators
>for a particular data type are usually each others commutators, and operator + is usually commutative with itself. But operator
-is usually not commutative with anything. The left operand type of a commutable operator is the same as the right operand type of its commutator, and vice versa. So the name of the commutator operator is all that needs to be provided in the
NEGATORclause names an operator that is the negator of the operator being defined. We say that operator A is the negator of operator B if both return Boolean results and (x A y) equals NOT (x B y) for all possible inputs x, y. Notice that B is also the negator of A. For example,
>=are a negator pair for most data types. An operator's negator must have the same left and/or right operand types as the operator to be defined, so only the operator name need be given in the
RESTRICT names a restriction selectivity estimation function for the operator. Note that this is a function name, not an operator name.
RESTRICT clauses only make sense for binary operators that return
boolean. The idea behind a restriction selectivity estimator is to guess what fraction of the rows in a table will satisfy a
WHERE-clause condition of the form:
column OP constant
WHEREclauses that have this form.
You can usually just use one of the following system standard estimator functions for many of your own operators:
eqsel for =
neqsel for <>
scalarltsel for < or ⇐
scalargtsel for > or >=
JOIN clause names a join selectivity estimation function for the operator. Note that this is a function name, not an operator name.
JOIN clauses only make sense for binary operators that return
boolean. The idea behind a join selectivity estimator is to guess what fraction of the rows in a pair of tables will satisfy a
WHERE-clause condition of the form
table1.column1 OP table2.column2
You can usually just use one of the following system standard join selectivity estimator functions for many of your own operators:
eqjoinsel for =
neqjoinsel for <>
scalarltjoinsel for < or ⇐
scalargtjoinsel for > or >=
areajoinsel for 2D area-based comparisons
positionjoinsel for 2D position-based comparisons
contjoinsel for 2D containment-based comparisons
HASHES clause tells the system that it is permissible to use the hash join method for a join based on this operator.
HASHES only makes sense for a binary operator that returns
boolean. The hash join operator can only return true for pairs of left and right values that hash to the same hash code. If two values are put in different hash buckets, the join will never compare them, implicitly assuming that the result of the join operator must be false. Because of this, it never makes sense to specify
HASHES for operators that do not represent equality.
In most cases, it is only practical to support hashing for operators that take the same data type on both sides. However, you can design compatible hash functions for two or more data types, which are functions that will generate the same hash codes for "equal" values, even if the values are differently represented.
To be marked
HASHES, the join operator must appear in a hash index operator class. Attempts to use the operator in hash joins will fail at run time if no such operator class exists. The system needs the operator class to find the data-type-specific hash function for the operator's input data type. You must also supply a suitable hash function before you can create the operator class. Exercise care when preparing a hash function, as there are machine-dependent ways in which it could fail to function correctly. For example, on machines that meet the IEEE floating-point standard, negative zero and positive zero are different values (different bit patterns) but are defined to compare as equal. If a float value could contain a negative zero, define it to generate the same hash value as positive zero.
A hash-joinable operator must have a commutator (itself, if the two operand data types are the same, or a related equality operator if they are different) that appears in the same operator family. Otherwise, planner errors can occur when the operator is used. For better optimization, a hash operator family that supports multiple data types should provide equality operators for every combination of the data types.
Note: The function underlying a hash-joinable operator must be marked immutable or stable; an operator marked as volatile will not be used. If a hash-joinable operator has an underlying function that is marked strict, the function must also be complete, returning true or false, and not null, for any two non-null inputs.
MERGES clause, if present, tells the system that it is permissible to use the merge-join method for a join based on this operator.
MERGES only makes sense for a binary operator that returns
boolean, and in practice the operator must represent equality for some data type or pair of data types.
Merge join is based on the idea of sorting the left- and right-hand tables into order and then scanning them in parallel. This means both data types must be capable of being fully ordered, and the join operator must be one that can only succeed for pairs of values that fall at equivalent places in the sort order. In practice, this means that the join operator must behave like an equality operator. However, you can merge-join two distinct data types so long as they are logically compatible. For example, the
smallint-versus-integer equality operator is merge-joinable. Only sorting operators that bring both data types into a logically compatible sequence are needed.
To be marked
MERGES, the join operator must appear as an equality member of a btree index operator family. This is not enforced when you create the operator, because the referencing operator family does not exist until later. However, the operator will not actually be used for merge joins unless a matching operator family can be found. The
MERGE flag thus acts as a suggestion to the planner to look for a matching operator family.
A merge-joinable operator must have a commutator that appears in the same operator family. This would be itself, if the two operand data types are the same, or a related equality operator if the data types are different. Without an appropriate commutator, planner errors can occur when the operator is used. Also, although not strictly required, a btree operator family that supports multiple data types should be able to provide equality operators for every combination of the data types; this allows better optimization.
Note: SORT1, SORT2, LTCMP, and GTCMP were formerly used to specify the names of sort operators associated with a merge-joinable operator. Information about associated operators is now found by looking at B-tree operator families; specifying any of these operators will be ignored, except that it will implicitly set MERGES to true.
Any functions used to implement the operator must be defined as
Here is an example of creating an operator for adding two complex numbers, assuming we have already created the definition of type
complex. First define the function that does the work, then define the operator:
CREATE FUNCTION complex_add(complex, complex) RETURNS complex AS 'filename', 'complex_add' LANGUAGE C IMMUTABLE STRICT; CREATE OPERATOR + ( leftarg = complex, rightarg = complex, procedure = complex_add, commutator = + );
To use this operator in a query:
SELECT (a + b) AS c FROM test_complex;
CREATE OPERATOR is a Greenplum Database language extension. The SQL standard does not provide for user-defined operators.
Parent topic: SQL Command Reference