Defines a new function.
CREATE [OR REPLACE] FUNCTION <name>
    ( [ [<argmode>] [<argname>] <argtype> [ { DEFAULT | = } <defexpr> ] [, ...] ] )
    [ RETURNS { [ SETOF ] <rettype>
        | TABLE ( [ { <argname> <argtype> | LIKE <other_table> } [, ...] ] )
        } ]
    { LANGUAGE <langname>
    | IMMUTABLE | STABLE | VOLATILE
    | CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT | STRICT
    | NO SQL | CONTAINS SQL | READS SQL DATA | MODIFIES SQL DATA
    | [EXTERNAL] SECURITY INVOKER | [EXTERNAL] SECURITY DEFINER
    | COST <execution_cost>
    | SET <configuration_parameter> { TO <value> | = <value> | FROM CURRENT }
    | AS '<definition>'
    | AS '<obj_file>', '<link_symbol>' } ...
    [ WITH ( { DESCRIBE = <describe_function> } [, ...] ) ]
CREATE FUNCTION defines a new function. CREATE OR REPLACE FUNCTION either creates a new function or replaces an existing definition.
The name of the new function must not match any existing function with the same argument types in the same schema. However, functions of different argument types may share a name (overloading).
To update the definition of an existing function, use CREATE OR REPLACE FUNCTION. It is not possible to change the name or argument types of a function this way (doing so would actually create a new, distinct function). Also, CREATE OR REPLACE FUNCTION will not let you change the return type of an existing function. To do that, you must drop and recreate the function. If you drop and then recreate a function, the new function is not the same object as the old one, so you also have to drop existing objects (rules, views, triggers, and so on) that refer to the old function. Use CREATE OR REPLACE FUNCTION to change a function definition without breaking objects that refer to the function.
For more information about creating functions, see the User Defined Functions section of the PostgreSQL documentation.
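As a brief illustration of the difference between replacing and recreating, the following sketch uses a hypothetical function named greet (it is not part of the original reference). Replacing the body keeps the same function object, while changing the return type requires dropping the function first.

```sql
-- Hypothetical function, used only for illustration.
CREATE FUNCTION greet(text) RETURNS text
    AS $$ SELECT 'Hello, ' || $1 $$
    LANGUAGE SQL IMMUTABLE;

-- Same name, argument types, and return type: the body is replaced in place,
-- so objects that refer to greet(text) keep working.
CREATE OR REPLACE FUNCTION greet(text) RETURNS text
    AS $$ SELECT 'Hi, ' || $1 $$
    LANGUAGE SQL IMMUTABLE;

-- CREATE OR REPLACE FUNCTION cannot change the return type,
-- so the function is dropped and then created again.
DROP FUNCTION greet(text);
CREATE FUNCTION greet(text) RETURNS integer
    AS $$ SELECT length($1) $$
    LANGUAGE SQL IMMUTABLE;
```

Any view, rule, or trigger that depended on the original greet(text) would have to be dropped before the DROP FUNCTION step and recreated afterward.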
Limited Use of VOLATILE and STABLE Functions
To prevent data from becoming out-of-sync across the segments in Greenplum Database, any function classified as STABLE or VOLATILE cannot be executed at the segment level if it contains SQL or modifies the database in any way. For example, functions such as random() or timeofday() are not allowed to execute on distributed data in Greenplum Database because they could potentially cause inconsistent data between the segment instances.
To ensure data consistency, VOLATILE and STABLE functions can safely be used in statements that are evaluated on and execute from the master. For example, the following statements are always executed on the master (statements without a FROM clause):
SELECT setval('myseq', 201);
SELECT foo();
In cases where a statement has a FROM clause containing a distributed table and the function used in the FROM clause simply returns a set of rows, execution may be allowed on the segments:
SELECT * FROM foo();
The exceptions to this rule are functions that return a table reference (rangeFuncs) and functions that use the refcursor data type. Note that you cannot return a refcursor from any kind of function in Greenplum Database.
argmode
The mode of an argument: IN, OUT, INOUT, or VARIADIC. Only OUT arguments can follow an argument declared as VARIADIC. If omitted, the default is IN.
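A minimal sketch of the argument modes, assuming a hypothetical function named min_max: a VARIADIC input array is followed only by OUT parameters, and the unmarked default mode would be IN.

```sql
-- Hypothetical function: a VARIADIC input followed by two OUT parameters.
CREATE FUNCTION min_max(VARIADIC vals numeric[], OUT lo numeric, OUT hi numeric)
    AS $$ SELECT min(v), max(v) FROM unnest($1) AS t(v) $$
    LANGUAGE SQL IMMUTABLE;

-- The trailing call arguments are collected into the vals array.
SELECT * FROM min_max(3, 1, 2);
```

The call should return one row with lo = 1 and hi = 3. No RETURNS clause is needed here because the two OUT parameters imply a record result.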
argtype
The data type(s) of the function's arguments (optionally schema-qualified), if any. The argument types may be base, composite, or domain types, or may reference the type of a table column.
Depending on the implementation language, it may also be allowed to specify pseudotypes such as cstring. Pseudotypes indicate that the actual argument type is either incompletely specified or outside the set of ordinary SQL data types.
The type of a column is referenced by writing tablename.columnname%TYPE. Using this feature can sometimes help make a function independent of changes to the definition of a table.
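defexpr
The expression to use as the default value when the argument is not supplied in a call. Only IN and INOUT parameters can have a default value. Each input parameter in the argument list that follows a parameter with a default value must have a default value as well.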
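The default-value rule can be sketched as follows. The function name scale_by and its parameters are assumptions made for this example only.

```sql
-- Hypothetical function: the second parameter has a default, so it may be omitted.
-- Any input parameter declared after factor would also need a default.
CREATE FUNCTION scale_by(val numeric, factor numeric DEFAULT 2) RETURNS numeric
    AS $$ SELECT $1 * $2 $$
    LANGUAGE SQL IMMUTABLE;

SELECT scale_by(10);      -- factor falls back to its default of 2
SELECT scale_by(10, 3);   -- factor supplied explicitly
```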
rettype
The return data type (optionally schema-qualified). The return type can be a base, composite, or domain type, or may reference the type of a table column. Depending on the implementation language, it may also be allowed to specify pseudotypes such as cstring. If the function is not supposed to return a value, specify void as the return type.
When there are OUT or INOUT parameters, the RETURNS clause may be omitted. If present, it must agree with the result type implied by the output parameters: RECORD if there are multiple output parameters, or the same type as the single output parameter.
The SETOF modifier indicates that the function will return a set of items, rather than a single item.
The type of a column is referenced by writing tablename.columnname%TYPE.
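Two small sketches of the return-type rules, using hypothetical function names: one returns a set with SETOF, and the other declares a single OUT parameter together with a matching RETURNS clause.

```sql
-- SETOF: the function returns a set of integers rather than a single value.
CREATE FUNCTION first_n(integer) RETURNS SETOF integer
    AS $$ SELECT generate_series(1, $1) $$
    LANGUAGE SQL IMMUTABLE;

SELECT * FROM first_n(3);

-- A single OUT parameter: RETURNS may be omitted, but if present it must
-- name the same type as the OUT parameter.
CREATE FUNCTION double_it(IN x integer, OUT doubled integer) RETURNS integer
    AS $$ SELECT $1 * 2 $$
    LANGUAGE SQL IMMUTABLE;

SELECT double_it(21);
```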
langname
The name of the language that the function is implemented in. May be SQL, C, internal, or the name of a user-defined procedural language. See CREATE LANGUAGE for the procedural languages supported in Greenplum Database. For backward compatibility, the name may be enclosed in single quotes.
These attributes inform the query optimizer about the behavior of the function. At most one choice may be specified. If none of these appear, VOLATILE is the default assumption. Since Greenplum Database currently has limited use of VOLATILE functions, if a function is truly IMMUTABLE, you must declare it so to be able to use it without restrictions.
IMMUTABLE indicates that the function cannot modify the database and always returns the same result when given the same argument values. It does not do database lookups or otherwise use information not directly present in its argument list. If this option is given, any call of the function with all-constant arguments can be immediately replaced with the function value.
STABLE indicates that the function cannot modify the database, and that within a single table scan it will consistently return the same result for the same argument values, but that its result could change across SQL statements. This is the appropriate selection for functions whose results depend on database lookups, parameter values (such as the current time zone), and so on. Also note that the current_timestamp family of functions qualifies as stable, since their values do not change within a transaction.
VOLATILE indicates that the function value can change even within a single table scan, so no optimizations can be made. Relatively few database functions are volatile in this sense; some examples are random() and timeofday(). But note that any function that has side effects must be classified volatile, even if its result is quite predictable, to prevent calls from being optimized away; an example is setval().
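The three volatility categories can be illustrated with the following sketch. All three function names are hypothetical and chosen only for this example.

```sql
-- IMMUTABLE: the result depends only on the argument.
CREATE FUNCTION fahrenheit_to_celsius(numeric) RETURNS numeric
    AS $$ SELECT ($1 - 32) * 5 / 9 $$
    LANGUAGE SQL IMMUTABLE;

-- STABLE: the result depends on the session time zone and the current
-- transaction time, but does not change within a single statement.
CREATE FUNCTION local_hour() RETURNS double precision
    AS $$ SELECT date_part('hour', current_timestamp) $$
    LANGUAGE SQL STABLE;

-- VOLATILE: may return a different value on every call.
CREATE FUNCTION dice_roll() RETURNS integer
    AS $$ SELECT 1 + floor(random() * 6)::integer $$
    LANGUAGE SQL VOLATILE;
```

Under the restrictions described earlier, a function like dice_roll() is best limited to statements that run on the master, such as SELECT dice_roll();.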
CALLED ON NULL INPUT (the default) indicates that the function will be called normally when some of its arguments are null. It is then the function author's responsibility to check for null values if necessary and respond appropriately.
RETURNS NULL ON NULL INPUT or STRICT indicates that the function always returns null whenever any of its arguments are null. If this parameter is specified, the function is not executed when there are null arguments; instead a null result is assumed automatically.
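The difference between the two null-handling modes can be sketched with a pair of hypothetical functions:

```sql
-- STRICT: the body is skipped and null is returned if any argument is null.
CREATE FUNCTION add_strict(integer, integer) RETURNS integer
    AS $$ SELECT $1 + $2 $$
    LANGUAGE SQL IMMUTABLE STRICT;

-- CALLED ON NULL INPUT (the default): the body runs and must handle nulls itself.
CREATE FUNCTION add_nullsafe(integer, integer) RETURNS integer
    AS $$ SELECT coalesce($1, 0) + coalesce($2, 0) $$
    LANGUAGE SQL IMMUTABLE CALLED ON NULL INPUT;

SELECT add_strict(1, NULL);    -- null; the body is not executed
SELECT add_nullsafe(1, NULL);  -- 1
```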
These attributes inform the query optimizer about whether or not the function contains SQL statements and whether, if it does, those statements read and/or write data.
NO SQL indicates that the function does not contain SQL statements.
CONTAINS SQL indicates that the function contains SQL statements, none of which either read or write data.
READS SQL DATA indicates that the function contains SQL statements that read data but none that modify data.
MODIFIES SQL DATA indicates that the function contains statements that may write data.
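As a hedged sketch of how the data access attributes might be declared, the following assumes a hypothetical table named orders and two hypothetical functions. Whether a given combination of volatility and data access attributes is accepted may depend on the Greenplum Database version.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE orders (id int, amount numeric) DISTRIBUTED BY (id);

-- Reads table data but does not modify it.
CREATE FUNCTION order_total() RETURNS numeric
    AS $$ SELECT sum(amount) FROM orders $$
    LANGUAGE SQL STABLE READS SQL DATA;

-- Contains a SQL statement that neither reads nor writes table data.
CREATE FUNCTION pure_add(integer, integer) RETURNS integer
    AS $$ SELECT $1 + $2 $$
    LANGUAGE SQL IMMUTABLE CONTAINS SQL;
```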
SECURITY INVOKER (the default) indicates that the function is to be executed with the privileges of the user that calls it. SECURITY DEFINER specifies that the function is to be executed with the privileges of the user that created it. The key word EXTERNAL is allowed for SQL conformance, but it is optional since, unlike in SQL, this feature applies to all functions, not just external ones.
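A minimal sketch of a SECURITY DEFINER function follows. The schema admin, the table audit_log, and the function name are all hypothetical. Pinning search_path with a SET clause and revoking the default PUBLIC execute privilege are common precautions for such functions, shown here as an assumption about good practice rather than a requirement of this command.

```sql
-- Hypothetical schema and table, used only for illustration.
CREATE SCHEMA admin;
CREATE TABLE admin.audit_log (id int, msg text) DISTRIBUTED BY (id);

-- Runs with the privileges of the user that created it, so callers
-- can obtain the row count without direct access to admin.audit_log.
CREATE FUNCTION audit_row_count() RETURNS bigint
    AS $$ SELECT count(*) FROM admin.audit_log $$
    LANGUAGE SQL STABLE
    SECURITY DEFINER
    SET search_path = admin, pg_temp;

-- Restrict who may call the function.
REVOKE ALL ON FUNCTION audit_row_count() FROM PUBLIC;
```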
configuration_parameter
value
The SET clause applies a value to a session configuration parameter when the function is entered. The configuration parameter is restored to its prior value when the function exits. SET FROM CURRENT saves the value of the parameter that is current when CREATE FUNCTION is executed and applies that value whenever the function is entered.
obj_file, link_symbol
This form of the AS clause is used for dynamically loadable C language functions when the function name in the C language source code is not the same as the name of the SQL function. The string obj_file is the name of the file containing the dynamically loadable object, and link_symbol is the name of the function in the C language source code. If the link symbol is omitted, it is assumed to be the same as the name of the SQL function being defined. It is recommended to locate shared libraries either relative to $libdir (which is located at $GPHOME/lib) or through the dynamic library path (set by the dynamic_library_path server configuration parameter). This simplifies version upgrades if the new installation is at a different location.
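For illustration, the two-string form of AS might look like the following. The library name funcs and the C symbol add_one_int are hypothetical, and the statement only succeeds if such a shared library has actually been installed under $libdir on every host.

```sql
-- Assumes a hypothetical shared library named funcs in $libdir that exports
-- a C function named add_one_int; the SQL-level name differs from the C name.
CREATE FUNCTION add_one(integer) RETURNS integer
    AS '$libdir/funcs', 'add_one_int'
    LANGUAGE C IMMUTABLE STRICT;
```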
Any compiled code (shared library files) for custom functions must be placed in the same location on every host in your Greenplum Database array (master and all segments). This location must also be in the LD_LIBRARY_PATH so that the server can locate the files. It is recommended to locate shared libraries either relative to $libdir (which is located at $GPHOME/lib) or through the dynamic library path (set by the dynamic_library_path server configuration parameter) on the master and all segment instances in the Greenplum array.
The full SQL type syntax is allowed for input arguments and return value. However, some details of the type specification (such as the precision field for type numeric) are the responsibility of the underlying function implementation and are not recognized or enforced by the CREATE FUNCTION command.
Greenplum Database allows function overloading. The same name can be used for several different functions so long as they have distinct argument types. However, the C names of all functions must be different, so you must give overloaded C functions different C names (for example, use the argument types as part of the C names).
Two functions are considered the same if they have the same names and input argument types, ignoring any OUT parameters. Thus, for example, these declarations conflict:
CREATE FUNCTION foo(int) ...
CREATE FUNCTION foo(int, out text) ...
Functions that have different argument type lists are not considered to conflict at creation time, but if argument defaults are provided, they might conflict in use. For example, consider:
CREATE FUNCTION foo(int) ...
CREATE FUNCTION foo(int, int default 42) ...
The call foo(10) will fail because it is ambiguous which function should be called.
When repeated CREATE FUNCTION calls refer to the same object file, the file is only loaded once. To unload and reload the file, use the LOAD command.
You must have the USAGE privilege on a language to be able to define a function using that language.
It is often helpful to use dollar quoting to write the function definition string, rather than the normal single quote syntax. Without dollar quoting, any single quotes or backslashes in the function definition must be escaped by doubling them. A dollar-quoted string constant consists of a dollar sign ($), an optional tag of zero or more characters, another dollar sign, an arbitrary sequence of characters that makes up the string content, a dollar sign, the same tag that began this dollar quote, and a dollar sign. Inside the dollar-quoted string, single quotes, backslashes, or any character can be used without escaping. The string content is always written literally. For example, here are two different ways to specify the string "Dianne's horse" using dollar quoting:
$$Dianne's horse$$
$SomeTag$Dianne's horse$SomeTag$
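To show the effect on a complete statement, the following sketch defines the same hypothetical function body three ways: with single quotes (where every embedded quote is doubled), with dollar quoting, and with a tagged dollar quote that lets the body itself contain a dollar-quoted string.

```sql
-- Single-quoted body: each quote inside the body is doubled, and the
-- apostrophe in the string is doubled twice.
CREATE FUNCTION horse_sq() RETURNS text
    AS 'SELECT ''Dianne''''s horse''::text'
    LANGUAGE SQL IMMUTABLE;

-- Dollar-quoted body: only the apostrophe inside the SQL string literal is doubled.
CREATE FUNCTION horse_dq() RETURNS text
    AS $$ SELECT 'Dianne''s horse'::text $$
    LANGUAGE SQL IMMUTABLE;

-- Tagged outer quote with an untagged inner dollar-quoted string: no doubling at all.
CREATE FUNCTION horse_tag() RETURNS text
    AS $body$ SELECT $$Dianne's horse$$::text $body$
    LANGUAGE SQL IMMUTABLE;
```

All three functions return the same string, Dianne's horse.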
If a SET clause is attached to a function, the effects of a SET LOCAL command executed inside the function for the same variable are restricted to the function; the configuration parameter's prior value is still restored when the function exits. However, an ordinary SET command (without LOCAL) overrides the CREATE FUNCTION SET clause, much as it would for a previous SET LOCAL command. The effects of such a command will persist after the function exits, unless the current transaction is rolled back.
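The interaction described above can be sketched with two hypothetical PL/pgSQL functions that differ only in whether they use SET LOCAL or a plain SET. The parameter statement_mem and the values shown are assumptions for illustration; the values must be within the limits configured for the system.

```sql
-- SET LOCAL inside the function: the change is confined to the function,
-- and the caller's value of statement_mem is restored on exit.
CREATE FUNCTION mem_local() RETURNS text AS $$
BEGIN
    SET LOCAL statement_mem = '200MB';
    RETURN current_setting('statement_mem');
END;
$$ LANGUAGE plpgsql
SET statement_mem = '128MB';

-- Plain SET inside the function: it overrides the SET clause, and the new
-- value persists after the function exits unless the transaction rolls back.
CREATE FUNCTION mem_session() RETURNS text AS $$
BEGIN
    SET statement_mem = '200MB';
    RETURN current_setting('statement_mem');
END;
$$ LANGUAGE plpgsql
SET statement_mem = '128MB';
```

Running SHOW statement_mem; before and after calling each function is one way to observe the difference.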
If a function with a VARIADIC argument is declared as STRICT, the strictness check tests that the variadic array as a whole is non-null. PL/pgSQL will still call the function if the array has null elements.
Using Functions With Queries on Distributed Data
In some cases, Greenplum Database does not support using functions in a query where the data in a table specified in the FROM clause is distributed over Greenplum Database segments. As an example, this SQL query contains the function func():
SELECT func(a) FROM table1;
The function is not supported for use in the query if all of the following conditions are met:
table1 is distributed over Greenplum Database segments.
func() reads or modifies data from distributed tables.
func() returns more than one row or takes an argument (a) that comes from table1.
If any of the conditions are not met, the function is supported. Specifically, the function is supported if any of the following conditions apply:
func() does not access data from distributed tables, or accesses data that is only on the Greenplum Database master.
table1 is a master-only table.
func() returns only one row and only takes input arguments that are constant values. The function is supported if it can be changed to require no input arguments.
A very simple addition function:
CREATE FUNCTION add(integer, integer) RETURNS integer
AS 'select $1 + $2;'
LANGUAGE SQL
IMMUTABLE
RETURNS NULL ON NULL INPUT;
Increment an integer, making use of an argument name, in PL/pgSQL:
CREATE OR REPLACE FUNCTION increment(i integer) RETURNS
integer AS $$
BEGIN
RETURN i + 1;
END;
$$ LANGUAGE plpgsql;
Increase the default segment host memory per query for a PL/pgSQL function:
CREATE OR REPLACE FUNCTION function_with_query() RETURNS
SETOF text AS $$
BEGIN
RETURN QUERY
EXPLAIN ANALYZE SELECT * FROM large_table;
END;
$$ LANGUAGE plpgsql
SET statement_mem='256MB';
Use polymorphic types to return an ENUM array:
CREATE TYPE rainbow AS ENUM('red','orange','yellow','green','blue','indigo','violet');
CREATE FUNCTION return_enum_as_array( anyenum, anyelement, anyelement )
RETURNS TABLE (ae anyenum, aa anyarray) AS $$
SELECT $1, array[$2, $3]
$$ LANGUAGE SQL STABLE;
SELECT * FROM return_enum_as_array('red'::rainbow, 'green'::rainbow, 'blue'::rainbow);
Return a record containing multiple output parameters:
CREATE FUNCTION dup(in int, out f1 int, out f2 text)
AS $$ SELECT $1, CAST($1 AS text) || ' is text' $$
LANGUAGE SQL;
SELECT * FROM dup(42);
You can do the same thing more verbosely with an explicitly named composite type:
CREATE TYPE dup_result AS (f1 int, f2 text);
CREATE FUNCTION dup(int) RETURNS dup_result
AS $$ SELECT $1, CAST($1 AS text) || ' is text' $$
LANGUAGE SQL;
SELECT * FROM dup(42);
CREATE FUNCTION is defined in SQL:1999 and later. The Greenplum Database version is similar but not fully compatible. The attributes are not portable, nor are the available languages.
For compatibility with some other database systems, argmode can be written either before or after argname. But only the first way is standard-compliant.
The SQL standard does not specify parameter defaults.
See also: ALTER FUNCTION, DROP FUNCTION, LOAD