The citext module provides a case-insensitive character string type,
citext. Essentially, it internally calls the
lower function when comparing values. Otherwise, it behaves almost exactly like the
text data type.
The standard method to perform case-insensitive matches on text values is to use the
lower function when comparing values, for example
SELECT * FROM tab WHERE lower(col) = LOWER(?);
This method works well, but has drawbacks:
loweron both the column and the query value.
citext data type allows you to eliminate calls to
lower in SQL queries and you can create case-insensitive indexes on columns of type
citext is locale-aware, like the
text type, which means comparing uppercase and lowercase characters depends on the rules of the LC_CTYPE locale setting. This behavior is the same as using
lower in queries, but it is done transparently by the data type, so you do not have to do anything special in your queries.
Parent topic: Additional Supplied Modules
Before you can use the
citext data type, run the installation script
$GPHOME/share/postgresql/contrib/citext.sql in each database where you want to use the type:
$ psql -d testdb -f $GPHOME/share/postgresql/contrib/citext.sql
Here is a simple example defining a
citext table column:
CREATE TABLE users ( id bigint PRIMARY KEY, nick CITEXT NOT NULL, pass TEXT NOT NULL ) DISTRIBUTED BY (id); INSERT INTO users VALUES (1, 'larry', md5(random()::text) ); INSERT INTO users VALUES (2, 'Tom', md5(random()::text) ); INSERT INTO users VALUES (3, 'Damian', md5(random()::text) ); INSERT INTO users VALUES (4, 'NEAL', md5(random()::text) ); INSERT INTO users VALUES (5, 'Bjørn', md5(random()::text) ); SELECT * FROM users WHERE nick = 'Larry';
SELECT statement returns one tuple, even though the
nick column is set to
larry and the query specified
citext performs comparisons by converting each string to lower case (as though the
lower function were called) and then comparing the results normally. Two strings are considered equal if
lower would produce identical results for them.
In order to emulate a case-insensitive collation as closely as possible, there are
citext-specific versions of a number of string-processing operators and functions. So, for example, the regular expression operators
~* exhibit the same behavior when applied to
citext: they both match case-insensitively. The same is true for
!~*, as well as for the
!~~*. If you want to match case-sensitively, you can cast the operator's arguments to
The following functions perform matching case-insensitively if their arguments are
For the regexp functions, if you want to match case-sensitively, you can specify the “c” flag to force a case-sensitive match. If you want case-sensitive behavior, you must cast to
text before using one of these functions.
A column of type
citext cannot be part of a primary key or distribution key in a
CREATE TABLE statement.
citext type's case-folding behavior depends on the
LC_CTYPE setting of your database. How it compares values is therefore determined when the database is created. It is not truly case-insensitive in the terms defined by the Unicode standard. Effectively, what this means is that, as long as you're happy with your collation, you should be happy with
citext's comparisons. But if you have data in different languages stored in your database, users of one language may find their query results are not as expected if the collation is for another language.
citext is not as efficient as
text because the operator functions and the B-tree comparison functions must make copies of the data and convert them to lower case for comparisons. It is, however, slightly more efficient than using
lower to perform case-insensitive matching.
citext may not be the best option if you need data to compare case-sensitively in some contexts and case-insensitively in other contexts. The standard recommendation is to use the
text type and manually apply the
lower function when you need to compare case-insensitively. This works if case-insensitive comparison is needed only infrequently. If you need case-insensitive behavior most of the time and case-sensitive infrequently, consider storing the data as
citext and explicitly casting the column to
text when you want case-sensitive comparison. In either situation, you will need two indexes if you want both types of searches to be fast.
The schema containing the
citext operators must be in the current
public); if it is not, the normal case-sensitive
text operators will be invoked instead.