This topic answers frequently asked questions about querying functionality in VMware Tanzu GemFire and provides examples to help you get started with Tanzu GemFire querying.
For additional information about Tanzu GemFire querying, see Querying.
To write and execute a query in Tanzu GemFire, you can use any of the following mechanisms. Sample query code follows.
Sample Tanzu GemFire Query Code (Java)
// Identify your query string.
String queryString = "SELECT * FROM /exampleRegion";
// Get QueryService from Cache.
QueryService queryService = cache.getQueryService();
// Create the query object.
Query query = queryService.newQuery(queryString);
// Execute the query locally. Returns the result set.
SelectResults<Portfolio> results = (SelectResults<Portfolio>) query.execute();
// Find the size of the result set.
int size = results.size();
// Iterate through the result set. The region contains Portfolio objects.
for (Portfolio p : results) {
    // Process each Portfolio here.
}
The following example query strings use the /exampleRegion region, whose keys are the portfolio IDs and whose values correspond to the summarized data shown in the following class definitions:
class Portfolio implements DataSerializable {
int ID;
String type;
String status;
Map positions;
}
class Position implements DataSerializable {
String secId;
double mktValue;
double qty;
}
Basic WHERE Clause Examples
In the following examples, the status field is type String and the ID field is type int. See Supported Literals for a complete list of literals supported in Tanzu GemFire querying.
Select all active portfolios.
SELECT * FROM /exampleRegion WHERE status = 'active'
Select all portfolios whose status begins with ‘activ’.
SELECT * FROM /exampleRegion p WHERE p.status LIKE 'activ%'
Select all portfolios whose ID is greater than 100.
SELECT * FROM /exampleRegion p WHERE p.ID > 100
Using DISTINCT
Select distinct objects from the region that satisfy the WHERE clause condition status = 'active'.
SELECT DISTINCT * FROM /exampleRegion WHERE status = 'active'
Aliases and Synonyms
In the query string, the path expressions (region and its objects) can be defined using an alias. This alias can be used or referred to in other places in the query.
SELECT DISTINCT * FROM /exampleRegion p WHERE p.status = 'active'
SELECT p.ID, p.status FROM /exampleRegion p WHERE p.ID > 0
Using the NOT Operator
See Operators for a complete list of supported operators.
SELECT DISTINCT * FROM /exampleRegion WHERE NOT (status = 'active') AND ID = 2
SELECT * FROM /exampleRegion WHERE NOT (ID IN SET(1,2))
Using the AND and OR Operators
See Operators for a complete list of supported operators.
SELECT * FROM /exampleRegion WHERE ID > 4 AND ID < 9
SELECT * FROM /exampleRegion WHERE ID = 0 OR ID = 1
SELECT DISTINCT p.status FROM /exampleRegion p
WHERE (p.createTime IN SET (10L) OR p.status IN SET ('active')) AND p.ID > 0
Using not equal to
SELECT * FROM /exampleRegion portfolio WHERE portfolio.ID <> 2
SELECT * FROM /exampleRegion portfolio WHERE portfolio.ID != 2
Projection attribute example
SELECT p.get('account') FROM /exampleRegion p
Querying nested collections
In the following query, positions is of type HashMap.
SELECT p, pos FROM /exampleRegion p, p.positions.values pos WHERE pos.secId = 'VMW'
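The way the second FROM expression iterates over each portfolio's positions can be pictured as a nested loop. The following is a plain-Java analogy only, not GemFire code: the Portfolio and Position classes here are simplified stand-ins, and the "id:secId" row format is a hypothetical substitute for the struct the query would return.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NestedCollectionSketch {
    // Simplified stand-ins for the Portfolio and Position classes (assumption).
    static class Position {
        final String secId;
        Position(String secId) { this.secId = secId; }
    }

    static class Portfolio {
        final int id;
        final Map<String, Position> positions = new LinkedHashMap<>();
        Portfolio(int id, String... secIds) {
            this.id = id;
            for (String s : secIds) positions.put(s, new Position(s));
        }
    }

    // Returns one "id:secId" row per (p, pos) pair that passes the filter.
    static List<String> select(List<Portfolio> region, String secId) {
        List<String> rows = new ArrayList<>();
        for (Portfolio p : region) {                     // FROM /exampleRegion p
            for (Position pos : p.positions.values()) {  // , p.positions.values pos
                if (pos.secId.equals(secId)) {           // WHERE pos.secId = 'VMW'
                    rows.add(p.id + ":" + pos.secId);    // SELECT p, pos
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Portfolio> region = List.of(
            new Portfolio(1, "VMW", "IBM"),
            new Portfolio(2, "IBM"),
            new Portfolio(3, "VMW"));
        System.out.println(select(region, "VMW")); // prints [1:VMW, 3:VMW]
    }
}
```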
Using LIMIT
SELECT * FROM /exampleRegion p WHERE p.ID > 0 LIMIT 2
Using MIN and MAX
See MIN and MAX for more information.
SELECT MIN(ID)
FROM /exampleRegion
WHERE ID > 0
SELECT MAX(ID)
FROM /exampleRegion
WHERE ID > 0 AND status LIKE 'act%'
SELECT MIN(pos.mktValue)
FROM /exampleRegion p, p.positions.values pos
WHERE p.ID = 10
SELECT MAX(p.ID)
FROM /exampleRegion p, p.positions.values pos
WHERE p.ID > 0 OR p.status = 'active' OR pos.secId = 'IBM'
Using AVG
See AVG for more information.
SELECT AVG(ID)
FROM /exampleRegion
WHERE ID > 0
SELECT AVG(ID)
FROM /exampleRegion
WHERE ID > 0 AND status LIKE 'act%'
SELECT AVG(pos.mktValue)
FROM /exampleRegion p, p.positions.values pos
WHERE p.isActive()
SELECT AVG(DISTINCT p.ID)
FROM /exampleRegion p, p.positions.values pos
WHERE p.ID > 0 OR p.status = 'active' OR pos.secId = 'IBM'
Using COUNT
See COUNT for more information.
SELECT COUNT(*)
FROM /exampleRegion
WHERE ID > 0
SELECT COUNT(*)
FROM /exampleRegion
WHERE ID > 0 LIMIT 50
SELECT COUNT(*)
FROM /exampleRegion
WHERE ID > 0 AND status LIKE 'act%'
SELECT COUNT(*)
FROM /exampleRegion
WHERE ID IN SET(1,2,3,4,5)
SELECT COUNT(DISTINCT p.status)
FROM /exampleRegion p
WHERE p.ID > 0
SELECT COUNT(*)
FROM /exampleRegion p, p.positions.values pos
WHERE p.ID > 0 AND pos.secId = 'IBM'
SELECT DISTINCT COUNT(*)
FROM /exampleRegion p, p.positions.values pos
WHERE p.ID > 0 OR p.status = 'active' OR pos.secId = 'IBM'
Using SUM
See SUM for more information.
SELECT SUM(ID)
FROM /exampleRegion
WHERE ID > 0
SELECT SUM(ID)
FROM /exampleRegion
WHERE ID > 0 AND status LIKE 'act%'
SELECT SUM(pos.mktValue)
FROM /exampleRegion p, p.positions.values pos
WHERE p.status = 'active'
SELECT SUM(DISTINCT p.ID)
FROM /exampleRegion p, p.positions.values pos
WHERE p.ID > 0 OR p.status = 'active' OR pos.secId = 'IBM'
Using GROUP BY
See GROUP BY for more information.
SELECT p.status, MAX(p.ID)
FROM /exampleRegion p
WHERE p.ID > 0
GROUP BY p.status
SELECT p.ID, MIN(pos.qty) AS lessQty
FROM /exampleRegion p, p.positions.values pos
WHERE p.ID > 0 AND p.status = 'active'
GROUP BY p.ID
ORDER BY lessQty ASC
SELECT p.ID, MAX(pos.mktValue) AS maxValue
FROM /exampleRegion p, p.positions.values pos
WHERE p.ID > 0 AND p.status = 'active'
GROUP BY p.ID
ORDER BY maxValue DESC
SELECT p.status, AVG(p.ID)
FROM /exampleRegion p
WHERE p.ID > 0
GROUP BY p.status
SELECT p.ID, pos.secId, AVG(pos.mktValue)
FROM /exampleRegion p, p.positions.values pos
WHERE p.ID > 0 OR p.status = 'active'
GROUP BY p.ID, pos.secId
SELECT p.status, AVG(p.ID) as sm
FROM /exampleRegion p
WHERE p.ID > 0
GROUP BY p.status
ORDER BY sm DESC
SELECT p.status, COUNT(*)
FROM /exampleRegion p
WHERE p.ID > 0
GROUP BY p.status
SELECT p.ID, COUNT(pos) AS positionsAmount
FROM /exampleRegion p, p.positions.values pos
WHERE p.ID > 0 OR p.status = 'active'
GROUP BY p.ID
ORDER BY positionsAmount
SELECT p.status, SUM(p.ID)
FROM /exampleRegion p
WHERE p.ID > 0
GROUP BY p.status
SELECT p.ID, pos.secId, SUM(pos.mktValue)
FROM /exampleRegion p, p.positions.values pos
WHERE p.ID > 0 OR p.status = 'active'
GROUP BY p.ID, pos.secId
SELECT p.status, SUM(p.ID) as sm
FROM /exampleRegion p
WHERE p.ID > 0
GROUP BY p.status
ORDER BY sm DESC
SELECT p.ID, SUM(pos.mktValue) AS marketValue
FROM /exampleRegion p, p.positions.values pos
WHERE p.isActive()
GROUP BY p.ID
ORDER BY marketValue DESC
Using LIKE
SELECT * FROM /exampleRegion ps WHERE ps.pkid LIKE '_bc'
SELECT * FROM /exampleRegion ps WHERE ps.status LIKE '_b_' OR ps.pkid = '2'
SELECT * FROM /exampleRegion ps WHERE ps.status LIKE '%b%'
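To get a feel for which strings these patterns match, the wildcards can be emulated in plain Java. This is a rough sketch rather than GemFire's implementation: it assumes only that % matches any sequence of characters and _ matches exactly one character, and it ignores escape characters. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class LikeSketch {
    // Translate a LIKE pattern into a regex: % becomes .*, _ becomes .,
    // and every other character is matched literally.
    static Pattern toRegex(String likePattern) {
        StringBuilder sb = new StringBuilder();
        for (char c : likePattern.toCharArray()) {
            if (c == '%') {
                sb.append(".*");
            } else if (c == '_') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString(), Pattern.DOTALL);
    }

    // True if the whole value matches the LIKE pattern.
    static boolean like(String value, String likePattern) {
        return toRegex(likePattern).matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(like("abc", "_bc"));       // true
        System.out.println(like("abcd", "_b_"));      // false
        System.out.println(like("active", "activ%")); // true
    }
}
```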
Using Region Entry Keys and Values
SELECT * FROM /exampleRegion.keys k WHERE k.ID = 1
SELECT entry.value FROM /exampleRegion.entries entry WHERE entry.key = '1'
SELECT key, positions FROM /exampleRegion.entrySet, value.positions.values positions
WHERE positions.mktValue >= 25.00
SELECT DISTINCT entry.value FROM /exampleRegion.entries entry WHERE entry.key = '1'
SELECT * FROM /exampleRegion.entries entry WHERE entry.value.ID > 1
SELECT * FROM /exampleRegion.keySet key WHERE key = '1'
SELECT * FROM /exampleRegion.values portfolio
WHERE portfolio.status = 'active'
Nested Queries
IMPORT "query".Portfolio;
SELECT * FROM /exampleRegion, (SELECT DISTINCT * FROM /exampleRegion p TYPE Portfolio, p.positions
WHERE value!=null)
SELECT DISTINCT * FROM (SELECT DISTINCT * FROM /exampleRegion portfolios, positions pos)
WHERE pos.value.secId = 'IBM'
SELECT * FROM /exampleRegion portfolio
WHERE portfolio.ID IN (SELECT p2.ID FROM /exampleRegion2 p2 WHERE p2.ID > 1)
SELECT DISTINCT * FROM /exampleRegion p, (SELECT DISTINCT pos
FROM /exampleRegion x, x.positions.values pos WHERE x.ID = p.ID ) AS itrX
Query the results of a FROM clause expression
SELECT DISTINCT * FROM (SELECT DISTINCT * FROM /Portfolios ptf, positions pos) p
WHERE p.get('pos').value.secId = 'IBM'
Hash Map Query
Query using a hashmap. In the following examples, ‘version’ is one of the keys in the hashmap.
SELECT * FROM /exampleRegion p WHERE p['version'] = '1.0'
SELECT entry.key, entry.value FROM /exampleRegion.entries entry
WHERE entry.value['version'] = '100'
Map example where portfolios is a nested HashMap object
SELECT DISTINCT * FROM /exampleRegion p WHERE p.portfolios['key2'] >= 3
Example Queries that Fetch Array Values
SELECT * FROM /exampleRegion p WHERE p.names[0] = 'aaa'
SELECT * FROM /exampleRegion p WHERE p.collectionHolderMap.get('1').arr[0] = '0'
Using ORDER BY (and ORDER BY with LIMIT)
You must use the DISTINCT keyword with ORDER BY queries.
SELECT DISTINCT * FROM /exampleRegion WHERE ID < 101 ORDER BY ID
SELECT DISTINCT * FROM /exampleRegion WHERE ID < 101 ORDER BY ID asc
SELECT DISTINCT * FROM /exampleRegion WHERE ID < 101 ORDER BY ID desc
SELECT DISTINCT key.ID, key.status AS st FROM /exampleRegion.keys key
WHERE key.status = 'inactive' ORDER BY key.status desc, key.ID LIMIT 1
SELECT DISTINCT * FROM /exampleRegion p ORDER BY p.getP1().secId, p.ID desc, p.ID LIMIT 9
SELECT DISTINCT * FROM /exampleRegion p ORDER BY p.ID, val.secId LIMIT 1
SELECT DISTINCT e.key FROM /exampleRegion.entrySet e ORDER BY e.key.ID desc, e.key.pkid desc
SELECT DISTINCT p.names[1] FROM /exampleRegion p ORDER BY p.names[1]
Join Queries
SELECT * FROM /exampleRegion portfolio1, /exampleRegion2 portfolio2
WHERE portfolio1.status = portfolio2.status
SELECT portfolio1.ID, portfolio2.status FROM /exampleRegion portfolio1, /exampleRegion2 portfolio2
WHERE portfolio1.status = portfolio2.status
SELECT * FROM /exampleRegion portfolio1, portfolio1.positions.values positions1,
/exampleRegion2 portfolio2, portfolio2.positions.values positions2 WHERE positions1.secId = positions2.secId
SELECT * FROM /exampleRegion portfolio1, portfolio1.positions.values positions1,
/exampleRegion2 portfolio2, portfolio2.positions.values positions2 WHERE portfolio1.ID = 1
AND positions1.secId = positions2.secId
SELECT DISTINCT a, b.price FROM /exampleRegion1 a, /exampleRegion2 b WHERE a.price = b.price
Using AS
SELECT * FROM /exampleRegion p, p.positions.values AS pos WHERE pos.secId != '1'
Using TRUE
SELECT DISTINCT * FROM /Portfolios WHERE TRUE
Using IN and SET
See also IN and SET.
SELECT * FROM /exampleRegion portfolio WHERE portfolio.ID IN SET(1, 2)
SELECT * FROM /exampleRegion portfolio, portfolio.positions.values positions
WHERE portfolio.Pk IN SET ('1', '2') AND positions.secId = '1'
SELECT * FROM /exampleRegion portfolio, portfolio.positions.values positions
WHERE portfolio.Pk IN SET ('1', '2') OR positions.secId IN SET ('1', '2', '3')
SELECT * FROM /exampleRegion portfolio, portfolio.positions.values positions
WHERE portfolio.Pk IN SET ('1', '2') OR positions.secId IN SET ('1', '2', '3')
AND portfolio.status = 'active'
Querying for Set values
In the following query, sp is of type Set.
SELECT * FROM /exampleRegion WHERE sp = set('20', '21', '22')
If the Set (sp) only contains ‘20’ and ‘21’, then the query will evaluate to false. The query compares the two sets and looks for the presence of elements in both sets.
For other collection types like list (sp is of type List), the query can be written as follows:
SELECT * FROM /exampleRegion WHERE sp.containsAll(set('20', '21', '22'))
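The difference between the two comparisons can be seen with ordinary Java collections. This is an illustrative analogy, not GemFire code; the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Set;

class SetComparisonSketch {
    // Analogous to: sp = set('20', '21', '22') when sp is a Set.
    // True only if both sets hold exactly the same elements.
    static boolean equalsSet(Set<String> sp, Set<String> wanted) {
        return sp.equals(wanted);
    }

    // Analogous to: sp.containsAll(set('20', '21', '22')) when sp is a List.
    // True if sp holds every wanted element, even if it holds others too.
    static boolean containsAll(List<String> sp, Set<String> wanted) {
        return sp.containsAll(wanted);
    }

    public static void main(String[] args) {
        Set<String> wanted = Set.of("20", "21", "22");
        System.out.println(equalsSet(Set.of("20", "21"), wanted));                 // false
        System.out.println(equalsSet(Set.of("22", "21", "20"), wanted));           // true
        System.out.println(containsAll(List.of("20", "21", "22", "23"), wanted)); // true
    }
}
```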
Invoking Methods on Objects
See Method Invocations for more information.
SELECT * FROM /exampleRegion p WHERE p.length > 1
SELECT DISTINCT * FROM /exampleRegion p WHERE p.positions.size >= 2
SELECT DISTINCT * FROM /exampleRegion p WHERE p.positions.isEmpty
SELECT DISTINCT * FROM /exampleRegion p WHERE p.name.startsWith('Bo')
Using Query-Level Debugging
To enable debugging at the query level, add the <trace> keyword before the query. If you are using an IMPORT statement, place <trace> before the IMPORT statement.
<trace>
SELECT * from /exampleRegion, positions.values TYPE myclass
Using Reserved Words in Queries
To access any method, attribute, or named object that has the same name as a query language reserved word, enclose the name within double quotation marks.
SELECT * FROM /exampleRegion WHERE status = 'active' AND "type" = 'XYZ'
SELECT DISTINCT "type" FROM /exampleRegion WHERE status = 'active'
Using IMPORT
When the same class name resides in two different namescopes (packages), you need a way to refer to each class unambiguously. The IMPORT statement establishes a namescope for a class in a query.
IMPORT package.Position;
SELECT DISTINCT * FROM /exampleRegion, positions.values positions TYPE Position WHERE positions.mktValue >= 25.00
Using TYPE
Specifying object type helps the query engine to process the query at optimal speed. Apart from specifying the object types during configuration (using key-constraint and value-constraint), type can be explicitly specified in the query string.
SELECT DISTINCT * FROM /exampleRegion, positions.values positions TYPE Position WHERE positions.mktValue >= 25.00
Using ELEMENT
Using ELEMENT(expr) extracts a single element from a collection or array. This function throws a FunctionDomainException if the argument is not a collection or array with exactly one element.
ELEMENT(SELECT DISTINCT * FROM /exampleRegion WHERE id = 'XYZ-1').status = 'active'
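The exactly-one-element rule that ELEMENT enforces can be sketched in plain Java. This is an analogy only: the helper below is hypothetical, and it throws IllegalStateException as a stand-in for the FunctionDomainException that the query engine throws.

```java
import java.util.Collection;
import java.util.List;

class ElementSketch {
    // Return the only element of the collection, or fail if the collection
    // does not contain exactly one element.
    static <T> T element(Collection<T> c) {
        if (c.size() != 1) {
            throw new IllegalStateException(
                "expected exactly one element, got " + c.size());
        }
        return c.iterator().next();
    }

    public static void main(String[] args) {
        System.out.println(element(List.of("active"))); // prints active
        try {
            element(List.of("a", "b"));
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```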
If you are querying a Java application’s local cache or querying other members, use org.apache.geode.cache.Cache.getQueryService.
If you are writing a Java client to server query, use org.apache.geode.cache.client.Pool.getQueryService.
To use a method in a query, use the attribute name that maps to the public method you want to invoke. For example:
/* valid method invocation: positions.size maps to positions.size() */
SELECT DISTINCT * FROM /exampleRegion p WHERE p.positions.size >= 2
You cannot invoke a static method on an object in a query. For example, the following query is invalid:
/*invalid method invocation*/
SELECT DISTINCT * FROM /exampleRegion WHERE aDay = Day.Wednesday
To work around this limitation, write a reusable query that uses a query bind parameter to invoke the static method. Then at query run time, set the parameter to the static method invocation (Day.Wednesday). For example:
SELECT DISTINCT * FROM /exampleRegion WHERE aDay = $1
Using query APIs, you can set query bind parameters that are passed values at query run time. For example:
// specify the query string
String queryString = "SELECT DISTINCT * FROM /exampleRegion p WHERE p.status = $1";
QueryService queryService = cache.getQueryService();
Query query = queryService.newQuery(queryString);
// set a query bind parameter
Object[] params = new Object[1];
params[0] = "active";
// Execute the query locally. It returns the results set.
SelectResults results = (SelectResults) query.execute(params);
// use the results of the query; this example only looks at the size
int size = results.size();
If you use a query bind parameter in place of a region path in your path expression, the parameter value must reference a collection, not a String such as the name of the region path.
See Using Query Bind Parameters for more details.
Determine whether your query’s performance will benefit from an index. For example, in the following query, an index on pkid can speed up the query.
SELECT DISTINCT * FROM /exampleRegion portfolio WHERE portfolio.pkid = '123'
You can create an index programmatically using the APIs or declaratively using XML. Here are two examples:
Sample Code
QueryService qs = cache.getQueryService();
qs.createIndex("myIndex", "status", "/exampleRegion");
qs.createKeyIndex("myKeyIndex", "id", "/exampleRegion");
For more information about using this API, see the JavaDocs.
Sample XML
<region name="portfolios">
<region-attributes ...>
</region-attributes>
<index name="myIndex">
<functional from-clause="/exampleRegion" expression="status"/>
</index>
<index name="myKeyIndex">
<primary-key field="id"/>
</index>
<entry>
For more details about indexes, see Working with Indexes.
You can create indexes on overflow regions, but you are subject to some limitations. For example, the data contained in the index itself cannot be overflowed to disk. See Using Indexes with Overflow Regions for more information.
You can query partitioned regions, but there are some limitations. You cannot perform join queries on partitioned regions; however, you can perform equi-join queries on colocated partitioned regions by executing a function on a local data set.
For a full list of restrictions, see Partitioned Region Query Restrictions.
If you know the data you need to query, you can target particular nodes in your queries (thus reducing the number of servers the query needs to access) by executing the query with the FunctionService. See Querying a Partitioned Region on a Single Node for details. If you are querying data that has been partitioned by a key or specific field, you should first create a key index and then execute the query using the FunctionService with the key or field as a filter. See Optimizing Queries on Data Partitioned by a Key or Field Value.
Supported elements | | |
---|---|---|
AND | LIMIT | TO_DATE |
AS | LIKE | TYPE |
COUNT | NOT | WHERE |
DISTINCT | NVL | |
ELEMENT | OR | |
FROM | ORDER BY | |
<HINT> | SELECT | |
IMPORT | SET | |
IN | <TRACE> | |
IS_DEFINED | TRUE | |
IS_UNDEFINED | | |
For more information and examples on using each supported keyword, see Supported Keywords.
You can debug a specific query at the query level by adding the <trace> keyword before the query string that you want to debug. Here is an example:
<trace> SELECT * FROM /exampleRegion
You can also write:
<TRACE> SELECT * FROM /exampleRegion
When the query is executed, Tanzu GemFire will log a message in $GEMFIRE_DIR/system.log with the following information:
[info 2011/08/29 11:24:35.472 PDT CqServer <main> tid=0x1] Query Executed in 9.619656 ms; rowCount = 99;
indexesUsed(0) "select * from /exampleRegion"
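If you want to collect these numbers from the log, a small parser is one option. The following is a sketch that assumes trace messages keep exactly the wording shown in the sample above, which may differ between versions; the TraceLineSketch and TraceInfo names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TraceLineSketch {
    // Holds the three values extracted from one trace message.
    static final class TraceInfo {
        final double elapsedMs;
        final int rowCount;
        final int indexesUsed;
        TraceInfo(double elapsedMs, int rowCount, int indexesUsed) {
            this.elapsedMs = elapsedMs;
            this.rowCount = rowCount;
            this.indexesUsed = indexesUsed;
        }
    }

    // Pattern based only on the sample message wording (assumption).
    private static final Pattern TRACE = Pattern.compile(
        "Query Executed in ([0-9.]+) ms; rowCount = (\\d+);\\s*indexesUsed\\((\\d+)\\)");

    // Returns the parsed values, or null if the text is not a trace message.
    static TraceInfo parse(String text) {
        Matcher m = TRACE.matcher(text);
        if (!m.find()) {
            return null;
        }
        return new TraceInfo(
            Double.parseDouble(m.group(1)),
            Integer.parseInt(m.group(2)),
            Integer.parseInt(m.group(3)));
    }

    public static void main(String[] args) {
        TraceInfo info = parse(
            "Query Executed in 9.619656 ms; rowCount = 99; indexesUsed(0)");
        System.out.println(info.elapsedMs + " ms, " + info.rowCount + " rows");
    }
}
```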
If you want to enable debugging for all queries, you can enable query execution logging by setting a System property on the command line during start-up:
gfsh>start server --name=server_name --J=-Dgemfire.Query.VERBOSE=true
Or you can set the property programmatically:
System.setProperty("gemfire.Query.VERBOSE","true");
If an implicit attribute or method name can only be associated with one untyped iterator, the Tanzu GemFire query processor will assume that it is associated with that iterator. However, if more than one untyped iterator is in scope, then the query will fail with a TypeMismatchException. The following query fails because the query processor does not fully type expressions:
select distinct value.secId from /pos , getPositions(23)
The following query, however, succeeds because the iterator is either explicitly named with a variable or it is typed:
select distinct e.value.secId from /pos , getPositions(23) e
Using HINT indexname allows you to instruct the query engine to prefer and filter results from the specified indexes. If you provide multiple index names, the query engine will use all available indexes but prefer the specified indexes.
<HINT 'IDIndex'> SELECT * FROM /Portfolios p WHERE p.ID > 10 AND p.owner = 'XYZ'
<HINT 'IDIndex', 'OwnerIndex'> SELECT * FROM /Portfolios p WHERE p.ID > 10 AND p.owner = 'XYZ' AND p.value < 100
You can use the Java String class methods toUpperCase and toLowerCase to transform fields where you want to perform a case-insensitive search. For example:
SELECT entry.value FROM /exampleRegion.entries entry WHERE entry.value.toUpperCase LIKE '%BAR%'
or
SELECT * FROM /exampleRegion WHERE foo.toLowerCase LIKE '%bar%'
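The same normalize-then-compare idea looks like this in plain Java. It is an illustrative sketch, not GemFire code, with a hypothetical helper name. It passes Locale.ROOT explicitly, whereas the no-argument toUpperCase used in the queries above follows the JVM's default locale.

```java
import java.util.Locale;

class CaseInsensitiveSketch {
    // Upper-case both sides before comparing, as the LIKE '%BAR%' query does
    // with the stored field.
    static boolean containsIgnoreCase(String field, String needle) {
        return field.toUpperCase(Locale.ROOT)
                    .contains(needle.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(containsIgnoreCase("fooBarBaz", "bar")); // true
        System.out.println(containsIgnoreCase("foobaz", "BAR"));    // false
    }
}
```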