This topic explains how to use indexes on equi-join queries using multiple regions in VMware Tanzu GemFire.
To query across multiple regions, identify all equi-join conditions. Then, create as few indexes for the equi-join conditions as you can while still joining all regions.
If there are equi-join conditions that redundantly join two regions (in order to more finely filter the data, for example), then creating redundant indexes for these joins will negatively impact performance. Create indexes only on one equi-join condition for each region pair.
In this example query:
SELECT DISTINCT *
FROM /investors inv, /securities sc, /orders or,
inv.ordersPlaced inv_op, or.securities or_sec
WHERE inv_op.orderID = or.orderID
AND or_sec.secID = sc.secID
All conditions are required to join the regions, so you would create four indexes, two for each equi-join condition:
FROM clause | Indexed expression |
---|---|
/investors inv, inv.ordersPlaced inv_op | inv_op.orderID |
/orders or, or.securities or_sec | or.orderID |
FROM clause | Indexed expression |
---|---|
/orders or, or.securities or_sec | or_sec.secID |
/securities sc | sc.secID |
Adding another condition to the example:
SELECT DISTINCT *
FROM /investors inv, /securities sc, /orders or,
inv.ordersPlaced inv_op, or.securities or_sec, sc.investors sc_invs
WHERE inv_op.orderID = or.orderID
AND or_sec.secID = sc.secID
AND inv.investorID = sc_invs.investorID
You would still only want to use four indexes in all, as that’s all you need to join all of the regions. You would need to choose the most performant two of the following three index pairs:
FROM clause | Indexed expression |
---|---|
/investors inv, inv.ordersPlaced inv_op | inv_op.orderID |
/orders or, or.securities or_sec | or.orderID |
FROM clause | Indexed expression |
---|---|
/orders or, or.securities or_sec | or_sec.secID |
/securities sc, sc.investors sc_invs | sc.secID |
FROM clause | Indexed expression |
---|---|
/investors inv, inv.ordersPlaced inv_op | inv.investorID |
/securities sc, sc.investors sc_invs | sc_invs.investorID |
The most performant set is that which narrows the data to the smallest result set possible. Examine your data and experiment with the three index pairs to see which provides the best performance.