Joins
The Cube SQL API currently does not support INNER JOIN
, LEFT JOIN
,
RIGHT JOIN
and FULL OUTER JOIN
. We plan to support these types of joins in
future releases.
The SQL API supports CROSS JOIN
s between cubes. When a CROSS JOIN
is being
processed by Cube, it generates the correct joining conditions for the
underlying data source.
For example, the following query joins the Orders
and Products
tables under
the hood with Orders.product_id = Products.id
, exactly the same way as the
REST API query does:
cube=> SELECT * FROM Orders CROSS JOIN Products LIMIT 5;
count | avgValue | totalValue | number | value | status | createdAt | completedAt | __user | count | name | description | createdAt | __user
-------+----------+------------+--------+-------+-----------+----------------------------+----------------------------+--------+-------+---------------------------+------------------------------------+----------------------------+--------
1 | 20 | 20 | 40 | 20 | completed | 2020-10-26 00:00:00.000000 | 2020-11-07 00:00:00.000000 | | 1 | Incredible Fresh Chicken | Electronics Generic Fresh Computer | 2020-06-29 00:00:00.000000 |
1 | 20 | 20 | 14 | 20 | completed | 2021-02-07 00:00:00.000000 | 2021-03-02 00:00:00.000000 | | 1 | Unbranded Wooden Mouse | Outdoors Incredible Rubber Car | 2019-07-16 00:00:00.000000 |
1 | 20 | 20 | 23 | 20 | completed | 2022-07-23 00:00:00.000000 | 2022-08-11 00:00:00.000000 | | 1 | Handcrafted Plastic Chair | Electronics Sleek Rubber Tuna | 2021-02-27 00:00:00.000000 |
1 | 20 | 20 | 86 | 20 | completed | 2023-04-19 00:00:00.000000 | 2023-04-25 00:00:00.000000 | | 1 | Practical Metal Chicken | Toys Awesome Frozen Chips | 2020-07-24 00:00:00.000000 |
1 | 20 | 20 | 27 | 20 | completed | 2019-06-27 00:00:00.000000 | 2019-07-21 00:00:00.000000 | | 1 | Sleek Rubber Chair | Computers Refined Cotton Shirt | 2021-09-26 00:00:00.000000 |
(5 rows)
In the resulting query plan, you won't see any joins as you can't see those for REST API queries either:
cube=> EXPLAIN SELECT * FROM Orders CROSS JOIN Products LIMIT 5;
plan_type | plan
---------------+-----------------------------
logical_plan | CubeScan: request={ +
| "measures": [ +
| "Orders.count", +
| "Orders.avgValue", +
| "Orders.totalValue", +
| "Orders.number", +
| "Products.count" +
| ], +
| "dimensions": [ +
| "Orders.value", +
| "Orders.status", +
| "Orders.createdAt", +
| "Orders.completedAt", +
| "Products.name", +
| "Products.description",+
| "Products.createdAt" +
| ], +
| "segments": [], +
| "limit": 5 +
| }
physical_plan | CubeScanExecutionPlan +
|
(2 rows)
This feature allows you to CROSS JOIN
cubes even with transitive joins only.
Typically, in tools that allow defining custom SQL datasets, you'd use joined tables as a dataset SQL. For example:
SELECT o.count as count, p.name as product_name, p.description as product_description
FROM Orders o
CROSS JOIN Products p;
Please note we use aliasing to avoid name clashing between cube members in a resulting data set. In this case, wrapped SQL will be properly processed by Cube, pushing down all operations to Cube query:
cube=> SELECT product_name, SUM(count) FROM (
SELECT o.count as count, p.name as product_name, p.description as product_description
FROM Orders o CROSS
JOIN Products p
) joined
GROUP BY 1
ORDER BY 2 DESC
LIMIT 5;
product_name | SUM(joined.count)
--------------------------+-------------------
Tasty Plastic Mouse | 121
Intelligent Cotton Ball | 119
Ergonomic Steel Tuna | 116
Intelligent Rubber Pants | 116
Generic Wooden Gloves | 116
(5 rows)
We can see this by introspecting the EXPLAIN
plan for this query:
cube=> EXPLAIN SELECT product_name, SUM(count) FROM (
SELECT o.count as count, p.name as product_name, p.description as product_description
FROM Orders o
CROSS JOIN Products p
) joined
GROUP BY 1
ORDER BY 2 DESC
LIMIT 5;
plan_type | plan
---------------+-----------------------
logical_plan | CubeScan: request={ +
| "measures": [ +
| "Orders.count" +
| ], +
| "dimensions": [ +
| "Products.name" +
| ], +
| "segments": [], +
| "order": [ +
| [ +
| "Orders.count",+
| "desc" +
| ] +
| ], +
| "limit": 5 +
| }
physical_plan | CubeScanExecutionPlan+
|
(2 rows)
Please note even if product_description
is in the inner selection, it isn't
evaluated in the final query as it isn't used in any way.
As an alternative to achieve joins it is also possible to define proxy dimension or measure inside the Cube.
cube(`Orders`, {
sql: `SELECT * FROM public.orders`,
joins: {
Users: {
relationship: `belongsTo`,
sql: `${CUBE}.user_id = ${Users}.id`,
},
},
measures: {
count: {
type: `count`,
},
},
dimensions: {
id: {
sql: `id`,
type: `number`,
primaryKey: true,
},
// this is proxy dimension
user_city: {
sql: `${Users.city}`,
type: `string`,
},
},
});
cube(`Users`, {
sql: `SELECT * FROM public.users`,
measures: {},
dimensions: {
id: {
sql: `id`,
type: `number`,
primaryKey: true,
},
city: {
sql: `city`,
type: `string`,
},
},
});
Now, it is possible to get orders count by users city with the following query.
cube=> SELECT count, user_city FROM Orders;
count | user_city
-------+---------------
9524 | New York
9408 | San Francisco
6360 | Mountain View
6262 | Seattle
4393 | Los Angeles
3183 | Chicago
3060 | Austin
1804 | Palo Alto
(8 rows)
Did you find this page useful?