Joins
The joins
parameter declares a block to define relationships between cubes. It
allows users to access and compare fields from two or more cubes at the same
time.
cube('MyCube', {
joins: {
TargetCubeName: {
relationship: `belongsTo` || `hasMany` || `hasOne` || `belongs_to` || `has_many` || `has_one`,
sql: `SQL ON clause`,
},
},
});
All joins are generated as LEFT JOIN
. The cube which defines the join serves
as a main table, and any cubes referenced inside the joins
property are used
in the LEFT JOIN
clause. Learn more about direction of joins
here.
The semantics of INNER JOIN
can be achieved with additional filtering. For
example, a simple check of whether the column value IS NOT NULL
by using set
filter satisfies this requirement.
There's also no way to define FULL OUTER JOIN
and RIGHT OUTER JOIN
for the
sake of join modeling simplicity. To get RIGHT OUTER JOIN
semantics just
define join from other side of relationship.
The FULL OUTER JOIN
can be built inside cube sql
parameter. Quite frequently, FULL OUTER JOIN
is used to solve Data
Blending or similar problems. In that case, it's best
practice to have a separate cube for such an operation.
The name of a join should match the name of the external
cube. For example when a Products
cube is being
joined on to an Orders
cube, we would define the join as follows:
cube('Orders', {
joins: {
Products: {
relationship: `belongsTo`,
sql: `${CUBE.id} = ${Products.orderId}`,
},
},
});
relationship
enables you to describe the join relationship between joined
cubes. It’s important to properly define the type of relationship in order for
Cube to accurately calculate measures. The relationship does not need to be
defined on both cubes, but the definition can affect the join
direction.
It is very important to define the correct order of cubes in a join. It affects data in the result-set greatly. The basic cube represents the left entity in a join, all others would be right. That means that all rows of the left cube are selected, while rows of the right depend on the condition. For more information and specific examples, please read about join directions here.
The three possible values for a relationship are:
hasOne
A hasOne
relationship indicates a one-to-one connection with another cube.
This relationship indicates that the one row in the cube can match only one row
in the joined cube. For example, in a model containing users and user profiles,
the users cube would have the following join:
cube('Users', {
joins: {
Profile: {
relationship: `hasOne`,
sql: `${CUBE}.id = ${Profile.user_id}`,
},
},
});
hasMany
A hasMany
relationship indicates a one-to-many connection with another cube.
You'll often find this relationship on the "other side" of a belongsTo
relationship. This relationship indicates that the one row in the cube can match
many rows in the joined cube. For example, in a model containing authors and
books, the authors cube would have the following join:
cube('Authors', {
joins: {
Books: {
relationship: `hasMany`,
sql: `${CUBE}.id = ${Books.author_id}`,
},
},
});
belongsTo
A belongsTo
relationship indicates a many-to-one connection with another cube.
You’ll often find this relationship on the “other side” of a hasMany
relationship. This relationship indicates that the one row of the declaring cube
matches a row in the joined instance, while the joined instance can have many
rows in the declaring cube. For example, in a model containing orders and
customers, the orders cube would have the following join:
cube('Orders', {
joins: {
Customers: {
relationship: `belongsTo`,
sql: `${CUBE}.customer_id = ${Customers.id}`,
},
},
});
sql
is necessary to indicate a related column between cubes. It is important
to properly specify a matching column when creating joins. Take a look at the
example below:
cube('Orders', {
joins: {
Customers: {
relationship: `belongsTo`,
// The `customer_id` field on `Orders` corresponds to the
// `id` field on `Customers`
sql: `${CUBE}.customer_id = ${Customers.id}`,
},
},
});
In order for a join to work, it is necessary to define a primaryKey
as
specified below. It is a requirement when a join is defined so that Cube can
handle row multiplication issues.
Let's imagine you want to calculate Order Amount
by Order Item Product Name
.
In this case, Order
rows will be multiplied by the Order Item
join due to
the hasMany
relationship. In order to produce correct results, Cube will
select distinct primary keys from Order
first and then will join these primary
keys with Order
to get the correct Order Amount
sum result. Please note that
primaryKey
should be defined in the dimensions
section.
cube('Orders', {
dimensions: {
customerId: {
sql: `id`,
type: `number`,
primaryKey: true,
},
},
});
Setting primaryKey
to true
will change the default value of the shown
parameter to false
. If you still want shown
to be true
— set it manually.
cube('Orders', {
dimensions: {
customerId: {
sql: `id`,
type: `number`,
primaryKey: true,
shown: true,
},
},
});
If you don't have a single column in a cube's table that can act as a primary key, you can create a composite primary key as shown below.
The example uses Postgres string concatenation; note that SQL may be different depending on your database.
cube('Users', {
dimensions: {
id: {
sql: `${CUBE}.user_id || '-' || ${CUBE}.signup_week || '-' || ${CUBE}.activity_week`,
type: `string`,
primaryKey: true,
},
},
});
When you have several joined cubes, you should accurately use columns’ names to
avoid any mistakes. One way to make no mistakes is to use the ${CUBE}
reference. It allows you to specify columns’ names in cubes without any
ambiguity. During the implementation of the query, this reference will be used
as an alias for a basic cube. Take a look at the following example:
cube('Users', {
dimensions: {
name: {
sql: `${CUBE}.name`,
type: `string`,
},
},
});
Join graph is directed and A-B
join is different from B-A
. Learn more about
it here.
Cube automatically takes care of transitive joins. For example if you have the following schema:
cube(`A`, {
joins: {
B: {
sql: `${A}.b_id = ${B}.id`,
relationship: `belongsTo`,
},
},
measures: {
count: {
type: `count`,
},
},
});
cube(`B`, {
joins: {
C: {
sql: `${B}.c_id = ${C}.id`,
relationship: `belongsTo`,
},
},
});
cube(`C`, {
dimensions: {
category: {
sql: `category`,
type: `string`,
},
},
});
and the following query:
{
"measures": ["A.count"],
"dimensions": ["C.category"]
}
Joins A-B
and B-C
will be resolved automatically. Cube uses Dijkstra
algorithm to find join path between cubes given requested
members.
Did you find this page useful?