Calculated measures and dimensions
Often, dimensions are mapped to table columns and measures are defined as aggregations of top of table columns. However, measures and dimensions can also reference other members of the same or other cubes, use SQL expressions, and perform calculations involving other measures and dimensions.
Most common patterns are known as calculated measures, proxy dimensions, and subquery dimensions.
Calculated measures
Calculated measures perform calculations on other measures using SQL functions and operators. They provide a way to decompose complex measures (e.g., ratios or percents) into formulas that involve simpler measures. Also, calculated measures can help to use non-additive measures with pre-aggregations.
In the following example, the completed_ratio
measure is calculated as a division of
completed_count
by total count
. Note that the result is also multiplied by 1.0
since integer division in SQL (opens in a new tab) would otherwise produce an
integer value.
cubes:
- name: orders
sql: >
SELECT 1 AS id, 'processing' AS status UNION ALL
SELECT 2 AS id, 'completed' AS status UNION ALL
SELECT 3 AS id, 'completed' AS status
measures:
- name: count
type: count
- name: completed_count
type: count
filters:
- sql: "{CUBE}.status = 'completed'"
- name: completed_ratio
sql: "1.0 * {completed_count} / {count}"
type: number
If you query for completed_ratio
, Cube will generate the following SQL:
SELECT
1.0 * COUNT(
CASE WHEN ("orders".status = 'completed') THEN 1 END
) / COUNT(*) "orders__completed_ratio"
FROM (
SELECT 1 AS id, 'processing' AS status UNION ALL
SELECT 2 AS id, 'completed' AS status UNION ALL
SELECT 3 AS id, 'completed' AS status
) AS "orders"
Proxy dimensions
Proxy dimensions reference dimensions from other cubes. Proxy dimensions provide a way to reference some dimensions as if they were defined in multiple cubes.
If you have first_cube
that is joined to second_cube
, you can use a
proxy dimension to bring second_cube.dimension
to first_cube
as dimension
(or
under a different name). When you query for a proxy dimension, Cube will transparently
generate SQL with necessary joins.
In the following example, orders.user_name
is a proxy dimension that brings the
users.name
dimension to orders
. You can also see that there's a join relationship
between orders
and users
:
cubes:
- name: orders
sql: >
SELECT 1 AS id, 1 AS user_id UNION ALL
SELECT 2 AS id, 1 AS user_id UNION ALL
SELECT 3 AS id, 2 AS user_id
dimensions:
- name: id
sql: id
type: number
primary_key: true
- name: user_name
sql: "{users.name}"
type: string
measures:
- name: count
type: count
joins:
- name: users
sql: "{users}.id = {orders}.user_id"
relationship: one_to_many
- name: users
sql: >
SELECT 1 AS id, 'Alice' AS name UNION ALL
SELECT 2 AS id, 'Bob' AS name
dimensions:
- name: name
sql: name
type: string
If you query for orders.user_name
and orders.count
, Cube will generate the
following SQL:
SELECT
"users".name "orders__user_name",
COUNT(DISTINCT "orders".id) "orders__count"
FROM (
SELECT 1 AS id, 1 AS user_id UNION ALL
SELECT 2 AS id, 1 AS user_id UNION ALL
SELECT 3 AS id, 2 AS user_id
) AS "orders"
LEFT JOIN (
SELECT 1 AS id, 'Alice' AS name UNION ALL
SELECT 2 AS id, 'Bob' AS name
) AS "users" ON "users".id = "orders".user_id
GROUP BY 1
Note that if you query for orders.user_name
only, Cube will figure out that it's
equivalent to querying just users.name
and there's no need to generate a join in SQL:
SELECT
"users".name "orders__user_name"
FROM (
SELECT 1 AS id, 'Alice' AS name UNION ALL
SELECT 2 AS id, 'Bob' AS name
) AS "users"
GROUP BY 1
Subquery dimensions
Subquery dimensions reference measures from other cubes. Subquery dimensions provide a way to define measures that aggregate values of other measures. They can be useful to calculate nested aggregates.
If you have first_cube
that is joined to second_cube
, you can use a
subquery dimension to bring second_cube.measure
to first_cube
as dimension
(or
under a different name). When you query for a subquery dimension, Cube will
transparently generate SQL with necessary joins. It works as a correlated
subquery (opens in a new tab) but is implemented via joins for optimal
performance and portability.
In the following example, users.order_count
is a subquery dimension that brings the
orders.count
measure to users
. Note that the sub_query
parameter
is set to true
on users.order_count
. You can also see that there's a join
relationship between orders
and users
:
cubes:
- name: orders
sql: >
SELECT 1 AS id, 1 AS user_id UNION ALL
SELECT 2 AS id, 1 AS user_id UNION ALL
SELECT 3 AS id, 2 AS user_id
dimensions:
- name: id
sql: id
type: number
primary_key: true
measures:
- name: count
type: count
joins:
- name: users
sql: "{users}.id = {orders}.user_id"
relationship: one_to_many
- name: users
sql: >
SELECT 1 AS id, 'Alice' AS name UNION ALL
SELECT 2 AS id, 'Bob' AS name
dimensions:
- name: id
sql: id
type: number
primary_key: true
- name: name
sql: name
type: string
- name: order_count
sql: "{orders.count}"
type: number
sub_query: true
measures:
- name: avg_order_count
sql: "{order_count}"
type: avg
You can reference subquery dimensions in measures just like usual dimensions. In the
example above, the avg_order_count
measure performs an aggregation on order_count
.
If you query for users.name
and users.order_count
, Cube will generate the
following SQL:
SELECT
"users".name "users__name",
"users__order_count" "users__order_count"
FROM (
SELECT 1 AS id, 'Alice' AS name UNION ALL
SELECT 2 AS id, 'Bob' AS name
) AS "users"
LEFT JOIN (
SELECT
"users_order_count_subquery__users".id "users__id",
count(distinct "users_order_count_subquery__orders".id) "users__order_count"
FROM (
SELECT 1 AS id, 1 AS user_id UNION ALL
SELECT 2 AS id, 1 AS user_id UNION ALL
SELECT 3 AS id, 2 AS user_id
) AS "users_order_count_subquery__orders"
LEFT JOIN (
SELECT 1 AS id, 'Alice' AS name UNION ALL
SELECT 2 AS id, 'Bob' AS name
) AS "users_order_count_subquery__users" ON "users_order_count_subquery__users".id = "users_order_count_subquery__orders".user_id
GROUP BY 1
) AS "users_order_count_subquery" ON "users_order_count_subquery"."users__id" = "users".id
GROUP BY 1, 2