Matching pre-aggregations

Matching queries with pre-aggregations

When executing a query, Cube will try to match and fulfill it with the best available pre-aggregation.

Since pre-aggregations contain a condensed representation of the data from the upstream data source (rather than a copy of that data), Cube needs to ensure that fulfilling a query with a pre-aggregation is possible and doing so will produce correct results.

If there's no matching pre-aggregation, Cube will fall back to querying the upstream data source, unless the rollup-only mode is enabled.

If you don't know why a query doesn't match a pre-aggregation, check common pitfalls first.

Eligible pre-aggregations

Cube will search for matching pre-aggregations in all cubes that define members in the query.

Pre-aggregations are tested in the order they are defined in the data model file. However, rollup pre-aggregations are tested before original_sql pre-aggregations.

The first pre-aggregation that matches a query is be used.

Matching algorithm

Cube goes through the following steps to determine whether a query matches a particular eligible pre-aggregation:

See the details for each step:

  • Is query leaf-measure additive? Cube checks that all leaf measures in the query are additive. If the query contains calculated measures (e.g., measures defined as {sum} / {count}), then referenced leaf measures will be checked for additivity.
  • Does every member of the query exist in the pre-aggregation? Cube checks that the pre-aggregation contains all dimensions, filter dimensions, and leaf measures from the query.
  • Are any query measures multiplied in the cube's data schema? Cube checks if any measures are multiplied via a one_to_many relationship between cubes in the query.
  • Does the query specify granularity for its time dimension? Cube checks that the time dimension granularity is set in the query.
  • Are query filter dimensions included in its own dimensions? Cube checks that all filter dimensions are also included as dimensions in the query.
  • Does every member in the query exist in the pre-aggregation? Cube checks that the pre-aggregation contains all dimensions and measures used in the query.

Matching time dimensions

There are extra considerations that apply to matching time dimensions.

  • Time dimension and granularity in the query together act as a dimension. If the date range isn't aligned with granularity, a common granularity is used. This common granularity is selected using the greatest common divisor (opens in a new tab) across both the query and pre-aggregation. For example, the common granularity between hour and day is hour because both hour and day can be divided by hour.
  • The query's granularity's date range must match the start date and end date from time dimensions. For example, when using a granularity of month, the values should be the start and end days of the month, i.e., ['2020-01-01T00:00:00.000', '2020-01-31T23:59:59.999']; when the granularity is day, the values should be the start and end hours of the day, i.e., ['2020-01-01T00:00:00.000', '2020-01-01T23:59:59.999']. Date ranges are inclusive, and the minimum granularity is second. Use the allow_non_strict_date_range_match to allow a pre-aggregation to match a non-strict date range anyway.
  • The time zone in the query must match the time zone of a pre-aggregation. You can configure a list of time zones that pre-aggregations will be built for using the scheduled_refresh_time_zones configuration option.

Matching ungrouped queries

There are extra considerations that apply to matching ungrouped queries:

  • The pre-aggregation should include primary keys of all cubes involved in the query.
  • If multiple cubes are referenced in the query, the pre-aggregation should include only members of these cubes.

Common pitfalls