Lambda Pre-Aggregations
Lambda pre-aggregations follow the Lambda architecture design to union real-time and batch data. Cube acts as the serving layer: it uses pre-aggregations as the batch layer, and either source data or other pre-aggregations (usually streaming ones) as the speed layer.
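To make the idea of a union between the batch layer and the speed layer concrete, here is a minimal conceptual sketch in plain JavaScript. It is not Cube's internal implementation: the unionLambda function, the row shape, and the exact boundary handling (batch rows strictly before the cutoff, speed rows from the cutoff onward) are all assumptions made for illustration.

```javascript
// Conceptual sketch only; this is not how Cube is implemented internally.
// Assumption: batch rows are trusted before `batchEnd`, and the speed
// layer supplies every row from `batchEnd` onward.
function unionLambda(batchRows, speedRows, batchEnd) {
  const cutoff = new Date(batchEnd).getTime();
  const fromBatch = batchRows.filter(
    (row) => new Date(row.day).getTime() < cutoff
  );
  const fromSpeed = speedRows.filter(
    (row) => new Date(row.day).getTime() >= cutoff
  );
  return [...fromBatch, ...fromSpeed];
}

// Hypothetical daily counts.
const batchRows = [
  { day: '2022-05-28', count: 10 },
  { day: '2022-05-29', count: 12 },
];
const speedRows = [
  { day: '2022-05-29', count: 13 }, // inside the batch window, so it is dropped
  { day: '2022-05-30', count: 4 },
];

const merged = unionLambda(batchRows, speedRows, '2022-05-30');
console.log(merged);
```

In this sketch each day is served by exactly one layer, so a row that appears in both layers is never counted twice.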
Lambda pre-aggregations only work with Cube Store.
Additionally, we're going to remove support for external storages other than Cube Store later this year. Cube Store will replace Redis and will therefore be a required component to run Cube, even without pre-aggregations.
The sections below cover the most common ways to use lambda pre-aggregations.
Batch and source data
In this scenario, batch data comes from a pre-aggregation and real-time data comes from the data source.

First, you need to create a pre-aggregation that will contain your batch data. In the following example, we call it batch. Please note that it must have a timeDimension, which Cube will use to union batch data with source data.
You control the batch part of your data with the buildRangeStart and buildRangeEnd properties of the pre-aggregation, which determine a specific window for your batched data.
Next, you need to create a lambda pre-aggregation. To do that, create a pre-aggregation with type rollupLambda, specify the rollups you would like to use with the rollups property, and finally set unionWithSourceData: true to use source data as the real-time layer.
Please make sure that the lambda pre-aggregation definition comes first when defining your pre-aggregations.
cube('Users', {
  ...,
  preAggregations: {
    // The lambda pre-aggregation must be defined first
    lambda: {
      type: `rollupLambda`,
      // Use source data as the real-time layer
      unionWithSourceData: true,
      rollups: [Users.batch]
    },
    // Batch layer: a rollup limited to a fixed build range
    batch: {
      measures: [Users.count],
      dimensions: [Users.name],
      timeDimension: Users.createdAt,
      granularity: `day`,
      buildRangeStart: {
        sql: `SELECT '2020-01-01'`
      },
      buildRangeEnd: {
        sql: `SELECT '2022-05-30'`
      }
    }
  },
});
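As a usage illustration, a query of roughly the following shape asks for the same members that the batch rollup above contains, so it is a candidate for being served by the lambda pre-aggregation. This is a sketch under assumptions: the date range is hypothetical and was chosen only so that it spans the end of the batch window, and whether a given query is actually matched depends on Cube's pre-aggregation matching rules.

```javascript
// Illustrative query in Cube's JSON query format.
// The member names come from the Users cube; the dateRange is hypothetical.
const query = {
  measures: ['Users.count'],
  dimensions: ['Users.name'],
  timeDimensions: [
    {
      dimension: 'Users.createdAt',
      granularity: 'day',
      // Spans the end of the batch window (2022-05-30)
      dateRange: ['2022-05-01', '2022-06-15'],
    },
  ],
};

console.log(JSON.stringify(query, null, 2));
```

Days up to the end of the build range would come from the batch rollup, and later days would come from source data.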
Batch and streaming data
In this scenario, batch data comes from one pre-aggregation and real-time data comes from a streaming pre-aggregation.

You can use lambda pre-aggregations to combine data from multiple pre-aggregations, where one pre-aggregation holds batch data and another holds streaming data.
// This cube uses a streaming SQL data source such as ksqlDB
cube('StreamingUsers', {
  ...,
  dataSource: 'ksql',
});

// This cube uses a data source such as ClickHouse or BigQuery
cube('Users', {
  ...,
  preAggregations: {
    // The lambda pre-aggregation unions the batch and streaming rollups
    batchStreamingLambda: {
      type: `rollupLambda`,
      rollups: [Users.batch, Users.streaming]
    },
    // Batch layer: a rollup limited to a fixed build range
    batch: {
      type: `rollup`,
      measures: [Users.count],
      dimensions: [Users.name],
      timeDimension: Users.createdAt,
      granularity: `day`,
      buildRangeStart: {
        sql: `SELECT '2020-01-01'`
      },
      buildRangeEnd: {
        sql: `SELECT '2022-05-30'`
      }
    },
    // Speed layer: a rollup built from the streaming cube's members
    streaming: {
      type: `rollup`,
      measures: [StreamingUsers.count],
      dimensions: [StreamingUsers.name],
      timeDimension: StreamingUsers.createdAt,
      granularity: `day`
    }
  },
});