Cube Core v1.5 — Performance, calendar cubes, SQL API over HTTP

Cube Core version 1.5 is the latest release to date. It includes, among other goodies, improvements to the performance of data model compilation, introduces calendar cubes in preview, and extends APIs with the new HTTP transport for the SQL API as well as explicit cache control options for REST API endpoints.

This release does not contain breaking changes. However, before upgrading, it is recommended that you check the following notes and adjust your data model and configuration if necessary:

Pre-aggregation matching with respect to join paths.
DuckDB v1.4.1 upgrade, affecting demo deployments in Cube Cloud.
NPM proxy settings no longer respected.

As always, we recommend testing the new version on a staging environment before deploying it to production.

New in data modeling

Performance boost to data model compilation

Cube provides a powerful data modeling layer that supports both YAML-based and JavaScript-based syntax and allows executing Python and JavaScript code as well as using Jinja templates. This enables programmatic generation, code reusability, and flexible multitenancy. However, it also complicates the process of data model compilation.

In this release, the performance of data model compilation has been significantly improved, with an approximately 2-3x compilation speedup, lower memory consumption, and better utilization of multiple CPU cores.

The performance increase is even more pronounced in Cube Cloud, which uses a different runtime for Cube, with up to 30x speedup for large data models.

You can read a full technical write-up about the performance optimization by Konstantin Burkalev in his blog.

Also, note that the CUBEJS_TRANSPILATION_WORKER_THREADS environment variable is now set to true by default.

Improvements to views

View members can now override the title, description, meta, and format parameters, allowing you to customize the presentation of cube members in different views, as shown in the code example on GitHub.

View members can now be organized in nested folders, which complements the previously introduced support for non-nested, flat folders in views. While they are mostly useful for Microsoft Excel and Microsoft Power BI users in Cube Cloud, nested folders can also be used by custom front-end applications built on top of Cube Core. See the support for data modeling features in the documentation.

Also, note that you can still define nested folders even if your visualization tools do not support them. Cube provides two flattening mechanisms and the CUBEJS_NESTED_FOLDERS_DELIMITER environment variable to control them.
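To make the idea of flattening more concrete, here is a minimal sketch of two ways a nested folder tree could be presented to a tool that only understands flat folders. It is an illustration only, not Cube's implementation: the dictionary shape, the function names, and the " / " delimiter are assumptions, and the documentation remains the reference for how the two built-in mechanisms actually behave.

```python
def flatten_with_delimiter(folders, delimiter, prefix=""):
    """Turn nested folders into flat ones named by their delimiter-joined path."""
    flat = []
    for folder in folders:
        name = f"{prefix}{delimiter}{folder['name']}" if prefix else folder["name"]
        # Plain strings are members; dictionaries are nested folders.
        members = [m for m in folder["includes"] if isinstance(m, str)]
        nested = [m for m in folder["includes"] if isinstance(m, dict)]
        flat.append({"name": name, "members": members})
        flat.extend(flatten_with_delimiter(nested, delimiter, name))
    return flat


def flatten_by_merging(folders):
    """Collapse every nested folder into its top-level ancestor."""
    def collect(folder):
        members = []
        for m in folder["includes"]:
            members.extend(collect(m) if isinstance(m, dict) else [m])
        return members

    return [{"name": f["name"], "members": collect(f)} for f in folders]


# Hypothetical view folders: one top-level folder with one nested folder.
folders = [
    {
        "name": "Customer",
        "includes": [
            "customers.name",
            {"name": "Address", "includes": ["customers.city", "customers.country"]},
        ],
    }
]

by_path = flatten_with_delimiter(folders, " / ")
merged = flatten_by_merging(folders)
```

With the first approach, the nested folder survives as a separate flat folder named after its path; with the second, its members are absorbed into the parent folder.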

Improvements to YAML and Python support

We've streamlined imports in Python code, removing the necessity for tricks with PYTHONPATH. You can check a Python example in the documentation.

We've added support for {cube.sql()} syntax to YAML-based data models, allowing for better data model code reusability, especially when combined with extension. It also simplifies defining polymorphic cubes and using data blending.

Finally, we've added support for escaping curly braces in YAML-based data models, which makes it easier to use them with JSON literals in SQL queries and similar scenarios.

Improvements to Tesseract (preview)

Tesseract, the next-generation data modeling engine, is still in preview but has substantially expanded its capabilities. It is recommended to give it a try by setting the CUBEJS_TESSERACT_SQL_PLANNER environment variable to true. Please report on GitHub any issues you encounter to help us improve Tesseract further.

Tesseract now supports all major data sources. It also supports custom time dimension granularities and rolling window measures with custom granularities without an explicitly specified date range. Finally, Tesseract supports internal rate of return (XIRR) calculations and new switch dimensions and case measures (to be documented) for advanced data modeling use cases.

The calendar cubes feature, powered by Tesseract, has been introduced to provide a native way to define and use custom calendars, simplify time-shift calculations, and provide better control over time dimension granularities.

Please review the multi-stage calculations and calendar cubes documentation to familiarize yourself with Tesseract's capabilities.

New in data source support

Athena now supports providing a default database to use via the CUBEJS_DB_NAME environment variable. Athena has also received support for custom granularities.

Databricks now supports service principal access with OAuth, configured via the CUBEJS_DB_DATABRICKS_OAUTH_CLIENT_ID and CUBEJS_DB_DATABRICKS_OAUTH_CLIENT_SECRET environment variables. Also, Databricks connection checking has been improved to avoid waking up SQL warehouses unnecessarily.

DuckDB has been upgraded to version 1.4.1. Also, DuckDB now supports installing and loading DuckDB Community Extensions via the CUBEJS_DB_DUCKDB_COMMUNITY_EXTENSIONS environment variable. Finally, DuckDB can now use the default credential provider chain for S3 access when the CUBEJS_DB_DUCKDB_S3_USE_CREDENTIAL_CHAIN environment variable is set to true.

Note that the httpfs extension used by this DuckDB version now constructs S3 URLs differently, leading to the following errors in previously created demo deployments in Cube Cloud: Error: IO Error: Unknown error for HTTP HEAD to 'https://cube-tutorial.s3.us-east-1.amazonaws.com/orders.csv'. You can fix this issue by setting the AWS_DEFAULT_REGION environment variable to us-east-2 in these deployments.

Presto now supports export buckets on AWS S3 in addition to previously supported export buckets on GCS. Presto has also received support for custom granularities. Finally, custom authentication headers can now be specified via the new CUBEJS_DB_PRESTO_AUTH_TOKEN environment variable.

Snowflake can now use the new CUBEJS_DB_SNOWFLAKE_OAUTH_TOKEN environment variable to provide an OAuth token for authentication, complementing previously available CUBEJS_DB_SNOWFLAKE_OAUTH_TOKEN_PATH. It also works with deeply nested paths for export buckets, specified via CUBEJS_DB_EXPORT_BUCKET. Finally, the queryTag parameter can be passed to enable enhanced monitoring in Snowflake.

Trino now uses a custom Trino-specific connection check. Also, custom authentication headers can now be specified via the new CUBEJS_DB_PRESTO_AUTH_TOKEN environment variable.

The Amazon Redshift, MS SQL Server, and SQLite drivers have received a few bug fixes.

New in APIs and client libraries

REST API

The /v1/load and /v1/cubesql endpoints of the REST API now accept the new cache parameter, which allows you to specify the caching strategy that Cube should use when fulfilling requests. A related cache parameter has also been added to the JavaScript client library.
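As a rough illustration of the cache parameter, the following sketch builds a /v1/load request without sending it. Several details are assumptions rather than facts from this post: that cache sits next to query in the JSON body, that "no-cache" is one of the accepted strategy names, that the API is served under http://localhost:4000/cubejs-api, and the placeholder token. Please check the REST API reference for the exact parameter placement and values.

```python
import json
import urllib.request


def build_load_request(base_url, token, query, cache=None):
    """Build (but do not send) a POST request for the /v1/load endpoint."""
    body = {"query": query}
    if cache is not None:
        # Assumption: the caching strategy is passed alongside the query.
        body["cache"] = cache
    return urllib.request.Request(
        url=f"{base_url}/v1/load",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": token},
        method="POST",
    )


request = build_load_request(
    "http://localhost:4000/cubejs-api",  # assumed local deployment URL
    "HYPOTHETICAL_JWT",                  # placeholder, not a real token
    {"measures": ["orders.count"]},      # hypothetical cube and measure
    cache="no-cache",                    # assumed strategy name
)

# To actually send it: urllib.request.urlopen(request)
```

Keeping the request construction separate from the network call makes it easy to inspect the payload before pointing it at a real deployment.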

The REST API query format now supports specifying the granularity for time dimensions in the order property of the query. This is useful when the same time dimension is used multiple times in the query with different granularities.

Also, beforeOrOnDate and afterOrOnDate filter operators of the REST API have now been documented.

SQL API

The SQL API has been complemented with a new HTTP transport, supported by the /v1/cubesql API endpoint. It allows you to execute SQL API queries over HTTP and get a streaming response. This can be useful for custom front-end applications, embedded analytics scenarios, and AI-based experiences. A related cubeSql method has also been added to the JavaScript client library.
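A client for the HTTP transport might look roughly like the sketch below, which prepares a request body and reassembles a streamed response. The framing is an assumption: it supposes the body carries the SQL text under a query key and that the response arrives as newline-delimited JSON chunks with schema and data keys. The sample chunks are made up for illustration; consult the /v1/cubesql reference for the actual format.

```python
import json


def build_cubesql_body(sql):
    """Encode a SQL API query as a JSON request body (assumed shape)."""
    return json.dumps({"query": sql}).encode("utf-8")


def parse_stream(lines):
    """Reassemble newline-delimited JSON chunks into (schema, rows)."""
    schema, rows = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank keep-alive or trailing lines
        chunk = json.loads(line)
        if "schema" in chunk:
            schema = chunk["schema"]
        if "data" in chunk:
            rows.extend(chunk["data"])
    return schema, rows


body = build_cubesql_body("SELECT status FROM orders GROUP BY 1")

# Hypothetical streamed response, one JSON document per line.
sample_stream = [
    '{"schema": [{"name": "status", "column_type": "String"}]}',
    '{"data": [["completed"], ["shipped"]]}',
    "",
    '{"data": [["processing"]]}',
]
schema, rows = parse_stream(sample_stream)
```

Because parse_stream accepts any iterable of lines, it can be fed directly from a streaming HTTP response object, so rows can be processed as they arrive rather than after the whole result has been buffered.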

Finally, the SQL API has received various compatibility improvements, including better support for DBeaver, DataGrip, SQLAlchemy, Tableau, and other BI tools. It now supports cursors in the streaming mode and ignores transaction-related statements such as SAVEPOINT, ROLLBACK, and RELEASE.

New in pre-aggregations

The pre-aggregation matching algorithm has been improved to use the join tree structure of the query and pre-aggregations. Pre-aggregations now not only allow you to specify join paths for their members but also implement stricter checks that prevent incorrect matching when a query has joins that are not covered by the pre-aggregation's join tree.

After the upgrade, some of the queries might stop using pre-aggregations due to stricter matching. It is recommended to review joins defined in the data model and add join paths to pre-aggregations if necessary.
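The intuition behind the stricter check can be sketched with a toy model in which a join tree is just a set of edges between cubes and a pre-aggregation matches only if it covers every join the query needs. This is not Cube's actual matching algorithm; the dotted path notation, the function names, and the cube names are assumptions made for the example.

```python
def join_edges(join_paths):
    """Collect the (left, right) join edges implied by dotted cube paths."""
    edges = set()
    for path in join_paths:
        cubes = path.split(".")
        # "a.b.c" implies the joins a -> b and b -> c.
        edges.update(zip(cubes, cubes[1:]))
    return edges


def can_use_pre_aggregation(query_paths, pre_aggregation_paths):
    """Match only when every join in the query is covered by the rollup."""
    return join_edges(query_paths) <= join_edges(pre_aggregation_paths)


# Hypothetical rollup built over orders joined to customers.
rollup_paths = ["orders.customers"]

covered = can_use_pre_aggregation(["orders.customers"], rollup_paths)
not_covered = can_use_pre_aggregation(["orders.line_items.products"], rollup_paths)
```

In this model, the second query falls back to the data source because it relies on joins the rollup never saw, which mirrors why some queries may stop matching after the upgrade until join paths are added to the pre-aggregation.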

New in query orchestrator

Some performance optimizations have been introduced to reduce the number of refresh key queries and improve the overall reliability of the query queue.

New in Cube Store

Various performance optimizations have been made to Cube Store, including debouncing information schema queries and reducing memory usage.

Also, substantial work has been done to support common table expressions (CTEs) and joins in Cube Store queries, enabling the use of pre-aggregations for multi-stage calculations powered by Tesseract. This work is not included in the current release but will ship in upcoming versions.

New in configuration

NPM proxy settings are no longer respected; only the common HTTP_PROXY and/or HTTPS_PROXY environment variables are used.
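Before upgrading, it may help to check whether a deployment relies on npm-style proxy settings alone. The sketch below inspects only environment variables and assumes that npm proxy settings were supplied through npm_config_proxy or npm_config_https_proxy; settings kept in an .npmrc file are not covered, and the function name is hypothetical.

```python
import os

NPM_PROXY_VARS = ("npm_config_proxy", "npm_config_https_proxy")
STANDARD_PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY")


def proxy_upgrade_warnings(env):
    """Warn when npm proxy variables are set but the standard ones are not."""
    npm_set = [name for name in NPM_PROXY_VARS if env.get(name)]
    standard_set = [name for name in STANDARD_PROXY_VARS if env.get(name)]
    if npm_set and not standard_set:
        return [
            f"{name} is set, but neither HTTP_PROXY nor HTTPS_PROXY is defined"
            for name in npm_set
        ]
    return []


# Check the current process environment.
for warning in proxy_upgrade_warnings(os.environ):
    print(warning)
```

An empty result only means that no npm proxy variables were found without a standard counterpart; it does not prove that the proxy configuration is otherwise correct.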

New in documentation

The page about joins between cubes has been significantly expanded with new sections about join trees, join paths, and join hints. It is recommended to review this page to get a better understanding of how joins work in Cube.

Version changes

This release also upgrades Node.js, used internally by Cube, to v22.20. Python is also upgraded to v3.15.5.

What’s next?

We're looking forward to your feedback on this release! Please join our Slack community to share your thoughts and ask any questions.

We are already working on the next release, which will include further improvements to Tesseract and Cube Store.

And don't hesitate to try the new release in our managed cloud offering, which provides a free plan.