Deployment types

Cube Cloud provides you with three deployment types:

Development instance: designed for development use cases.
Production cluster: designed for production workloads and high availability.
Production multi-cluster: designed for demanding production workloads, high scalability, high availability, and advanced multi-tenancy configurations.

Development instance

Development instances are available in Cube Cloud for free, with no credit card required. The free trial is limited to 2 development instances and 1,000 queries per day. Upgrade to any paid product tier to unlock all features.

Development instances are designed for development use cases only. This makes it easy to get started with Cube Cloud quickly, and also allows you to build and query pre-aggregations on-demand.

Development instances don't have dedicated refresh workers and, consequently, they do not refresh pre-aggregations on schedule.

Development instances do not provide high availability and do not guarantee fast response times. They auto-suspend after 30 minutes of inactivity, so the first request after an instance wakes up can take additional time to process. They also have limits on the maximum number of queries per day and the maximum number of Cube Store Workers. We strongly advise against using a development instance in a production environment: it is intended only for testing and learning about Cube and will not deliver a production-level experience for your users.

You can try a development instance for free by signing up for Cube Cloud (no credit card required).

Production cluster

Production cluster is available in Cube Cloud on all paid product tiers. You can also choose a deployment tier.

Production clusters are designed to support high-availability production workloads. A production cluster consists of several key components, starting with 2 Cube API instances, 1 Cube Refresh Worker, and 2 Cube Store Routers, all of which run on dedicated infrastructure. The cluster can automatically scale to meet the needs of your workload by adding more components as necessary; check the page on scalability to learn more.

Production multi-cluster

Production multi-cluster deployments are designed for demanding production workloads, high scalability, high availability, and large multi-tenancy configurations, e.g., with more than 100 tenants.

Production multi-cluster is available in Cube Cloud on Premium and above product tiers.

It provides you with two options:

Scale the number of production cluster deployments serving your workload, so that requests can be routed across up to 10 production clusters and up to 100 API instances.
Optionally, scale the number of Cube Store Routers to increase Cube Store querying performance.

High-level architecture diagram of a Cube Cloud Production Multi-Cluster

Each production cluster is billed separately, and all production clusters can use auto-scaling to match demand.

Configuring production multi-cluster

To switch your Cube Cloud deployment to production multi-cluster, navigate to Settings → General, select it under Type, and confirm with ✓.

To set the number of production clusters within your production multi-cluster deployment, navigate to Settings → Configuration and edit Number of clusters.

Routing traffic between production clusters

Cube Cloud routes requests between multiple production clusters within a production multi-cluster deployment based on context_to_app_id. In most cases, it should return an identifier that does not change over time for each tenant.
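As a mental model only, the effect of routing on a stable identifier can be sketched in plain Python. Cube Cloud's actual routing logic is internal and is not described on this page; the `pick_cluster` function and the hash-modulo scheme below are hypothetical and serve only to show that a deterministic mapping from an unchanging identifier always selects the same cluster.

```python
import hashlib


def pick_cluster(app_id: str, num_clusters: int) -> int:
    """Toy router (an assumption, not Cube Cloud's algorithm):
    map an app id to a cluster index deterministically."""
    digest = hashlib.sha256(app_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and reduce it
    # to the range [0, num_clusters).
    return int.from_bytes(digest[:8], "big") % num_clusters


num_clusters = 3  # illustrative value

# The same identifier always maps to the same cluster index.
first_choice = pick_cluster("CUBE_APP_acme", num_clusters)
second_choice = pick_cluster("CUBE_APP_acme", num_clusters)
print(first_choice == second_choice)
```

In this toy model, a tenant whose identifier never changes is served by a single cluster, so only that cluster needs to hold the tenant's compiled data model.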

The following implementation makes sure that all requests from a particular tenant are always routed to the same production cluster. This approach ensures that only one production cluster keeps the compiled data model cache for each tenant and serves its requests, which reduces the footprint of the compiled data model cache on individual production clusters.

Python

from cube import config
 
@config('context_to_app_id')
def context_to_app_id(ctx: dict) -> str:
  return f"CUBE_APP_{ctx['securityContext']['tenant_id']}"

If your implementation of context_to_app_id returns identifiers that change over time for a given tenant, requests from that tenant are likely to hit multiple production clusters, and you lose the benefit of the reduced memory footprint. You might also see 502 or timeout errors if different cluster nodes return different context_to_app_id results for the same request.
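The difference between a stable and an unstable identifier can be illustrated with a minimal standalone sketch. It deliberately avoids the `cube` import so that it runs anywhere; the helper names `stable_app_id` and `unstable_app_id` and the `issued_at` field are hypothetical, and only `securityContext` and `tenant_id` are taken from the example above.

```python
# Hypothetical helpers for illustration; not part of the Cube API.

def stable_app_id(ctx: dict) -> str:
    # Depends only on the tenant, so it is identical across requests.
    return f"CUBE_APP_{ctx['securityContext']['tenant_id']}"


def unstable_app_id(ctx: dict) -> str:
    # Anti-pattern: mixes in a per-request value (an assumed
    # `issued_at` timestamp), so the identifier changes over time.
    security_context = ctx['securityContext']
    return f"CUBE_APP_{security_context['tenant_id']}_{security_context['issued_at']}"


# Two requests from the same tenant, issued at different times.
first = {'securityContext': {'tenant_id': 'acme', 'issued_at': 1700000000}}
second = {'securityContext': {'tenant_id': 'acme', 'issued_at': 1700000060}}

stable_ids = {stable_app_id(first), stable_app_id(second)}
unstable_ids = {unstable_app_id(first), unstable_app_id(second)}

print(len(stable_ids))    # number of distinct ids from the stable version
print(len(unstable_ids))  # number of distinct ids from the unstable version
```

The stable version yields a single identifier for the tenant, while the unstable version yields a new identifier for each request, which is the situation the paragraph above warns against.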

Switching between deployment types

To switch a deployment's type, go to the deployment's Settings screen and select from the available options:

Cube Cloud Deployment Settings page showing Development Instance, Production Cluster, and Production Multi-Cluster options
