Auto-suspension is available in Cube Cloud on Starter and above tiers.
Cube Cloud can automatically suspend deployments that are not in use, so idle infrastructure does not consume resources. This helps manage spend and prevents unnecessary quota use.
This is useful for deployments that are not used 24/7, such as staging deployments. Auto-suspension hibernates the deployment when no API requests have been received for a period of time, and automatically resumes it when API requests start coming in again.
Development Instances are auto-suspended after 10 minutes without use, whereas Production Clusters and Production Multi-Clusters can auto-suspend after receiving no API requests within a configurable time period. While a deployment is suspended, its pre-aggregation builds are also paused to prevent unnecessary resource consumption.
During auto-suspension, resources are monitored in 5-minute intervals. This means that if a deployment was suspended 4 minutes ago and a request comes in, the deployment will resume immediately and 5 minutes of CCU usage will be billed.
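The effect of the 5-minute interval on usage can be sketched with a little arithmetic. The snippet below is only an illustration under a simplifying assumption that is not stated in the text: that active time is rounded up to whole monitoring intervals. The function name `billed_minutes` is hypothetical, and actual CCU metering may differ.

```python
import math

INTERVAL_MINUTES = 5  # monitoring interval stated in the text


def billed_minutes(active_minutes):
    """Round active time up to whole monitoring intervals.

    Assumption for illustration only: any partially used interval
    counts as a full interval.
    """
    if active_minutes <= 0:
        return 0
    return math.ceil(active_minutes / INTERVAL_MINUTES) * INTERVAL_MINUTES
```

Under this assumption, a deployment that resumes and serves requests for a single minute is counted the same as one that stays active for the full five.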
To configure auto-suspension settings, navigate to the Settings screen in your deployment and click the Configuration tab, then ensure Enable Auto-suspend is turned on.
To configure how long Cube Cloud should wait before suspending the deployment, adjust Auto-suspend threshold (minutes) to the desired value and click Apply.
The Cube API instances will temporarily become unavailable while the new configuration is applied; this usually takes less than a minute.
To resume a suspended deployment, send a query to Cube using the API or navigate to the deployment in Cube Cloud.
Currently, Cube Cloud's auto-suspension feature cannot guarantee that a deployment resumes on the first query or within a specific time frame. In most cases, a deployment resumes within several seconds of the first query, but it may take longer, and the initial query may then return an error response code.
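Because the first query after a suspension may fail, a client can retry it with a growing delay. The following is a minimal sketch, not an official client: `retry_with_backoff` and `load` are hypothetical helper names, the attempt count and delays are arbitrary, and the sketch assumes the deployment is queried through the REST API `/cubejs-api/v1/load` endpoint with the token passed in the `Authorization` header.

```python
import json
import time
import urllib.request


def retry_with_backoff(send, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds, waiting longer after each failure."""
    for attempt in range(attempts):
        try:
            return send()
        except OSError:
            # urllib's HTTPError, URLError and socket timeouts all derive from OSError.
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...


def load(base_url, token, query):
    """POST a query to the load endpoint and return the parsed JSON body."""
    request = urllib.request.Request(
        f"{base_url}/cubejs-api/v1/load",
        data=json.dumps({"query": query}).encode("utf-8"),
        headers={"Authorization": token, "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

A call might look like `retry_with_backoff(lambda: load(BASE_URL, TOKEN, {"measures": ["orders.count"]}))`, where `BASE_URL`, `TOKEN`, and the `orders.count` measure are placeholders. This sketch retries every HTTP error, including ones unrelated to resuming (such as an invalid token), so a real client would likely narrow the retry condition.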
Deployments typically resume in under 30 seconds, but can take significantly longer in certain situations depending on two major factors:
- Data model: How many cubes and views are defined.
- Query complexity: How complicated the queries being sent to the API are.
Complex data models take more time to compile, and complex queries can cause response times to be significantly longer than usual.