Google BigQuery

Prerequisites

To connect Google BigQuery to Cube, you need to provide service account credentials. Cube requires the service account to have the BigQuery Data Viewer and BigQuery Job User roles. You can learn more about acquiring Google BigQuery credentials here.
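
If you manage IAM with the gcloud CLI, granting these two roles might look like the following sketch. The project ID and service account email below are placeholders; substitute your own values:

# Grant the BigQuery Data Viewer role to the service account
gcloud projects add-iam-policy-binding my-bigquery-project-12345 \
  --member="serviceAccount:cube@my-bigquery-project-12345.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataViewer"

# Grant the BigQuery Job User role to the same service account
gcloud projects add-iam-policy-binding my-bigquery-project-12345 \
  --member="serviceAccount:cube@my-bigquery-project-12345.iam.gserviceaccount.com" \
  --role="roles/bigquery.jobUser"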

Setup

Manual

Add the following to a .env file in your Cube project:

CUBEJS_DB_TYPE=bigquery
CUBEJS_DB_BQ_PROJECT_ID=my-bigquery-project-12345
CUBEJS_DB_BQ_KEY_FILE=/path/to/my/keyfile.json

Alternatively, you can encode the key file with Base64 and set the result as CUBEJS_DB_BQ_CREDENTIALS:

CUBEJS_DB_BQ_CREDENTIALS=$(cat /path/to/my/keyfile.json | base64)
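
Note that $(...) is shell command substitution, so the line above works when you export the variable from a shell. A plain .env file is not processed by a shell, so there you would paste the encoded string itself (the value below is a truncated placeholder):

CUBEJS_DB_BQ_CREDENTIALS=ewogICJ0eXBlIjogInNlcnZpY2VfYWNjb3VudCIs...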

Cube Cloud

ℹ️ Allowing connections from the Cube Cloud IP

In some cases you'll need to allow connections from your Cube Cloud deployment IP address to your database. You can copy the IP address from either the Database Setup step in deployment creation, or from Settings → Configuration in your deployment.

The following fields are required when creating a BigQuery connection:

[Screenshot: Cube Cloud BigQuery configuration screen]

Cube Cloud also supports connecting to data sources within private VPCs. If you already have VPCs enabled in your account, check out the VPC documentation to learn how to get started.

VPC connectivity is available in Cube Cloud on the Premium tier. Contact us for details.

Environment Variables

| Environment Variable | Description | Possible Values | Required |
| --- | --- | --- | --- |
| CUBEJS_DB_BQ_PROJECT_ID | The Google BigQuery project ID to connect to | A valid Google BigQuery project ID | ✅ |
| CUBEJS_DB_BQ_KEY_FILE | The path to a JSON key file for connecting to Google BigQuery | A valid Google BigQuery JSON key file | ✅ |
| CUBEJS_DB_BQ_CREDENTIALS | A Base64-encoded JSON key file for connecting to Google BigQuery | A valid Google BigQuery JSON key file encoded as a Base64 string | ❌ |
| CUBEJS_DB_BQ_LOCATION | The Google BigQuery dataset location to connect to. Required when using pre-aggregations outside of the US; if not set, the BigQuery driver fails with a "Dataset was not found in location US" error | A valid Google BigQuery regional location | ⚠️ |
| CUBEJS_DB_EXPORT_BUCKET | The name of a bucket in cloud storage | A valid bucket name from cloud storage | ❌ |
| CUBEJS_DB_EXPORT_BUCKET_TYPE | The cloud provider where the bucket is hosted | gcp | ❌ |
| CUBEJS_CONCURRENCY | The number of concurrent connections each queue has to the database. Default is 10 | A valid number | ❌ |
| CUBEJS_DB_MAX_POOL | The maximum number of concurrent database connections to pool. Default is 40 | A valid number | ❌ |
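
For example, if your datasets live in the London region, the location would be set like so (the region value is illustrative):

CUBEJS_DB_BQ_LOCATION=europe-west2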

Pre-Aggregation Feature Support

count_distinct_approx

Measures of type count_distinct_approx can be used in pre-aggregations when using Google BigQuery as a source database. To learn more about Google BigQuery's support for approximate aggregate functions, click here.
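
As a minimal sketch, a data model using such a measure in a pre-aggregation might look like this. The cube, table, and member names are illustrative:

cube(`orders`, {
  sql_table: `my_dataset.orders`,

  measures: {
    // An approximate distinct count; BigQuery supports this
    // measure type inside pre-aggregations
    unique_users: {
      sql: `user_id`,
      type: `count_distinct_approx`,
    },
  },

  dimensions: {
    created_at: {
      sql: `created_at`,
      type: `time`,
    },
  },

  pre_aggregations: {
    // Rolls the measure up by day; Cube can serve queries
    // on unique_users from this rollup
    unique_users_by_day: {
      measures: [CUBE.unique_users],
      time_dimension: CUBE.created_at,
      granularity: `day`,
    },
  },
});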

Pre-Aggregation Build Strategies

To learn more about pre-aggregation build strategies, head here.

| Feature | Works with read-only mode? | Is default? |
| --- | --- | --- |
| Batching | ✅ Yes | ✅ Yes |
| Export Bucket | ❌ No | ❌ No |

By default, Google BigQuery uses batching to build pre-aggregations.

Batching

No extra configuration is required to use batching with Google BigQuery.

Export bucket

BigQuery only supports using Google Cloud Storage for export buckets.

Google Cloud Storage

For improved pre-aggregation performance with large datasets, enable export bucket functionality by configuring Cube with the following environment variables:

CUBEJS_DB_EXPORT_BUCKET=export_data_58148478376
CUBEJS_DB_EXPORT_BUCKET_TYPE=gcp

When using an export bucket, remember to assign the BigQuery Data Editor and Storage Object Admin roles to your BigQuery service account.
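
As a sketch, granting those roles with Google Cloud's CLI tools might look like this. The project ID, service account email, and bucket name below are placeholders:

# Grant BigQuery Data Editor on the project
gcloud projects add-iam-policy-binding my-bigquery-project-12345 \
  --member="serviceAccount:cube@my-bigquery-project-12345.iam.gserviceaccount.com" \
  --role="roles/bigquery.dataEditor"

# Grant Storage Object Admin on the export bucket
gsutil iam ch \
  "serviceAccount:cube@my-bigquery-project-12345.iam.gserviceaccount.com:roles/storage.objectAdmin" \
  gs://export_data_58148478376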

SSL

Cube does not require any additional configuration to enable SSL, as Google BigQuery connections are made over HTTPS.