DuckDB

DuckDB is an in-process SQL OLAP database management system that can query data in CSV, JSON, and Parquet formats from AWS S3-compatible blob storage. This means you can query data stored in AWS S3, Google Cloud Storage, or Cloudflare R2. You can also set the CUBEJS_DB_DUCKDB_DATABASE_PATH environment variable to connect to a local DuckDB database.

Cube can also connect to MotherDuck, a cloud-based serverless analytics platform built on DuckDB. When connected to MotherDuck, DuckDB uses hybrid execution and routes queries to S3 through MotherDuck for better performance.

Prerequisites

  • A set of IAM credentials which allow access to the S3-compatible data source. Credentials are only required for private S3 buckets.
  • The region of the bucket
  • The name of a bucket to query data from

Setup

Manual

Add the following to a .env file in your Cube project:

CUBEJS_DB_TYPE=duckdb
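
For a private S3 bucket, the .env file would presumably also carry the credentials and region listed under Prerequisites. The following is a minimal sketch that uses only variable names from the Environment Variables table on this page; every value is a placeholder, and the region shown is an arbitrary example.

```shell
# .env (sketch; every value below is a placeholder)
CUBEJS_DB_TYPE=duckdb
# IAM credentials, only needed for private S3 buckets
CUBEJS_DB_DUCKDB_S3_ACCESS_KEY_ID=your-access-key-id
CUBEJS_DB_DUCKDB_S3_SECRET_ACCESS_KEY=your-secret-access-key
# Region of the bucket (example region, replace with your own)
CUBEJS_DB_DUCKDB_S3_REGION=us-east-1
```

For a public bucket, the two credential lines can be left out.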

Cube Cloud

In Cube Cloud, select DuckDB when creating a new deployment and fill in the required fields:

[Screenshot: Cube Cloud DuckDB configuration screen]

If you are not using MotherDuck, leave the MotherDuck Token field blank.

You can also explore how DuckDB works with Cube if you create a demo deployment in Cube Cloud.

Environment Variables

| Environment Variable | Description | Possible Values | Required |
| --- | --- | --- | --- |
| CUBEJS_DB_DUCKDB_MEMORY_LIMIT | The maximum memory limit for DuckDB. Equivalent to SET memory_limit=<MEMORY_LIMIT>. Default is 75% of available RAM. | A valid memory limit | |
| CUBEJS_DB_DUCKDB_SCHEMA | The default search schema | A valid schema name | |
| CUBEJS_DB_DUCKDB_MOTHERDUCK_TOKEN | The service token to use for connections to MotherDuck | A valid MotherDuck service token | |
| CUBEJS_DB_DUCKDB_DATABASE_PATH | The database file path to use for connecting to a local database | A valid DuckDB database file path | |
| CUBEJS_DB_DUCKDB_S3_ACCESS_KEY_ID | The Access Key ID to use for database connections | A valid Access Key ID | |
| CUBEJS_DB_DUCKDB_S3_SECRET_ACCESS_KEY | The Secret Access Key to use for database connections | A valid Secret Access Key | |
| CUBEJS_DB_DUCKDB_S3_ENDPOINT | The S3 endpoint | A valid S3 endpoint | |
| CUBEJS_DB_DUCKDB_S3_REGION | The region of the bucket | A valid AWS region | |
| CUBEJS_CONCURRENCY | The number of concurrent connections each queue has to the database. Default is 2. | A valid number | |
| CUBEJS_DB_DUCKDB_S3_USE_SSL | Whether to use SSL for the connection | A boolean | |
| CUBEJS_DB_DUCKDB_S3_URL_STYLE | The S3 URL style to use | 'vhost' or 'path' | |
| CUBEJS_DB_DUCKDB_S3_SESSION_TOKEN | The token for the S3 session | A valid Session Token | |
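
To illustrate how several of these variables might combine, here is a hedged sketch of a .env file that points DuckDB at an S3-compatible service other than AWS S3. The endpoint host s3.example.com is hypothetical, the choice of path-style URLs is an assumption to check against your storage provider, and the 4GB memory limit is an arbitrary illustrative value.

```shell
# .env (sketch; endpoint, limits, and credentials are placeholders)
CUBEJS_DB_TYPE=duckdb
CUBEJS_DB_DUCKDB_S3_ACCESS_KEY_ID=your-access-key-id
CUBEJS_DB_DUCKDB_S3_SECRET_ACCESS_KEY=your-secret-access-key
# Hypothetical endpoint of an S3-compatible service
CUBEJS_DB_DUCKDB_S3_ENDPOINT=s3.example.com
# 'vhost' or 'path'; which one applies depends on the provider (assumption)
CUBEJS_DB_DUCKDB_S3_URL_STYLE=path
CUBEJS_DB_DUCKDB_S3_USE_SSL=true
# Cap DuckDB memory instead of the 75%-of-RAM default (illustrative value)
CUBEJS_DB_DUCKDB_MEMORY_LIMIT=4GB

# Other connection targets described on this page (placeholder values):
# CUBEJS_DB_DUCKDB_MOTHERDUCK_TOKEN=your-motherduck-service-token
# CUBEJS_DB_DUCKDB_DATABASE_PATH=/path/to/local.duckdb
```

Leaving CUBEJS_CONCURRENCY unset keeps the documented default of 2 concurrent connections per queue.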

Pre-Aggregation Feature Support

count_distinct_approx

Measures of type count_distinct_approx can be used in pre-aggregations when using DuckDB as a source database. To learn more about DuckDB's support for approximate aggregate functions, see the DuckDB documentation.

Pre-Aggregation Build Strategies

To learn more about pre-aggregation build strategies, see Cube's pre-aggregation documentation.

| Feature | Works with read-only mode? | Is default? |
| --- | --- | --- |
| Batching | | Yes |
| Export Bucket | - | - |

By default, DuckDB uses a batching strategy to build pre-aggregations.

Batching

Batching requires no extra configuration for DuckDB.

Export Bucket

DuckDB does not support export buckets.

SSL

Cube does not require any additional configuration to enable SSL as DuckDB connections are made over HTTPS.