Documentation
Data Modeling
Advanced
Dynamic data models with Jinja and Python

Dynamic data models with Jinja and Python

Cube supports authoring data models using the Jinja templating language (opens in a new tab). This allows de-duplicating common patterns in your data models as well as dynamically generating data models from a remote data source.

This feature is in Public Preview and is subject to change. We'd love to hear your feedback on this feature, please post an issue on GitHub.

To use a Jinja template, create a file in your data model folder with the .jinja extension. For example, a file containing the orders cube could be named orders.yml.jinja under the models/ directory.

Loops

Jinja supports looping (opens in a new tab) over lists and dictionaries. In the following example, we loop over a list of nested properties and generate a LEFT JOIN UNNEST clause for each one: for each one:

{%- set nested_properties = [
  "referrer",
  "href",
  "host",
  "pathname",
  "search"
] -%}
 
cubes:
  - name: analytics
    sql: >
      SELECT
      {%- for prop in nested_properties %}
        {{ prop }}_prop.value AS {{ prop }}
      {%- endfor %}
      FROM public.events
      {%- for prop in nested_properties %}
      LEFT JOIN UNNEST(properties) AS {{ prop }}_prop ON {{ prop }}_prop.key = '{{ prop }}'
      {%- endfor %}

Another useful pattern is to loop over a dictionary of values and generate a measure for each one, as in the following example:

{%- set metrics = {
  "mau": 30,
  "wau": 7,
  "day": 1
} %}
 
cubes:
  - name: orders
    sql_table: public.orders
 
    measures:
      {%- for name, days in metrics.items() %}
      - name: {{ name }}
        type: count_distinct
        sql: user_id
        rolling_window:
          trailing: {{ days }} day
          offset: start
      {% endfor %}

Macros

Cube data models also support Jinja macros, which allow you to define reusable snippets of code. You can read more about macros in the Jinja documentation (opens in a new tab).

In the following example, we define a macro called dimension() which generates a dimension definition in Cube. This macro is then invoked multiple times to generate multiple dimensions:

{# Declare the macro before using it, otherwise Jinja will throw an error. #}
{%- macro dimension(column_name, type='string', primary_key=False) -%}
      - name: {{ column_name }}
        sql: {{ column_name }}
        type: {{ type }}
        {% if primary_key -%}
        primary_key: true
        {% endif -%}
{% endmacro -%}
 
cubes:
  - name: orders
    sql_table: public.orders
 
    dimensions:
      {{ dimension('id', 'number', primary_key=True) }}
      {{ dimension('status') }}
      {{ dimension('created_at', 'time') }}
      {{ dimension('completed_at', 'time') }}

You could also use macros to generate SQL snippets for use in the sql property:

{%- macro cents_to_dollars(column_name, precision=2) -%}
  ({{ column_name }} / 100)::NUMERIC(16, {{ precision }})
{%- endmacro -%}
 
cubes:
  - name: payments
    sql: >
      SELECT
        id AS payment_id,
        {{ cents_to_dollars('amount') }} AS amount_usd
      FROM app_data.payments

Calling Python functions

You can declare and invoke Python functions from within a Jinja template. This allows the reuse of existing code to generate data models.

In the following example, we'll create a new file called utils.py, and within it, declare a function called load_data() which will load data from a remote API endpoint. We will then use the function to generate a data model in Cube.

# Note that we decorate this function with the `context_func` decorator.
# This is required in order for data models to be able to call this function.
@context_func
def load_data():
   client = MyApiClient("example.com")
   return client.load_data()
 
 
class MyApiClient:
  def __init__(self, api_url):
    self.api_url = api_url
 
  # mock API call
  def load_data(self):
    api_response = {
      "cubes": [
        {
          "name": "cube_from_api",
          "measures": [
            { "name": "count", "type": "count" },
            { "name": "total", "type": "sum", "sql": "amount" }
          ],
          "dimensions": []
        },
        {
          "name": "cube_from_api_with_dimensions",
          "measures": [
            { "name": "active_users", "type": "count_distinct", "sql": "user_id" }
          ],
          "dimensions": [
            { "name": "city", "sql": "city_column", "type": "string" }
          ]
        }
      ]
    }
    return api_response

Now that we've decorated our function with the @context_func decorator, we can call it from within a Jinja template. In the following example, we'll call the load_data() function and use the result to generate a data model.

cubes:
  {# Here we use the decorated function from earlier #}
  {%- for cube in load_data()["cubes"] %}
 
  - name: {{ cube.name }}
 
  {%- if cube.measures is not none and cube.measures|length > 0 %}
    measures:
      {%- for measure in cube.measures %}
      - name: {{ measure.name }}
        type: {{ measure.type }}
      {%- if measure.sql %}
        sql: {{ measure.sql }}
      {%- endif %}
      {%- endfor %}
  {%- endif %}
 
  {%- if cube.dimensions is not none and cube.dimensions|length > 0 %}
    dimensions:
      {%- for dimension in cube.dimensions %}
      - name: {{ dimension.name }}
        type: {{ dimension.type }}
        sql: {{ dimension.sql }}
      {%- endfor %}
  {%- endif %}
  {%- endfor %}