Guides
Data modeling
Using dynamic union tables

Using dynamic union tables

Use case

Sometimes, you may have a lot of tables in a database, which actually relate to the same entity.

For example, you can have “per client” tables with the same data, but related to different customers: elon_musk_table, john_doe_table, steve_jobs_table, etc. In this case, it would make sense to create a single cube for customers, which should be backed by a union table from all customers tables.

Data modeling

You can use the sql parameter to define a cube over an arbitrary SQL query, e.g., a query that includes UNION or UNION ALL operators:

YAML
JavaScript
cubes:
  - name: customers
    sql: >
      SELECT *, 'Einstein' AS name FROM einstein_data UNION ALL
      SELECT *, 'Pascal'   AS name FROM pascal_data   UNION ALL
      SELECT *, 'Newton'   AS name FROM newton_data
  
    measures:
      - name: count
        type: count
  
    dimensions:
      - name: name
        sql: name
        type: string
 
 
 
 
 

However, it can be quite annoying to write the SQL to union all tables manually. Luckily, you can use dynamic data modeling to generate necessary SQL based on a list of tables:

YAML
JavaScript
{%- set customer_tables = {
  "einstein_data": "Einstein",
  "pascal_data": "Pascal",
  "newton_data": "Newton"
} -%}
 
cubes:
  - name: customers
    sql: >
      {%- for table, name in customer_tables | items %}
      SELECT *, '{{ name | safe }}' AS name FROM {{ table | safe }}
      {% if not loop.last %}UNION ALL{% endif %}
      {% endfor %}
  
    measures:
      - name: count
        type: count
  
    dimensions:
      - name: name
        sql: name
        type: string
 
 
 
 
 

Result

Querying count and name members of the dynamically defined customers cube would result in the following generated SQL:

SELECT
  "customers".name "customers__name",
  count(*) "customers__count"
FROM
  (
    SELECT
      *,
      'Einstein' AS name
    FROM
      einstein_data
    UNION ALL
    SELECT
      *,
      'Pascal' AS name
    FROM
      pascal_data
    UNION ALL
    SELECT
      *,
      'Newton' AS name
    FROM
      newton_data
  ) AS "customers"
GROUP BY
  1
ORDER BY
  2 DESC