Working around string time dimensions

Cube always expects a timestamp with timezone (or compatible type) as an input to the time dimension.

However, datetime information in the underlying table is often stored as a string. Most SQL databases provide datetime parsing functions that convert strings to timestamps. Consider an example cube for BigQuery:

  cubes:
    - name: events
      sql_table: events  # assumed table name; not shown in the original snippet
      dimensions:
        - name: date
          sql: PARSE_TIMESTAMP('%Y-%m-%d', date)
          type: time

In this particular cube, the date column will be parsed using the %Y-%m-%d format.

Note that because no time zone argument is passed to PARSE_TIMESTAMP, it uses UTC by default. You should always set the time zone appropriately for parsed timestamps, because Cube always performs time zone conversions according to user settings.
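
If the stored strings represent local dates rather than UTC, the time zone can be passed as the optional third argument of PARSE_TIMESTAMP. The following is a minimal sketch, not a definitive data model: the table name and the `America/Los_Angeles` zone are assumptions chosen for illustration.

```yaml
cubes:
  - name: events
    sql_table: events  # hypothetical table name

    dimensions:
      - name: date
        # The third argument tells BigQuery which time zone the string is in;
        # 'America/Los_Angeles' is an assumed example value.
        sql: PARSE_TIMESTAMP('%Y-%m-%d', date, 'America/Los_Angeles')
        type: time
```

With this definition, the string `2024-01-15` would be read as midnight Pacific time rather than midnight UTC, so Cube's later time zone conversions start from the intended instant.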

Although query performance on big data backends like BigQuery or Presto is unlikely to suffer from date parsing, performance on RDBMS backends like Postgres most likely will, because parsing the string in every query prevents the database from using an ordinary index on that column. In this case, you should strongly consider adding indexed timestamp columns or transforming the data upstream.
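
For Postgres, one possible shape of that approach is sketched below. It assumes a one-time upstream change that adds a parsed, indexed column; the column name `date_ts`, the index name, and the table name are all hypothetical, and the DDL appears only as comments because it runs outside of Cube.

```yaml
# Assumed upstream change in Postgres, run once outside of Cube:
#   ALTER TABLE events ADD COLUMN date_ts timestamptz;
#   UPDATE events SET date_ts = to_timestamp(date, 'YYYY-MM-DD');
#   CREATE INDEX events_date_ts_idx ON events (date_ts);
# Note: to_timestamp interprets the string in the session time zone,
# so set that appropriately before running the UPDATE.
cubes:
  - name: events
    sql_table: events  # hypothetical table name

    dimensions:
      - name: date
        sql: date_ts  # hypothetical pre-parsed, indexed timestamp column
        type: time
```

The time dimension now reads a native timestamp column, so date range filters can use the index instead of parsing every row at query time.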