AI API

The AI API provides a standard interface for interacting with large language models (LLMs) as a turnkey solution for text-to-semantic layer queries.

Specifically, you can send the AI API a message (or conversation of messages) and it will return a Cube REST API query. Optionally, it will also run the query and return the results.

The AI API is available on Cube Cloud only. It is currently in preview and should not be used for production workloads. Please contact your Cube representative to have it enabled for your account.

See the AI API reference for the list of supported API endpoints.

Configuration

While the AI API is in preview, your Cube account team will enable and configure it for you.

If you wish to enable or disable the AI API on a specific Cube deployment, you may do so by going to "Settings" in the Cube Cloud sidebar, then "Configuration", and toggling the "AI API" configuration flag.

To find your AI API endpoint in Cube Cloud, go to the Overview page, click API credentials, and choose the AI API tab.

Getting Started

Data modeling

The AI API currently requires views in order to generate queries. This is because:

  1. Views let you create carefully curated datasets, resulting in better outputs from LLMs. That is, you can choose exactly what is "ready" for the AI to see and what is not.
  2. Views define deterministic joins between cubes, so the LLM does not have to "guess" at join ordering.

To use the AI API, set up one or more views before getting started.
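For example, a minimal view matching the queries shown later in this document might look like the following. This is a sketch: it assumes an orders cube with an average_order_value measure and a created_at time dimension, plus a joined users cube with a city dimension.

views:
  - name: orders_view

    cubes:
      - join_path: orders
        includes:
          - average_order_value
          - created_at

      - join_path: orders.users
        prefix: true
        includes:
          - city

With prefix: true, the included users members are exposed as, for example, users_city, matching the orders_view.users_city member used in the example response below.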

By default, the AI API syncs data model changes hourly. To manually trigger a sync, go to "Settings" in the Cube Cloud sidebar, then "Data Catalog Services", then hit "Sync" on the Cube connection.

Authentication

Authentication works the same as for the REST API.

The API token is passed via the Authorization header. The token itself is a JSON Web Token; the Security section describes how to generate it.

Example request

Given the data model from the "data modeling" section above, you could send a request with the following body:

{
  "messages": [
    {
      "role": "user",
      "content": "Where do we have the highest aov this year?"
    }
  ]
}
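As a sketch, here is one way to send this request with fetch in TypeScript. The endpoint URL and token below are placeholders: use the AI API endpoint and JWT from your deployment's API credentials (see Configuration above).

// Placeholders -- substitute the AI API endpoint and JWT from your
// deployment's API credentials; the path below is illustrative.
const AI_API_URL = "https://example.cubecloud.dev/ai/v1/completion";
const CUBE_API_TOKEN = "<your JSON Web Token>";

const response = await fetch(AI_API_URL, {
  method: "POST",
  headers: {
    Authorization: CUBE_API_TOKEN, // JWT in the Authorization header
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    messages: [
      { role: "user", content: "Where do we have the highest aov this year?" },
    ],
  }),
});

const rawBody = await response.text();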

Based on the view(s) provided, the AI API generates a Cube REST API request that could be used to answer the user's question. For example, you might receive the following response:

{
  "message": "To find where we have the highest Average Order Value (AOV) this year, we can analyze the data by comparing the AOV across different dimensions such as cities or states.",
  "cube_query": {
    "measures": ["orders_view.average_order_value"],
    "dimensions": ["orders_view.users_city"],
    "timeDimensions": [
      {
        "dimension": "orders_view.created_at",
        "dateRange": "this year"
      }
    ],
    "order": {
      "orders_view.average_order_value": "desc"
    },
    "limit": 10
  }
}

See Running queries below for details on how to run the generated Cube query.

Running queries

There are two ways to run the query:

1. runQuery parameter

Use the runQuery request parameter to have the AI API run the query and report results back. When doing this, the request above would become:

{
  "messages": [
    {
      "role": "user",
      "content": "Where do we have the highest aov this year?"
    }
  ],
  "runQuery": true
}

The response will be the same as above, followed by a second JSON object containing the query results (see the REST API reference for its format).

Note that the response now contains two JSON objects separated by a newline (\n). You are responsible for parsing these appropriately.
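For example, a minimal parsing sketch in TypeScript, assuming each JSON object is serialized on a single line as described above:

// Split the newline-delimited response body into its JSON objects.
// Assumes each object is serialized on a single line.
function parseAiApiResponse(rawBody: string): unknown[] {
  return rawBody
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line));
}

// Usage: the AI message first, then the query result.
const [aiMessage, queryResult] = parseAiApiResponse(
  '{"message":"...","cube_query":{}}\n{"data":[]}'
);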

2. /load

Alternatively, you may take the generated cube_query from the response and pass it in the request body of a call to the REST API /load endpoint. This is recommended for advanced use cases where you need more control over formatting, pagination, and so on, or where you are adding the AI API to an existing Cube REST API implementation.
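For instance, a sketch of forwarding the generated query to /load, reusing CUBE_API_TOKEN from the earlier sketch (the base URL is a placeholder; the REST API expects the query under a query key in the POST body):

// Placeholder base URL -- use your deployment's REST API endpoint.
const REST_API_URL = "https://example.cubecloud.dev/cubejs-api/v1/load";

// `cubeQuery` is the cube_query object from the AI API response.
async function runCubeQuery(cubeQuery: object): Promise<unknown> {
  const response = await fetch(REST_API_URL, {
    method: "POST",
    headers: {
      Authorization: CUBE_API_TOKEN, // same JWT as for the AI API request
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ query: cubeQuery }),
  });
  return response.json();
}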

Error Handling

Occasionally you may encounter errors. There are a few common categories of errors:

1. Cannot answer question

If the AI API is unable to generate a query because the view(s) in your data model do not have the appropriate fields to answer the question, you will receive a message like the following, and no cube_query in the response:

{
  "message": "I'm sorry, but the current data modeling doesn't cover stock prices or specific company data like NVDA. I will notify the data engineering team about this request."
}

2. Invalid query

Occasionally, the AI API may generate a query that is invalid or cannot be run. When this happens, you will receive an error upon running the query.

One way of handling this is to pass the error message back into the AI API; it may then self-correct and provide a new, valid query.
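A sketch of that pattern, reusing the request shape from the example above (appending the error as a follow-up user message is an illustrative convention, not a documented contract):

// Example error text -- in practice, use the error returned when
// running the query.
const errorMessage =
  "Error: member 'orders_view.average_order_value' not found";

const followUp = {
  messages: [
    { role: "user", content: "Where do we have the highest aov this year?" },
    {
      role: "user",
      content: `The generated query failed with this error, please correct it: ${errorMessage}`,
    },
  ],
};
// POST `followUp` to the AI API exactly as in the example request above.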

3. Continue wait

When using "runQuery": true, you might sometimes receive a query result containing { "error": "Continue wait" }. If this happens, you should use /load (described above) instead of runQuery to run the query, and handle retries as described in the REST API documentation.
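A minimal retry sketch, reusing runCubeQuery from the /load example above with a simple fixed delay (tune the attempt count and delay to your workload):

// Retry /load until the result is no longer "Continue wait".
async function loadWithRetries(
  cubeQuery: object,
  maxAttempts = 10
): Promise<unknown> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const result = (await runCubeQuery(cubeQuery)) as { error?: string };
    if (result.error !== "Continue wait") {
      return result;
    }
    // Simple fixed delay between attempts.
    await new Promise((resolve) => setTimeout(resolve, 1000));
  }
  throw new Error("Query did not complete within the retry budget");
}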

Advanced Usage

The advanced features discussed here are available on Cube version 1.1.7 and above.

Custom prompts

You can prompt the AI API with custom instructions. For example, you may want it to always respond in a particular language, or to refer to itself by a name matching your brand. Custom prompts also allow you to give the model more context on your company and data model, for example if it should usually prefer a particular view.

To use a custom prompt, set the CUBE_CLOUD_AI_API_PROMPT environment variable in your deployment.

Custom prompts add to, rather than overwrite, the AI API's existing prompting, so you do not need to rewrite instructions around how to generate the query itself.
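For example, a purely illustrative prompt (the name and wording are hypothetical):

CUBE_CLOUD_AI_API_PROMPT="You are Acme's data assistant. Always respond in English, and prefer orders_view when more than one view could answer a question."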

Meta tags

The AI API can read meta tags on your dimensions, measures, segments, and views.

Use the ai meta tag to give context that is specific to AI and goes beyond what is included in the description. This can have any keys that you want. For example, you can use it to give the AI context on possible values in a categorical dimension:

      - name: status
        sql: status
        type: string
        meta:
          ai:
            values:
              - shipped
              - processing
              - completed

Other LLM providers

If desired, you may "bring your own" LLM by providing a model and API credentials for a supported model provider. Do this by setting environment variables in your Cube deployment. The variables required by each provider are listed below (all are required unless noted otherwise).

These environment variables also apply to the AI Assistant, if it is enabled on your deployment.

AWS Bedrock

The AI API currently supports only Anthropic Claude models on AWS Bedrock. Other models may work but are not fully supported.

  • CUBE_BEDROCK_MODEL_ID - A supported AWS Bedrock chat model, for example anthropic.claude-3-5-sonnet-20241022-v2:0
  • CUBE_BEDROCK_ACCESS_KEY - An access key for an IAM user with InvokeModelWithResponseStream permissions on the desired region/model.
  • CUBE_BEDROCK_ACCESS_SECRET - The corresponding access secret
  • CUBE_BEDROCK_REGION_ID - A supported AWS Bedrock region, for example us-west-2

GCP Vertex

The AI API currently supports only Anthropic Claude models on GCP Vertex. Other models may work but are not fully supported.

  • CUBE_VERTEX_MODEL_ID - A supported GCP Vertex chat model, for example claude-3-5-sonnet@20240620
  • CUBE_VERTEX_PROJECT_ID - The GCP project the model is deployed in
  • CUBE_VERTEX_REGION - The GCP region the model is deployed in, for example us-east5
  • CUBE_VERTEX_CREDENTIALS - The private key for a service account with permissions to run the chosen model

OpenAI

  • OPENAI_MODEL - An OpenAI chat model ID, for example gpt-4o
  • OPENAI_API_KEY - An OpenAI API key (we recommend creating a service account for the AI API)