AI API
The AI API provides a standard interface for interacting with large language models (LLMs) as a turnkey solution for text-to-semantic layer queries.
Specifically, you can send the AI API a message (or conversation of messages) and it will return a Cube REST API query. Optionally, it will also run the query and return the results.
The AI API is available on Cube Cloud only. It is currently in preview and should not be used for production workloads. Please contact your Cube representative to have it enabled for your account.
See the AI API reference for the list of supported API endpoints.
Configuration
While the AI API is in preview, your Cube account team will enable and configure it for you.
If you wish to enable or disable the AI API on a specific Cube deployment, go to "Settings" in the Cube Cloud sidebar, then "Configuration", and toggle the "AI API" switch.
To find your AI API endpoint in Cube Cloud, go to the Overview page, click API credentials, and choose the AI API tab.
Getting Started
Data modeling
The AI API currently requires views in order to generate queries. This is because:
- Views let you create carefully-curated datasets, resulting in better outputs from LLMs. That is, you can choose exactly what is "ready" for the AI to see and what is not.
- Views define deterministic joins between cubes, so the LLM does not have to "guess" at join ordering.
To use the AI API, set up one or more views before getting started.
By default, the AI API syncs data model changes hourly. To manually trigger a sync, go to "Settings" in the Cube Cloud sidebar, then "Data Catalog Services", then hit "Sync" on the Cube connection.
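As a sketch, a view like the following could back the examples on this page. The orders and users cubes and their member names are assumptions about your data model, chosen to match the example response shown below:

```yaml
views:
  - name: orders_view

    cubes:
      # Assumed cubes; include only the members that are "ready" for the AI to see.
      - join_path: orders
        includes:
          - average_order_value
          - created_at

      # prefix: true exposes members like orders_view.users_city
      - join_path: orders.users
        prefix: true
        includes:
          - city
```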
Authentication
Authentication works the same as for the REST API.
The API token is passed via the Authorization header. The token itself is a JSON Web Token; the Security section describes how to generate it.
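For illustration, a minimal sketch of generating a token in Node.js with the jsonwebtoken package, signing with your deployment's API secret (the empty payload and one-hour expiry are placeholder choices; add security context claims if your deployment uses them):

```typescript
import jwt from "jsonwebtoken";

// Sign a short-lived token with the deployment's API secret.
const token = jwt.sign({}, process.env.CUBE_API_SECRET!, { expiresIn: "1h" });

// Pass it on every AI API request via the Authorization header.
const headers = { Authorization: token, "Content-Type": "application/json" };
```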
Example request
Given the data model from the "data modeling" section above, you could send a request with the following body:
{
  "messages": [
    {
      "role": "user",
      "content": "Where do we have the highest aov this year?"
    }
  ]
}
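A sketch of sending this body with fetch; the endpoint URL is the one copied from the API credentials dialog, and the token is generated as described in the Authentication section:

```typescript
const AI_API_URL = process.env.CUBE_AI_API_URL!; // from the "AI API" tab
const token = process.env.CUBE_API_TOKEN!; // JWT, as described above

const response = await fetch(AI_API_URL, {
  method: "POST",
  headers: { Authorization: token, "Content-Type": "application/json" },
  body: JSON.stringify({
    messages: [
      { role: "user", content: "Where do we have the highest aov this year?" },
    ],
  }),
});

const result = await response.json();
```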
Based on the view(s) provided, the AI API generates a Cube REST API request that could be used to answer the user's question. For example, you might receive the following response:
{
  "message": "To find where we have the highest Average Order Value (AOV) this year, we can analyze the data by comparing the AOV across different dimensions such as cities or states.",
  "cube_query": {
    "measures": ["orders_view.average_order_value"],
    "dimensions": ["orders_view.users_city"],
    "timeDimensions": [
      {
        "dimension": "orders_view.created_at",
        "dateRange": "this year"
      }
    ],
    "order": {
      "orders_view.average_order_value": "desc"
    },
    "limit": 10
  }
}
See Running queries below for details on how to run the generated Cube query.
Running queries
You have two possible ways to run the query:

1. runQuery parameter

Use the runQuery request parameter to have the AI API run the query and report results back. When doing this, the request above would become:
{
  "messages": [
    {
      "role": "user",
      "content": "Where do we have the highest aov this year?"
    }
  ],
  "runQuery": true
}
The response will be the same as above, followed by a second JSON object containing the query results (see the REST API reference for its format). Note that the response now contains two JSON objects separated by a newline (\n). You are responsible for parsing these appropriately.
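For example, a minimal parsing sketch, assuming each of the two objects is serialized on a single line as described above:

```typescript
// rawBody is the text body of an AI API response sent with "runQuery": true.
function parseAiApiResponse(rawBody: string) {
  const [first, second] = rawBody
    .split("\n")
    .filter((line) => line.trim().length > 0);
  return {
    answer: JSON.parse(first), // message and cube_query
    results: second !== undefined ? JSON.parse(second) : undefined, // query results
  };
}
```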
2. /load

Alternatively, you may take the generated cube_query from the response and call the REST API /load endpoint with it in the request body. This is recommended for advanced use cases where you need more control over formatting, pagination, and so on, or if you are adding the AI API to an existing Cube REST API implementation.
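A sketch of that flow, reusing the token from the Authentication section; CUBE_API_URL is your deployment's REST API base URL, and /cubejs-api/v1/load is the REST API's default path:

```typescript
const CUBE_API_URL = process.env.CUBE_API_URL!; // REST API base URL

// cubeQuery is the cube_query object taken from the AI API response.
const cubeQuery = result.cube_query;

const loadResponse = await fetch(`${CUBE_API_URL}/cubejs-api/v1/load`, {
  method: "POST",
  headers: { Authorization: token, "Content-Type": "application/json" },
  body: JSON.stringify({ query: cubeQuery }),
});

const { data } = await loadResponse.json();
```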
Error Handling
Occasionally you may encounter errors. There are a few common categories of errors:
1. Cannot answer question
If the AI API is unable to generate a query because the view(s) in your data model do not have the appropriate fields to answer the question, you will receive a message like the following, and no cube_query in the response:
{
  "message": "I'm sorry, but the current data modeling doesn't cover stock prices or specific company data like NVDA. I will notify the data engineering team about this request."
}
2. Invalid query
Occasionally, the AI API may generate a query that is invalid or cannot be run. When this happens, you will receive an error upon running the query.
One way of handling this is to pass the error message back into the AI API; it may then self-correct and provide a new, valid query.
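One sketch of that pattern, assuming the messages array accepts an assistant-role echo of the previous response (the message shapes here are illustrative, not a documented contract):

```typescript
// Placeholders: the first AI API response and the error from running its query.
declare const previousAiResponse: { message: string; cube_query: object };
declare const loadError: string;

const retryRequest = {
  messages: [
    { role: "user", content: "Where do we have the highest aov this year?" },
    // Echo the assistant's previous answer back into the conversation.
    { role: "assistant", content: JSON.stringify(previousAiResponse) },
    // Then ask it to fix the query, quoting the error verbatim.
    {
      role: "user",
      content: `Running that query failed with: ${loadError}. Please correct the query.`,
    },
  ],
};
// POST retryRequest to the AI API as before and use the new cube_query.
```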
3. Continue wait
When using "runQuery": true, you might sometimes receive a query result containing { "error": "Continue wait" }. If this happens, you should use /load (described above) instead of runQuery to run the query, and handle retries as described in the REST API documentation.
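A minimal polling sketch, reusing CUBE_API_URL and token from the /load example above (the one-second delay and ten-attempt cap are arbitrary choices):

```typescript
// Poll /load until the query stops returning "Continue wait".
async function loadWithRetries(cubeQuery: object, maxAttempts = 10) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await fetch(`${CUBE_API_URL}/cubejs-api/v1/load`, {
      method: "POST",
      headers: { Authorization: token, "Content-Type": "application/json" },
      body: JSON.stringify({ query: cubeQuery }),
    });
    const body = await res.json();
    if (body.error !== "Continue wait") return body; // results or a real error
    await new Promise((resolve) => setTimeout(resolve, 1_000)); // wait, then retry
  }
  throw new Error("Query did not complete within the retry budget");
}
```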
Advanced Usage
The advanced features discussed here are available on Cube version 1.1.7 and above.
Custom prompts
You can prompt the AI API with custom instructions. For example, you may want it to always respond in a particular language, or to refer to itself by a name matching your brand. Custom prompts also allow you to give the model more context on your company and data model, for example if it should usually prefer a particular view.
To use a custom prompt, set the CUBE_CLOUD_AI_API_PROMPT environment variable in your deployment.
Custom prompts add to, rather than overwrite, the AI API's existing prompting, so you do not need to re-write instructions around how to generate the query itself.
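For example, an illustrative value (the wording is entirely up to you):

```
CUBE_CLOUD_AI_API_PROMPT="You are Acme's analytics assistant. Answer in English, and prefer orders_view when several views could answer a question."
```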
Meta tags
The AI API can read meta tags on your dimensions, measures, segments, and views.
Use the ai meta tag to give context that is specific to AI and goes beyond what is included in the description. This can have any keys that you want. For example, you can use it to give the AI context on possible values in a categorical dimension:
- name: status
  sql: status
  type: string
  meta:
    ai:
      values:
        - shipped
        - processing
        - completed
Other LLM providers
These environment variables also apply to the AI Assistant, if it is enabled on your deployment.
If desired, you may "bring your own" LLM by providing a model and API credentials for a supported model provider. Do this by setting environment variables in your Cube deployment. See below for each provider's variables (all are required unless otherwise noted):
AWS Bedrock
The AI API currently supports only Anthropic Claude models on AWS Bedrock. Other models may work but are not fully supported.
- CUBE_BEDROCK_MODEL_ID - A supported AWS Bedrock chat model, for example anthropic.claude-3-5-sonnet-20241022-v2:0
- CUBE_BEDROCK_ACCESS_KEY - An access key for an IAM user with InvokeModelWithResponseStream permissions on the desired region/model
- CUBE_BEDROCK_ACCESS_SECRET - The corresponding access secret
- CUBE_BEDROCK_REGION_ID - A supported AWS Bedrock region, for example us-west-2
GCP Vertex
The AI API currently supports only Anthropic Claude models on GCP Vertex. Other models may work but are not fully supported.
- CUBE_VERTEX_MODEL_ID - A supported GCP Vertex chat model, for example claude-3-5-sonnet@20240620
- CUBE_VERTEX_PROJECT_ID - The GCP project the model is deployed in
- CUBE_VERTEX_REGION - The GCP region the model is deployed in, for example us-east5
- CUBE_VERTEX_CREDENTIALS - The private key for a service account with permissions to run the chosen model
OpenAI
- OPENAI_MODEL - An OpenAI chat model ID, for example gpt-4o
- OPENAI_API_KEY - An OpenAI API key (we recommend creating a service account for the AI API)