AI API
The AI API provides a standard interface for interacting with large language models (LLMs) as a turnkey solution for text-to-semantic layer queries.
Specifically, you can send the AI API a message (or conversation of messages) and it will return a Cube REST API query. Optionally, it will also run the query and return the results.
The AI API is available on Cube Cloud only. It is currently in preview and should not be used for production workloads. Please contact your Cube representative to have it enabled for your account.
See the AI API reference for the list of supported API endpoints.
Configuration
While the AI API is in preview, your Cube account team will enable and configure it for you.
If you wish to enable or disable the AI API on a specific Cube deployment, go to "Settings" in the Cube Cloud sidebar, then "Configuration", and toggle the "AI API" switch.
To find your AI API endpoint in Cube Cloud, go to the Overview page, click API credentials, and choose the AI API tab.
Getting Started
Data modeling
The AI API currently requires views in order to generate queries. This is because:
- Views let you create carefully-curated datasets, resulting in better outputs from LLMs. That is, you can choose exactly what is "ready" for the AI to see and what is not.
- Views define deterministic joins between cubes, so the LLM does not have to "guess" at join ordering.
To use the AI API, set up one or more views before getting started.
By default, the AI API syncs data model changes hourly. To manually trigger a sync, go to "Settings" in the Cube Cloud sidebar, then "Data Catalog Services", then hit "Sync" on the Cube connection.
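As a sketch, a view like the following could back the examples on this page. The orders and users cubes and their member names are assumptions about your data model, chosen to match the example response shown below:

```yaml
views:
  - name: orders_view

    cubes:
      # Assumed cubes; include only the members that are "ready" for the AI to see.
      - join_path: orders
        includes:
          - average_order_value
          - created_at

      # prefix: true exposes members like orders_view.users_city
      - join_path: orders.users
        prefix: true
        includes:
          - city
```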
Authentication
Authentication works the same as for the REST API.
The API token is passed via the Authorization header. The token itself is a JSON Web Token; the Security section describes how to generate it.
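For illustration, a minimal sketch of generating a token in Node.js with the jsonwebtoken package, signing with your deployment's API secret (the empty payload and one-hour expiry are placeholder choices; add security context claims if your deployment uses them):

```typescript
import jwt from "jsonwebtoken";

// Sign a short-lived token with the deployment's API secret.
const token = jwt.sign({}, process.env.CUBE_API_SECRET!, { expiresIn: "1h" });

// Pass it on every AI API request via the Authorization header.
const headers = { Authorization: token, "Content-Type": "application/json" };
```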
Example request
Given the data model from the "data modeling" section above, you could send a request with the following body:
{
  "messages": [
    {
      "role": "user",
      "content": "Where do we have the highest aov this year?"
    }
  ]
}
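A sketch of sending this body with fetch; the endpoint URL is the one copied from the API credentials dialog, and the token is generated as described in the Authentication section:

```typescript
const AI_API_URL = process.env.CUBE_AI_API_URL!; // from the "AI API" tab
const token = process.env.CUBE_API_TOKEN!; // JWT, as described above

const response = await fetch(AI_API_URL, {
  method: "POST",
  headers: { Authorization: token, "Content-Type": "application/json" },
  body: JSON.stringify({
    messages: [
      { role: "user", content: "Where do we have the highest aov this year?" },
    ],
  }),
});

const result = await response.json();
```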
Based on the view(s) provided, the AI API generates a Cube REST API request that could be used to answer the user's question. For example, you might receive the following response:
{
  "message": "To find where we have the highest Average Order Value (AOV) this year, we can analyze the data by comparing the AOV across different dimensions such as cities or states.",
  "cube_query": {
    "measures": ["orders_view.average_order_value"],
    "dimensions": ["orders_view.users_city"],
    "timeDimensions": [
      {
        "dimension": "orders_view.created_at",
        "dateRange": "this year"
      }
    ],
    "order": {
      "orders_view.average_order_value": "desc"
    },
    "limit": 10
  }
}
See Running queries below for details on how to run the generated Cube query.
Running queries
You have two possible ways to run the query:

1. runQuery parameter

Use the runQuery request parameter to have the AI API run the query and report results back. When doing this, the request above would become:
{
  "messages": [
    {
      "role": "user",
      "content": "Where do we have the highest aov this year?"
    }
  ],
  "runQuery": true
}
The response will be the same as above, followed by a second JSON object containing the query results (see the REST API reference for its format). Note that the response now contains two JSON objects separated by a newline (\n). You are responsible for parsing these appropriately.
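For example, a minimal parsing sketch, assuming each of the two objects is serialized on a single line as described above:

```typescript
// rawBody is the text body of an AI API response sent with "runQuery": true.
function parseAiApiResponse(rawBody: string) {
  const [first, second] = rawBody
    .split("\n")
    .filter((line) => line.trim().length > 0);
  return {
    answer: JSON.parse(first), // message and cube_query
    results: second !== undefined ? JSON.parse(second) : undefined, // query results
  };
}
```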
2. /load

Alternatively, you may take the generated cube_query from the response and call the REST API /load endpoint with it in the request body. This is recommended for advanced use cases where you need more control over formatting, pagination, and so on, or if you are adding the AI API to an existing Cube REST API implementation.
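A sketch of that flow, reusing the token from the Authentication section; CUBE_API_URL is your deployment's REST API base URL, and /cubejs-api/v1/load is the REST API's default path:

```typescript
const CUBE_API_URL = process.env.CUBE_API_URL!; // REST API base URL

// cubeQuery is the cube_query object taken from the AI API response.
const cubeQuery = result.cube_query;

const loadResponse = await fetch(`${CUBE_API_URL}/cubejs-api/v1/load`, {
  method: "POST",
  headers: { Authorization: token, "Content-Type": "application/json" },
  body: JSON.stringify({ query: cubeQuery }),
});

const { data } = await loadResponse.json();
```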
Error Handling
Occasionally you may encounter errors. There are a few common categories of errors:
1. Cannot answer question
If the AI API is unable to generate a query because the view(s) in your data model do not have the appropriate fields to answer the question, you will receive a message like the following, and no cube_query in the response:
{
  "message": "I'm sorry, but the current data modeling doesn't cover stock prices or specific company data like NVDA. I will notify the data engineering team about this request."
}
2. Invalid query
Occasionally, the AI API may generate a query that is invalid or cannot be run. When this happens, you will receive an error upon running the query.
One way of handling this is to pass the error message back into the AI API; it may then self-correct and provide a new, valid query.
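One sketch of that pattern, assuming the messages array accepts an assistant-role echo of the previous response (the message shapes here are illustrative, not a documented contract):

```typescript
// Placeholders: the first AI API response and the error from running its query.
declare const previousAiResponse: { message: string; cube_query: object };
declare const loadError: string;

const retryRequest = {
  messages: [
    { role: "user", content: "Where do we have the highest aov this year?" },
    // Echo the assistant's previous answer back into the conversation.
    { role: "assistant", content: JSON.stringify(previousAiResponse) },
    // Then ask it to fix the query, quoting the error verbatim.
    {
      role: "user",
      content: `Running that query failed with: ${loadError}. Please correct the query.`,
    },
  ],
};
// POST retryRequest to the AI API as before and use the new cube_query.
```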
3. Continue wait
When using "runQuery": true, you might sometimes receive a query result containing { "error": "Continue wait" }. If this happens, you should use /load (described above) instead of runQuery to run the query, and handle retries as described in the REST API documentation.
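A minimal polling sketch, reusing CUBE_API_URL and token from the /load example above (the one-second delay and ten-attempt cap are arbitrary choices):

```typescript
// Poll /load until the query stops returning "Continue wait".
async function loadWithRetries(cubeQuery: object, maxAttempts = 10) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await fetch(`${CUBE_API_URL}/cubejs-api/v1/load`, {
      method: "POST",
      headers: { Authorization: token, "Content-Type": "application/json" },
      body: JSON.stringify({ query: cubeQuery }),
    });
    const body = await res.json();
    if (body.error !== "Continue wait") return body; // results or a real error
    await new Promise((resolve) => setTimeout(resolve, 1_000)); // wait, then retry
  }
  throw new Error("Query did not complete within the retry budget");
}
```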
Advanced Usage
The advanced features discussed here are available on Cube version 1.1.7 and above.
Custom prompts
You can prompt the AI API with custom instructions. For example, you may want it to always respond in a particular language, or to refer to itself by a name matching your brand. Custom prompts also allow you to give the model more context on your company and data model, for example if it should usually prefer a particular view.
To use a custom prompt, set the CUBE_CLOUD_AI_API_PROMPT environment variable in your deployment.
Custom prompts add to, rather than overwrite, the AI API's existing prompting, so you do not need to re-write instructions around how to generate the query itself.
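For example, an illustrative value (the wording is entirely up to you):

```
CUBE_CLOUD_AI_API_PROMPT="You are Acme's analytics assistant. Answer in English, and prefer orders_view when several views could answer a question."
```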
Meta tags
The AI API can read meta tags on your dimensions, measures, segments, and views.
Use the ai meta tag to give context that is specific to AI and goes beyond what is included in the description. This can have any keys that you want. For example, you can use it to give the AI context on possible values in a categorical dimension:
- name: status
  sql: status
  type: string
  meta:
    ai:
      values:
        - shipped
        - processing
        - completed
Other LLM providers
These environment variables also apply to the AI Assistant, if it is enabled on your deployment.
If desired, you may "bring your own" LLM by providing a model and API credentials for a supported model provider. Do this by setting environment variables in your Cube deployment. See below for each provider's variables (all are required unless otherwise noted):
AWS Bedrock
The AI API currently supports only Anthropic Claude models on AWS Bedrock. Other models may work but are not fully supported.
- CUBE_BEDROCK_MODEL_ID - A supported AWS Bedrock chat model, for example anthropic.claude-3-5-sonnet-20241022-v2:0
- CUBE_BEDROCK_ACCESS_KEY - An access key for an IAM user with InvokeModelWithResponseStream permissions on the desired region/model
- CUBE_BEDROCK_ACCESS_SECRET - The corresponding access secret
- CUBE_BEDROCK_REGION_ID - A supported AWS Bedrock region, for example us-west-2
GCP Vertex
The AI API currently supports only Anthropic Claude models on GCP Vertex. Other models may work but are not fully supported.
- CUBE_VERTEX_MODEL_ID - A supported GCP Vertex chat model, for example claude-3-5-sonnet@20240620
- CUBE_VERTEX_PROJECT_ID - The GCP project the model is deployed in
- CUBE_VERTEX_REGION - The GCP region the model is deployed in, for example us-east5
- CUBE_VERTEX_CREDENTIALS - The private key for a service account with permissions to run the chosen model
OpenAI
- OPENAI_MODEL - An OpenAI chat model ID, for example gpt-4o
- OPENAI_API_KEY - An OpenAI API key (we recommend creating a service account for the AI API)