Jupyter
You can connect to Cube from Jupyter using the Cube SQL API. The Jupyter Notebook is a web application for creating and sharing computational documents.
Here's a short video guide on how to connect Jupyter to Cube.
Enable Cube SQL API
Don't have a Cube project yet? Learn how to get started here.
Cube Cloud
Click Deploy SQL API and then the How to connect your BI tool link on the Overview page of your Cube deployment. Navigate to the BIs and Visualization Tools tab. You should see a screen like the one below with your connection credentials:
Self-hosted Cube
You need to set the following environment variables to enable the Cube SQL API. These credentials will be required to connect to Cube from Jupyter later.
CUBEJS_PG_SQL_PORT=5432
CUBEJS_SQL_USER=myusername
CUBEJS_SQL_PASSWORD=mypassword
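As a minimal sketch, the same three variables can be turned into a Postgres-style connection string for later use in a notebook. This assumes the variables are also exported in the environment where Jupyter runs, which is not a given when Cube runs on another machine. The function name cube_sql_url, the localhost host, and the db database name are hypothetical placeholders, not values defined by Cube.

```python
import os
from urllib.parse import quote


def cube_sql_url(host="localhost", database="db"):
    """Build a postgresql:// URL from the Cube SQL API environment variables.

    `host` and `database` are placeholder defaults (assumptions); replace
    them with the values of your own deployment.
    """
    user = os.environ["CUBEJS_SQL_USER"]
    password = os.environ["CUBEJS_SQL_PASSWORD"]
    port = int(os.environ["CUBEJS_PG_SQL_PORT"])
    # Percent-encode the credentials so characters such as "@", ":" or "/"
    # cannot be mistaken for URL separators.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{database}"
    )
```

The resulting string can be passed to sqlalchemy.create_engine in place of a URL object.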
Connecting from Jupyter
Jupyter connects to Cube as if it were a Postgres database.
Make sure to install the sqlalchemy and pandas modules.
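pip install sqlalchemy
pip install pandas
# Postgres driver that SQLAlchemy's "postgresql" dialect uses by default
pip install psycopg2-binary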
Then you can use sqlalchemy.create_engine to connect to Cube's SQL API.
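import pandas
import sqlalchemy

# Build the connection URL from the credentials shown for your deployment.
# URL.create() is the supported constructor in SQLAlchemy 1.4 and later.
url = sqlalchemy.engine.url.URL.create(
    drivername="postgresql",
    username="cube",
    password="9943f670fd019692f58d66b64e375213",
    host="thirsty-raccoon.sql.aws-eu-central-1.cubecloudapp.dev",
    port=5432,  # the port is an integer, not a string
    database="db@thirsty-raccoon",
)
# echo_pool=True logs connection pool activity, which helps when debugging.
engine = sqlalchemy.create_engine(url, echo_pool=True)
print("connecting with engine " + str(engine))
connection = engine.connect()
# ...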
Querying data
Your cubes will be exposed as tables, where both your measures and dimensions are columns.
You can write SQL in Jupyter that will be executed in Cube. Learn more about Cube SQL syntax on the reference page.
# ...
query = "SELECT SUM(count), status FROM orders GROUP BY status;"
df = pandas.read_sql_query(query, connection)
In your Jupyter notebook it'll look like this.
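When a notebook runs many queries, it can help to scope each connection so it is returned to the pool once the query finishes. The following is a minimal sketch of that pattern, not part of Cube's API: query_to_dataframe is a hypothetical helper name, and it assumes an engine created with sqlalchemy.create_engine as described earlier.

```python
import pandas
import sqlalchemy


def query_to_dataframe(engine, sql):
    """Run one SQL statement and return the result as a pandas DataFrame.

    The connection is opened only for the duration of the query and is
    released when the `with` block exits, even if the query fails.
    """
    with engine.connect() as connection:
        # sqlalchemy.text() marks the string as a textual SQL statement.
        return pandas.read_sql_query(sqlalchemy.text(sql), connection)
```

With the engine from the connection step, a call such as query_to_dataframe(engine, "SELECT SUM(count), status FROM orders GROUP BY status") would return the same kind of DataFrame as the snippet above.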
You can also create a visualization of the executed SQL query.
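One way to do this is with the plotting support built into pandas, which requires matplotlib to be installed. The sketch below is only an illustration: plot_measure_by_dimension is a hypothetical helper, and the column names passed to it are assumptions that depend on how your query names its result columns.

```python
import pandas


def plot_measure_by_dimension(df, dimension, measure):
    """Draw a bar chart with one bar per value of the `dimension` column.

    `dimension` and `measure` are column names of the DataFrame returned
    by the query.
    """
    ax = df.plot.bar(x=dimension, y=measure, legend=False)
    ax.set_xlabel(dimension)
    ax.set_ylabel(measure)
    return ax
```

For instance, if the aggregate in the query is given an alias, such as SUM(count) AS order_count (a hypothetical alias), the call would be plot_measure_by_dimension(df, "status", "order_count"). Jupyter renders the chart below the cell.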