Announcing the Cube integration with Trino, the SQL query engine for big data

Today, we're happy to announce that Cube now works with Trino, the fast, distributed SQL query engine for big data analytics formerly known as PrestoSQL.

Meet the Cube team at Trino Summit on November 10, 2022. We'll be happy to chat and explore how Cube can augment your Trino experience.

What is Trino?

Trino is an open-source query engine designed to work with big data.

It's fast, scalable, SQL-compliant, and has almost universal connectivity to all kinds of data sources and business intelligence tools. Let's explore these in more detail:

Scale. Trino is capable of querying exabyte-scale data lakes and massive data warehouses, as confirmed by Trino installations at companies such as LinkedIn, Netflix, and Shopify.
Performance. Trino is a highly parallel and distributed query engine with two types of cluster nodes: coordinator nodes, responsible for planning queries and scheduling their execution, and worker nodes, responsible for fetching and processing data. You can fine-tune your Trino cluster to allow for efficient, low-latency analytics.
Connectivity. Trino can natively query data from a plethora of data sources without the need to extract and load the data into a data warehouse, even if you'd like to do a cross-database join. Also, Trino is an ANSI SQL-compliant query engine that works with BI tools such as Tableau, Power BI, and Superset as well as headless BI tools such as Cube.

The most common use cases for Trino include ad-hoc analytics at interactive speeds, massive multi-hour batch queries, and high-volume apps that perform sub-second queries.

Trino was originally designed and developed at Facebook in 2013 and bore the name Presto at that time. In 2019, Presto development forked and two query engines emerged: PrestoDB, maintained by Facebook, and PrestoSQL, maintained by the Presto Software Foundation and the original creators. Later, in 2020, PrestoSQL was renamed to Trino.

If you're not a Trino user yet, consider self-hosting it or using a managed platform provided by Starburst or AWS (as Amazon Athena, a serverless interactive query service).

What is Cube?

Cube is the headless BI platform for accessing data from modern data stores (including Trino), organizing it into consistent metrics definitions, and delivering them to downstream applications.

Cube schema

Cube is designed to take the central part in the data pipeline, delivering consistent data to all downstream teams and data consumers. It serves as a source of truth for the metrics definitions, access control rules, and caching settings. Regardless of how many data consumers you have (e.g., front-end applications with embedded analytics or BI tools), Cube will deliver consistent data to all of them with its REST API, GraphQL API, or SQL API.

How Cube works with Trino

In a typical data pipeline, Trino is placed upstream of Cube, providing unrestricted access to all data sources as well as data federation capabilities. You can use Cube to create an additional semantic layer or a last-mile caching layer on top of Trino.

More importantly, you can use the set of APIs that Cube provides, including REST API and GraphQL API, to deliver the data directly to custom-built front-end applications, retaining low latency and high concurrency.
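As a minimal sketch of what a front-end call to the REST API could look like, the helper below builds a request for Cube's `/load` endpoint, which accepts a JSON query in the `query` URL parameter and a JWT in the `Authorization` header. The deployment URL, the token placeholder, and the `Orders` cube with its members are assumptions made up for illustration; substitute the values from your own deployment and data model.

```typescript
// Subset of the Cube query format used in this sketch.
interface CubeQuery {
  measures?: string[];
  dimensions?: string[];
  timeDimensions?: { dimension: string; granularity?: string; dateRange?: string }[];
  limit?: number;
}

interface LoadRequest {
  url: string;
  headers: Record<string, string>;
}

// Builds the URL and headers for a GET request to the /load endpoint.
// `apiUrl` is the REST API base URL, e.g. ending in /cubejs-api/v1.
function buildLoadRequest(apiUrl: string, token: string, query: CubeQuery): LoadRequest {
  const base = apiUrl.replace(/\/+$/, ""); // drop trailing slashes
  const url = `${base}/load?query=${encodeURIComponent(JSON.stringify(query))}`;
  return { url, headers: { Authorization: token } };
}

// Hypothetical deployment URL, token, and cube members.
const request = buildLoadRequest("https://example.invalid/cubejs-api/v1", "<JWT>", {
  measures: ["Orders.count"],
  timeDimensions: [
    { dimension: "Orders.createdAt", granularity: "day", dateRange: "last 30 days" },
  ],
});

// In a browser or Node.js 18+, the request could then be sent with:
//   const response = await fetch(request.url, { headers: request.headers });
//   const { data } = await response.json();
```

Keeping the request builder separate from the network call makes it easy to reuse the same query object across an application and to test it without a running deployment.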

Cube and Trino architecture

Also, you can leverage Cube's multitenancy and access control features to fully control and customize how every user, role, or data consumer accesses the data in Trino and underlying data sources.
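One way to approach per-tenant access control is Cube's `queryRewrite` configuration hook, which receives each incoming query together with the security context decoded from the request's JWT. The sketch below appends a tenant filter to every query; the `tenantId` claim and the `Orders.tenantId` dimension are hypothetical names, and a real setup would depend on your own data model and token contents.

```typescript
// Subset of the Cube query format used in this sketch.
interface QueryFilter {
  member: string;
  operator: string;
  values: string[];
}

interface TenantQuery {
  measures?: string[];
  dimensions?: string[];
  filters?: QueryFilter[];
}

interface RequestContext {
  // Assumption: the JWT payload carries a `tenantId` claim.
  securityContext: { tenantId?: string };
}

// Restricts every query to rows that belong to the caller's tenant.
function queryRewrite(query: TenantQuery, { securityContext }: RequestContext): TenantQuery {
  if (!securityContext.tenantId) {
    throw new Error("No tenant ID found in the security context");
  }
  return {
    ...query,
    filters: [
      ...(query.filters ?? []),
      // `Orders.tenantId` is a hypothetical dimension name.
      { member: "Orders.tenantId", operator: "equals", values: [securityContext.tenantId] },
    ],
  };
}

// In a self-hosted project, the hook would be exported from the
// configuration file, e.g. `module.exports = { queryRewrite };` in cube.js.
```

Because the filter is applied on the server for every request, downstream applications cannot bypass it by editing the query they send.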

Try Cube with Trino

Cube is open source, so you're free to self-host it. You can also run fully managed Cube in Cube Cloud, which is the fastest and most convenient way to use Cube.

Start by signing up for a free Cube Cloud account. You'll be prompted to select the cloud provider (AWS, GCP, or Azure) and a region for your deployment:

Cube Cloud: 1st step

Then, pick Trino from the data source options:

Cube Cloud: 2nd step

Lastly, provide the credentials for the Trino connection:

Cube Cloud: 3rd step

In a few seconds, you'll get your Cube Cloud deployment up and running, ready to query your Trino installation and deliver data to downstream applications.

What's next?

Would you like to learn more about Cube and what it brings to the table? Check the docs and create a free Cube Cloud account today.

Also, please feel free to join our community of almost 7,000 data practitioners on Slack, give Cube a star on GitHub, or schedule a time to chat 1:1.
