Providing a custom data model for each tenant
Use case
We have multiple users and we would like them to have different data models. These data models can be completely different or have something in common.
Configuration
Let's assume that we have two users: Alice and Bob. We'll refer to them as
tenants. We're going to provide custom data models for these tenants by
implementing multitenancy.
Multitenancy
First of all, we need to define the following configuration options so that Cube knows how to distinguish between your tenants:
context_to_app_idto derive tenant identifiers from security contexts.scheduled_refresh_contextsto provide a list of security contexts.
Put the following code into your cube.py or cube.js configuration
file:
Data modeling
Customizing member-level access
The simplest way to customize the data models is by changing the member-level access to data model entities. It works great for use cases when tenants share parts of their data models.
By setting the public parameter of cubes, views,
measures, dimensions, and
segments, you can ensure that each tenant has its unique
perspective of the whole data model.
With the following data model, Alice will only have access to cube_a,
Bob will only have access to cube_b, and they both will have access to
select members of cube_x:
{% set tenant_id = COMPILE_CONTEXT['securityContext']['tenant_id'] %}
cubes:
- name: cube_a
sql_table: table_a
public: {{ tenant_id == 'Alice' }}
measures:
- name: count
type: count
- name: cube_b
sql_table: table_b
public: {{ tenant_id == 'Bob' }}
measures:
- name: count
type: count
- name: cube_x
sql_table: table_x
measures:
- name: count
type: count
- name: count_a
type: count
public: {{ tenant_id == 'Alice' }}
- name: count_b
type: count
public: {{ tenant_id == 'Bob' }}For your convenience, Playground ignores member-level access configration and marks data model entities that are not accessible for querying through APIs with the lock icon.
Here's what Alice sees:
And here's the perspective of Bob:
Customizing other parameters
Similarly to customizing member-level access, you can set other parameters of data model entities for each tenant individually:
- By setting
sqlorsql_tableparameters of cubes, you can ensure that each tenant accesses data from its own tables or database schemas. - By setting the
data_sourceparameter, you can point each tenant to its own data source, allowing to switch between database names or even database servers. - By setting the
extendsparameter, you can ensure that cubes of some tenants are enriched with custom measures, dimensions, or joins.
With the following data model, cube_x will read data from the Alice database
schema for Alice and from Bob database schema for Bob:
{% set tenant_id = COMPILE_CONTEXT['securityContext']['tenant_id'] %}
cubes:
- name: cube_x
sql_table: {{ tenant_id | safe }}.table_x
measures:
- name: count
type: countHere's the generated SQL for Alice:
And here's the generated SQL for Bob:
Dynamic data modeling
A more advanced way to customize the data models is by using dynamic data models. It allows to create fully customized data models for each tenant programmatically.
With the following data model, cube_x will have the count_a measure for
Alice and the count_b measure for Bob:
{% set tenant_id = COMPILE_CONTEXT['securityContext']['tenant_id'] %}
cubes:
- name: cube_x
sql_table: table_x
measures:
- name: count
type: count
{% if tenant_id == 'Alice' %}
- name: count_a
sql: column_a
type: count
{% endif %}
{% if tenant_id == 'Bob' %}
- name: count_b
sql: column_b
type: count
{% endif %}Here's the data model and the generated SQL for Alice:
And here's the data model and the generated SQL for Bob:
Loading from disk
You can also maintain independent data models for each tenant that you would load from separate locations on disk. It allows to create fully customized data models for each tenant that are maintained mostly as static files.
By using the repository_factory option with the
file_repository utility, you can load data model files for each tenant from
a custom path.
With the following configuration, Alice will load the data model files from
model/Alice while Bob will load the data model files from model/Bob:
Example
Here's an example of how to use this approach. Let's say we have a folder structure as follows:
model/
├── avocado/
│ └── cubes
│ └── Products.js
└── mango/
└── cubes
└── Products.jsLet's configure Cube to use a specific data model path for each tenant using the
repositoryFactory function along with contextToAppId and scheduledRefreshContexts:
const { FileRepository } = require("@cubejs-backend/server-core")
module.exports = {
contextToAppId: ({ securityContext }) =>
`CUBE_APP_${securityContext.tenant}`,
repositoryFactory: ({ securityContext }) =>
new FileRepository(`model/${securityContext.tenant}`),
scheduledRefreshContexts: () => [
{ securityContext: { tenant: "avocado" } },
{ securityContext: { tenant: "mango" } }
]
}In this example, we'll filter products differently for each tenant. For the avocado
tenant, we'll show products with odd id values, and for the mango tenant, we'll show
products with even id values.
This is the products cube for the avocado tenant:
cubes:
- name: products
sql: |
SELECT * FROM public.Products WHERE MOD (id, 2) = 1This is the products cube for the mango tenant:
cubes:
- name: products
sql: |
SELECT * FROM public.Products WHERE MOD (id, 2) = 0To fetch the products for different tenants, we send the same query but with different JWTs:
{
"sub": "1234567890",
"tenant": "Avocado",
"iat": 1000000000,
"exp": 5000000000
}{
sub: "1234567890",
tenant: "Mango",
iat: 1000000000,
exp: 5000000000,
}This approach produces different results for each tenant as expected:
// Avocado products
[
{
"products.id": 1,
"products.name": "Generic Fresh Keyboard",
},
{
"products.id": 3,
"products.name": "Practical Wooden Keyboard",
},
{
"products.id": 5,
"products.name": "Handcrafted Rubber Chicken",
},
]// Mango products:
[
{
"products.id": 2,
"products.name": "Gorgeous Cotton Sausages",
},
{
"products.id": 4,
"products.name": "Handmade Wooden Soap",
},
{
"products.id": 6,
"products.name": "Handcrafted Plastic Chair",
},
]You can find a working example of this approach on GitHub (opens in a new tab).
Run it with the docker-compose up command to see the results in your console.
Loading externally
Finally, you can maintain independent data models for each tenant that you would load from an external location rather from a folder on disk. Good examples of such locations are an S3 bucket, a database, or an external API. It allows to provide fully customized data models for each tenant that you have full control of.
It can be achieved by using the same repository_factory option.
Instead of using the file_repository utility, you would have to write your own
code that fetches data model files for each tenant.