Implementing funnel analysis
This functionality only works with data models written in JavaScript, not YAML. For more information, check out the Data Modeling Syntax page.
Funnels represent a series of events that lead users towards a defined goal. Funnel analysis is an approach commonly used in product, marketing and sales analytics.
Regardless of the domain, every funnel has the following traits:
- The identity of the object moving through the funnel – e.g. user or lead
- A set of steps, through which the object moves
- The date and time of each step
- The time to convert between steps
Since funnels have a pretty standard structure, they are good candidates for being extracted into reusable packages. Cube comes pre-packaged with a standard funnel package.
// First step is to require the Funnel package
const Funnels = require(`Funnels`);
cube(`PurchaseFunnel`, {
extends: Funnels.eventFunnel({
userId: {
sql: `user_id`,
},
time: {
sql: `timestamp`,
},
steps: [
{
name: `view_product`,
eventsView: {
sql: `select * from events where event = 'view_product'`,
},
},
{
name: `purchase_product`,
eventsView: {
sql: `select * from events where event = 'purchase_product'`,
},
timeToConvert: "1 day",
},
],
}),
});
Cube will generate an SQL query for this funnel. Since funnel analysis in SQL is not straight forward, the SQL code itself is quite complicated, even for such a small funnel.
Show Funnel's SQL
SELECT
purchase_funnel.step "purchase_funnel.step",
count(purchase_funnel.user_id) "purchase_funnel.conversions"
FROM
(
WITH joined_events AS (
select
view_product_events.user_id view_product_user_id,
purchase_product_events.user_id purchase_product_user_id,
view_product_events.t
FROM
(
select
user_id user_id,
timestamp t
from
(
select
*
from
events
where
event = 'view_product'
) e
) view_product_events
LEFT JOIN (
select
user_id user_id,
timestamp t
from
(
select
*
from
events
where
event = 'purchase_product'
) e
) purchase_product_events ON view_product_events.user_id = purchase_product_events.user_id
AND purchase_product_events.t >= view_product_events.t
AND (
purchase_product_events.t :: timestamptz AT TIME ZONE 'America/Los_Angeles'
) <= (
view_product_events.t :: timestamptz AT TIME ZONE 'America/Los_Angeles'
) + interval '1 day'
)
select
user_id,
first_step_user_id,
step,
max(t) t
from
(
SELECT
view_product_user_id user_id,
view_product_user_id first_step_user_id,
t,
'View Product' step
FROM
joined_events
UNION ALL
SELECT
purchase_product_user_id user_id,
view_product_user_id first_step_user_id,
t,
'Purchase Product' step
FROM
joined_events
) as event_steps
GROUP BY
1,
2,
3
) AS purchase_funnel
WHERE
(
purchase_funnel.t >= '2018-07-01T07:00:00Z' :: timestamptz
AND purchase_funnel.t <= '2018-07-31T06:59:59Z' :: timestamptz
)
GROUP BY
1
ORDER BY
2 DESC
LIMIT
5000
Funnel parameters
userId
A unique key to identify the users moving through the funnel.
userId: {
sql: `user_id`;
}
nextStepUserId
In the situation where user_id
changes between steps, you can pass a unique
key to join two adjacent steps. For example, if a user signs in after having
been tracked anonymously until that point in the funnel, you could use
nextStepUserId
to define a funnel where users are tracked by anonymous ID on
the first step and then by an identified user ID on subsequent steps.
const Funnels = require(`Funnels`);
cube(`OnboardingFunnel`, {
extends: Funnels.eventFunnel({
userId: {
sql: `id`,
},
time: {
sql: `timestamp`,
},
steps: [
{
name: `View Page`,
eventsView: {
sql: `select anonymous_id as id, timestamp from pages`,
},
},
{
name: `Sign Up`,
eventsView: {
sql: `select anonymous_id as id, user_id, timestamp from sign_ups`,
},
nextStepUserId: {
sql: `user_id`,
},
timeToConvert: "1 day",
},
{
name: `Action`,
eventsView: {
sql: `select user_id as id from actions`,
},
timeToConvert: "1 day",
},
],
}),
});
time
A timestamp of the event.
time: {
sql: `timestamp`;
}
steps
An array of steps. Each step has 2 required and 1 optional parameters:
- name (required) - Name of the step. It must be unique within a funnel.
- eventsView (required) - Events table for the step. It must contain
userId
andtime
fields. For example, if we have defined the userId asuser_id
and time astimestamp
, we need to have these fields in the table we're selecting from. - timeToConvert (optional) - A time window during which conversion should
happen. Set it depending on your funnel logic. If this is set to
1 day
, for instance, it means the funnel will include only users who made a purchase within 1 day of visiting the product page.
steps: [
{
name: `purchase_product`,
eventsView: {
sql: `select * from events where event = 'purchase_product'`,
},
timeToConvert: "1 day",
},
];
Joining funnels
In order to provide additional dimensions, funnels can be joined with other
cubes using user_id
at the first step of a funnel. This will always use a
many_to_one
relationship, hence you should always join with the corresponding
user cube. Here, by 'user' we understand this to be any entity that can go
through a sequence of steps within funnel. It could be a real web user with an
auto assigned ID or a specific email sent by an email automation that goes
through a typical flow of events like 'sent', 'opened', 'clicked', and so on.
For example, for our PurchaseFunnel
we can add a join to another funnel as
following:
cube(`PurchaseFunnel`, {
joins: {
Users: {
relationship: `many_to_one`,
sql: `${CUBE}.first_step_user_id = ${Users.id}`,
},
},
extends: Funnels.eventFunnel({
// ...
}),
});
Using funnels
Cube is based on multidimensional analysis (opens in a new tab)
Funnel-based cubes have the following structure:
Measures
- conversions - Count of conversions in the funnel. The most useful when broken down by steps. It's the classic funnel view.
- conversionsPercent - Percentage of conversions. It is useful when you want to inspect a specific step, or set of steps, and find out how a conversion has changed over time.
Dimensions
- step - Describes funnels' steps. Use it to break down conversions or conversionsPercent by steps, or to filter for a specific step.
- time - time dimension for the funnel. Use it to filter your analysis for specific dates or to analyze how conversion changes over time.
In the following example, we use the conversions
measure along with the
steps
dimension to display a classic bar chart showing the funnel's steps.
Performance considerations
Funnel joins are extremely heavy for most modern databases and complexity grows
in a non-linear way with the addition of steps. However, if the cardinality of
the first event isn't too high, very simple optimization can be applied:
originalSql
pre-aggregation.
It is best to use partitioned rollups to cache the
steps instead. Add one to the PurchaseFunnel
cube as follows:
cube(`PurchaseFunnel`, {
extends: Funnels.eventFunnel({
// ...
}),
preAggregations: {
main: {
type: `originalSql`,
},
},
});