How KazanExpress is using Cube for analytics in their marketplace offerings

Oleg Taizov | February 1, 2022 | User Stories, PostgreSQL, ClickHouse

KazanExpress is a marketplace in Russia with free one-day delivery, regardless of the order value and region. We offer more than three million products across more than 20 product categories, including clothing, electronics, and groceries.

Our marketplace consists of four offerings:

  1. B2C: everything related to the happiness of shoppers and the process of buying goods in the marketplace. We have more than a million visitors daily, and we are loved by more than four million shoppers.
  2. B2B: focuses on the experience of the seller and the process of selling goods in the marketplace. In recent years, more than 10,000 sellers have become our partners, and we have been able to greatly improve their business.
  3. WMS: a warehouse management system responsible for a key part of our business, namely efficient packaging and fast delivery, which allows us to deliver orders to more than 115 cities in Russia.
  4. ERP: a single system for automating and managing the internal business processes of the marketplace. It helps build simple and effective links between different departments.

We were looking for an analytics solution, both for us and for our sellers using the B2B product of KazanExpress, that could provide insight into our performance against our business goals. For KazanExpress, this includes how quickly new sellers can onboard to our marketplace, the effectiveness of existing sellers, our revenue stream, and the number of purchases on KazanExpress.

Our sellers needed insight into things like the types of customer search queries, demand for product categories, and product revenue forecasts to help plan product availability in the warehouse. They also needed to understand what impact activities such as price changes, promotions, adding new SKUs, and even changing product photos had on their revenue.

Why we chose Cube at KazanExpress

Our analytics tool needed to be capable of running aggregations across several types of databases (e.g. ClickHouse and Postgres) and be easy to integrate with the KazanExpress infrastructure. We looked at three options; the pros and cons of each are listed below:

Yandex DataLens
  Pros:
  • Connectors to Postgres and ClickHouse
  • Predefined set of chart types, customizable even by non-technical people
  • Support for materialization and caching
  • Supported by Yandex
  Cons:
  • No pre-aggregations mechanism
  • Large materialized views are not possible (a major roadblock)
  • Row-level security (RLS) doesn't support integration with 3rd party applications incl. our security layer (a major roadblock)

In-house solution with ETL and ClickHouse data marts
  Pros:
  • We have full control of the architecture and security layer
  • Easy to integrate with our existing infrastructure
  Cons:
  • Huge amount of developer resources required to implement ClickHouse data marts, interaction with security middleware, etc.
  • Frontend developers required to build charts

Cube
  Cons:
  • Frontend developers required to build charts
  • Lack of infrastructure tools in Cube (e.g. instrumentation/monitoring data, official Helm chart/Kubernetes deployment guide)

Cube had a lot of pros, so we decided to pursue the Cube route in July 2021. Below is one of our seller analytics dashboards, powered by Cube.

Seller analytics dashboard

Implementing Cube at KazanExpress and our workflow

Our setup follows the recommended Cube deployment architecture, which consists of the Cube API, a Cube Refresh Worker, an external Redis for caching, and a Cube Store cluster for working with pre-aggregations. Currently, Cube is connected to three Postgres databases and ClickHouse, and you can see our architecture diagram below.

Architecture Diagram
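
To make the multi-database setup more concrete, here is a minimal sketch of how a logical data source name could be mapped to connection settings, in the spirit of Cube's `driverFactory` configuration option, which receives the data source name that a cube declares. The data source names, hosts, and database names are hypothetical placeholders, not our actual values, and the function is plain TypeScript rather than a drop-in Cube config.

```typescript
// Connection settings for one data source (simplified for illustration).
interface DriverConfig {
  type: "postgres" | "clickhouse";
  host: string;
  database: string;
}

// Hypothetical registry: three Postgres databases and one ClickHouse instance.
// Every name and host below is a placeholder.
const DATA_SOURCES = new Map<string, DriverConfig>([
  ["default", { type: "postgres", host: "pg-orders.internal", database: "orders" }],
  ["sellers", { type: "postgres", host: "pg-sellers.internal", database: "sellers" }],
  ["warehouse", { type: "postgres", host: "pg-wms.internal", database: "wms" }],
  ["events", { type: "clickhouse", host: "clickhouse.internal", database: "events" }],
]);

// Resolve the settings for a data source name, failing loudly on unknown names
// so a typo in a cube definition does not silently hit the wrong database.
function driverConfigFor(dataSource: string): DriverConfig {
  const config = DATA_SOURCES.get(dataSource);
  if (config === undefined) {
    throw new Error(`Unknown data source: ${dataSource}`);
  }
  return config;
}
```

In a real deployment, a lookup like this would typically sit behind the `driverFactory` hook, with credentials coming from secrets rather than from code.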

In the Cube API, we implemented integration with our internal authorization provider to grant access to data in Cube using the same token used throughout our application.
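
As a rough illustration of this kind of integration, the sketch below validates a token and then restricts a query to the authenticated seller. It mirrors the roles that Cube's `checkAuth` and `queryRewrite` hooks play, but it is written as standalone TypeScript. The `TokenVerifier` type, the `sellerId` field, and the `Orders.sellerId` member are assumptions made for the example; which hooks and fields we actually use is not covered here.

```typescript
// Shape of a Cube query filter: a member, an operator, and a list of values.
interface Filter {
  member: string;
  operator: string;
  values: string[];
}

interface Query {
  measures?: string[];
  dimensions?: string[];
  filters?: Filter[];
}

// Hypothetical security context produced by the authorization provider.
interface SecurityContext {
  sellerId: string;
}

// Hypothetical stand-in for the internal authorization provider:
// returns a context for a valid token, or null otherwise.
type TokenVerifier = (token: string) => SecurityContext | null;

// Step 1 (the role of `checkAuth`): turn an Authorization header into a context.
function authenticate(authHeader: string | undefined, verify: TokenVerifier): SecurityContext {
  if (!authHeader) {
    throw new Error("Missing authorization token");
  }
  const token = authHeader.replace(/^Bearer\s+/i, "");
  const context = verify(token);
  if (context === null) {
    throw new Error("Invalid authorization token");
  }
  return context;
}

// Step 2 (the role of `queryRewrite`): append a seller filter without
// mutating the incoming query, so each seller only sees their own rows.
function restrictToSeller(query: Query, context: SecurityContext): Query {
  const sellerFilter: Filter = {
    member: "Orders.sellerId", // hypothetical member name
    operator: "equals",
    values: [context.sellerId],
  };
  return { ...query, filters: [...(query.filters || []), sellerFilter] };
}
```

Keeping the filter in the API layer rather than in each dashboard means the frontend can send the same query for every seller and the restriction is applied server-side.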

Currently, we have three deployments in our Kubernetes (K8s) cluster. We templated the Cube platform using a Helm chart to simplify the deployment process and make it reproducible, extensible, and observable. Here are our deployments:

  • Feature. A feature deployment, as in Git flow. This is the playground where we can run experiments with Cube or develop new features while testing the visualization on the frontend at the same time.
  • Development. Where all approved features land and where we build releases, with the required load testing, regression testing, etc.
  • Production. A working copy of the platform that we are confident in and that is available to our sellers.

Each deployment has a full copy of the Cube platform that consists of Cube API, Cube Refresh Worker, Cube Store with its own bucket on S3, and Redis for caching. This workflow gives us the capability to test our hypotheses easily and instantly deliver features to our customers.
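
The isolation between deployments can be sketched as a small function that derives per-stage settings, so that no stage shares Redis, Cube Store, or an S3 bucket with another. The variable names (`CUBEJS_REDIS_URL`, `CUBEJS_CUBESTORE_HOST`, `CUBEJS_REFRESH_WORKER`, `CUBESTORE_S3_BUCKET`) follow Cube's documented environment variables as of early 2022 and should be checked against the version in use. The service and bucket naming scheme is hypothetical, and our actual Helm templates are not shown here.

```typescript
type Stage = "feature" | "development" | "production";

type Env = Record<string, string>;

// Environment for the Cube API pods of one stage.
// Hostnames follow a hypothetical "<service>-<stage>" naming scheme.
function cubeApiEnv(stage: Stage): Env {
  return {
    CUBEJS_REDIS_URL: `redis://redis-${stage}:6379`,
    CUBEJS_CUBESTORE_HOST: `cubestore-router-${stage}`,
  };
}

// The refresh worker uses the same connections as the API,
// plus the flag that turns on background refresh.
function refreshWorkerEnv(stage: Stage): Env {
  return { ...cubeApiEnv(stage), CUBEJS_REFRESH_WORKER: "true" };
}

// Each stage's Cube Store writes to its own bucket (placeholder names).
function cubeStoreEnv(stage: Stage): Env {
  return { CUBESTORE_S3_BUCKET: `cubestore-${stage}` };
}
```

Deriving everything from the stage name keeps the three deployments structurally identical, which is what makes a result observed in the feature or development stage a reasonable predictor for production.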

Three people (a frontend developer, a backend developer, and a designer) were involved in implementing Cube at KazanExpress, which took about three and a half weeks. We implemented Cube in three phases:

  • Phase 1. Deployment of the Cube API into our Kubernetes cluster.
  • Phase 2. Setting up the Cube API with a refresh worker and pre-aggregations.
  • Phase 3. Setting up Cube Store.
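
Phases 2 and 3 revolve around pre-aggregations. Conceptually, a pre-aggregation is a rollup that is computed ahead of time and kept fresh by the refresh worker, so that dashboard queries read a small summary table instead of scanning raw rows. The sketch below illustrates that idea in plain TypeScript. It is not how Cube implements rollups, and the row shapes and field names are hypothetical.

```typescript
// A raw order row as it might come from a source database (hypothetical shape).
interface OrderRow {
  sellerId: string;
  createdAt: string; // ISO 8601 timestamp, e.g. "2022-01-15T10:30:00Z"
  amount: number;
}

// One row of the rollup: totals per seller per day.
interface RollupRow {
  sellerId: string;
  day: string; // "YYYY-MM-DD"
  orderCount: number;
  revenue: number;
}

// Group raw rows by (seller, day) and accumulate the count and revenue.
function rollupByDay(rows: OrderRow[]): RollupRow[] {
  const byKey = new Map<string, RollupRow>();
  for (const row of rows) {
    const day = row.createdAt.slice(0, 10); // date part of the ISO timestamp
    const key = `${row.sellerId}|${day}`;
    const existing = byKey.get(key);
    if (existing) {
      existing.orderCount += 1;
      existing.revenue += row.amount;
    } else {
      byKey.set(key, { sellerId: row.sellerId, day, orderCount: 1, revenue: row.amount });
    }
  }
  return Array.from(byKey.values());
}
```

A seller dashboard that charts daily revenue only needs the rollup rows, whose number grows with sellers and days rather than with individual orders.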

Looking ahead

We look forward to continuing to expand our analytics offering with Cube to our sellers. This includes adding more Postgres data sources and providing anonymous statistics on shoppers' search queries, buyers' demographics/interests, product category information on impressions/views/sellers, etc.

Interested in joining KazanExpress' success with Cube?

Explore Cube examples & tutorials and get started today. To jumpstart your efforts, please join us on Discourse & Slack, follow us on Twitter, and get engaged with the growing Cube community.
