What is Embedded Analytics?
Embedded analytics is the practice of integrating data analytics and visualization capabilities within a user’s natural workflow. Think about your internal SEO tooling, your fitness tracker, or your e-commerce web portals (how’s the knit scarf business going, btw?)
Embedded analytics is an essential component of a smooth user experience now that we’re all in an exclusive relationship with Our Data. It’s almost a bit painful to imagine a process in which we’d need to run reports manually, download them menially, and only then analyze data in a third-party tool. And yet many of us still do exactly that: the Harvard Business Review found that employees spend 9% of their time toggling between applications to look for information.
Therefore, many companies and organizations are turning to embedded analytics to give their internal and external users great data experiences. And that raises the question: what are the components of a great embedded analytics tools stack?
What are the components of a great embedded analytics tools stack?
In the olden days—approximately 2.38 years ago—an embedded analytics tools stack began with the data source and ended with the presentation layer (with nothing else in there).
The components of a good modern embedded analytics stack are:
- Data Source: where your data is aggregated and stored for downstream consumption.
- Semantic Layer: where your metrics definitions, data access controls, and caching live.
- Presentation Layer: where you build and how you deliver beautiful embedded analytics experiences to your users.
So, what’s up with the second one, the semantic layer? Why shouldn’t you simply connect your data source to your presentation layer?
Actually, for many reasons. For one, if you’re using multiple presentation layers to give a customized UX, you really shouldn’t be manually redefining metrics definitions and security context for each one. Doing so is redundant at best and hazardous (both for accuracy and for security) at worst.
For another, your users shouldn’t be querying the data source directly. Without a shared cache in between, every dashboard load re-runs the same queries, which is wasteful at best and hazardous (for application performance and cost-effectiveness) at worst.
And so, these days, it’s best practice to have something sit between the data source and the application—a place to consistently define how metrics are calculated, how information is protected, and how data is cached. And at Cube, we built an API-first platform that operates as all of those things: the Semantic Layer.
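To make the "define it once, cache it once" idea concrete, here’s a minimal sketch in Python. It is not Cube’s API: `SemanticLayer`, `Metric`, `source_hits`, and the sample `ORDERS` rows are all hypothetical names invented for illustration, and the cache is deliberately naive.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Metric:
    """A single, shared definition of how a metric is calculated."""
    name: str
    compute: Callable[[list], float]


class SemanticLayer:
    """Hypothetical middleware: owns metric definitions and a simple cache."""

    def __init__(self, fetch_rows: Callable[[], list]):
        self._fetch_rows = fetch_rows  # the only path to the data source
        self._metrics: dict = {}
        self._cache: dict = {}
        self.source_hits = 0           # counts queries that reach the source

    def define(self, metric: Metric) -> None:
        self._metrics[metric.name] = metric

    def query(self, name: str) -> float:
        if name not in self._cache:    # cache miss: go to the source once
            self.source_hits += 1
            rows = self._fetch_rows()
            self._cache[name] = self._metrics[name].compute(rows)
        return self._cache[name]


# Hypothetical sample data standing in for a warehouse table.
ORDERS = [{"amount": 40.0}, {"amount": 60.0}, {"amount": 25.5}]

layer = SemanticLayer(fetch_rows=lambda: ORDERS)
layer.define(Metric("total_revenue", lambda rows: sum(r["amount"] for r in rows)))

# Two different presentation layers ask for the same metric.
dashboard_value = layer.query("total_revenue")
customer_app_value = layer.query("total_revenue")
```

Both callers get the same number from the same definition, and the data source is only hit once. A real semantic layer would also need cache invalidation and refresh rules, which this sketch leaves out.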
What makes good embedded analytics tools…good?
1. They’re fast to market:
Embedded analytics is a wildly important feature in the user experience. It can also, however, be a wildly dragged-out and expensive thing to build if done incorrectly. This is why it’s crucial to build your stack around a semantic layer that centralizes metrics, so that each new presentation layer reuses existing definitions instead of rebuilding them.
2. They’re granularly secure:
When many users across different accounts access the same endpoints, consistently enforced row- and column-level data access controls are beyond critical. Consistency is key here, which puts another nail in the coffin for manually defining security context in each presentation layer.
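As a rough illustration of row- and column-level controls applied in one place, here’s a small Python sketch. The `SecurityContext` class, the `apply_access_controls` function, and the sample rows are hypothetical; a real deployment would derive the context from an authenticated request rather than constructing it by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityContext:
    """Hypothetical per-request context, e.g. decoded from a user's token."""
    account_id: str
    visible_columns: frozenset


def apply_access_controls(rows: list, ctx: SecurityContext) -> list:
    """Keep only the caller's rows, then only the columns they may see."""
    return [
        # column-level filter: drop any field the caller may not see
        {col: value for col, value in row.items() if col in ctx.visible_columns}
        for row in rows
        if row["account_id"] == ctx.account_id  # row-level filter
    ]


# Hypothetical rows shared by every account behind the same endpoint.
ROWS = [
    {"account_id": "acme", "plan": "pro", "mrr": 500, "internal_score": 0.9},
    {"account_id": "globex", "plan": "free", "mrr": 0, "internal_score": 0.2},
]

acme_user = SecurityContext("acme", frozenset({"account_id", "plan", "mrr"}))
visible = apply_access_controls(ROWS, acme_user)
```

Because the filter lives in one function, every presentation layer that goes through it gets the same rules, which is the consistency argument in a nutshell.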
3. They’re supportive of a data-rich user experience:
Whether you’re going for a more out-of-the-box (OOB), less native experience or one with multiple custom presentation layers, your stack needs to give users granular and interactive access to their data. That means users can drill down, filter, and generate reports, so they can access the data they need in the most convenient ways.
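For a sense of what "drill down" and "filter" boil down to underneath the UI, here’s a minimal Python sketch. The `aggregate` function and the `SALES` rows are hypothetical and stand in for whatever query your stack actually runs.

```python
from collections import defaultdict


def aggregate(rows, dimensions, measure, filters=None):
    """Sum `measure` grouped by `dimensions`, keeping rows that match `filters`."""
    filters = filters or {}
    totals = defaultdict(float)
    for row in rows:
        if all(row[key] == value for key, value in filters.items()):
            group = tuple(row[d] for d in dimensions)
            totals[group] += row[measure]
    return dict(totals)


# Hypothetical sales records.
SALES = [
    {"region": "EU", "country": "DE", "revenue": 120.0},
    {"region": "EU", "country": "FR", "revenue": 80.0},
    {"region": "NA", "country": "US", "revenue": 200.0},
    {"region": "EU", "country": "DE", "revenue": 30.0},
]

# Top-level view: revenue by region.
by_region = aggregate(SALES, ["region"], "revenue")

# Drill-down: the user clicks "EU", so filter to it and group one level deeper.
eu_by_country = aggregate(SALES, ["country"], "revenue", filters={"region": "EU"})
```

A drill-down is just the same aggregation with one more filter and a finer dimension, which is why a stack that exposes dimensions and filters cleanly makes interactive experiences much easier to build.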
4. They’re scalable and easily integrable:
Each tool in your stack should integrate cleanly with the other components and with any other embedded analytics tools you may want to add in the future.
So, what’s an example of an embedded analytics tools stack?
We’ll start with a disclaimer: the particular stack of data source and presentation layers you might consider will differ depending on your use case, technical requirements, and business considerations.
But, if you’re looking to deliver a native embedded analytics experience quickly, here’s an example of a stack with all of the essential components:
- Data Source
- Semantic Layer
- Presentation Layer(s)
1. Embedded Analytics Tools: Data Sources
General-purpose Cloud Data Warehouses
Let’s start with Snowflake. Snowflake is a cloud-based data platform that provides a unified and scalable solution for storing, managing, and analyzing large volumes of structured and semi-structured data. It’s designed to address the challenges associated with traditional data warehousing and analytics systems by leveraging the power of cloud computing and modern data processing techniques.
Some major pros of Snowflake include virtually unlimited scalability and automatic resource scaling based on demand; seamless integration with many data sources; an elastic compute model with pay-for-what-you-use pricing; and an architecture that supports high concurrency and parallel processing.
That said, organizations should consider factors like cost management, data migration, and internet connectivity when evaluating Snowflake’s suitability for their specific use cases.
Databricks is another great option—a unified analytics platform that provides a collaborative environment for big data processing and machine learning. It offers numerous practical advantages, such as streamlined data processing, easy scalability, and advanced analytics capabilities. Additionally, Databricks simplifies the integration of various data sources and supports real-time data processing.
In this case, organizations should consider factors like the learning curve associated with its complex features and some limitations when controlling infrastructure configuration. But regardless, Databricks is a powerful tool for organizations seeking efficient data analysis and machine learning solutions.
Fast Analytics Databases
Fast analytics databases are high-performance data storage systems optimized for quick processing of large volumes of data. They leverage advanced indexing techniques, parallel processing, and in-memory computing to deliver rapid data retrieval and analysis. These databases enable organizations to gain valuable insights and make real-time data-driven decisions.
Take, for instance, ClickHouse: an open-source columnar database management system designed for high-performance analytics. It excels at executing ad-hoc queries on large data volumes with blazing-fast speeds. ClickHouse is scalable, handles petabytes of data, and supports real-time processing for time series, log, and clickstream analysis. While it does have a steep learning curve and is optimized for read-heavy workloads, it's a powerful tool for complex analytics and data-driven insights.
Another option in the fast analytics database world is Druid, an open-source columnar store designed for real-time, high-performance analytics. It's optimized for sub-second queries over large data volumes, which makes it a good fit for interactive analysis, and it offers flexible data modeling and scalability. Druid offers fast data ingestion and query response times, enabling real-time analysis of streaming and historical data. It's a specialized and complex tool, though: as a distributed system, it needs careful configuration and architectural planning to perform well, so it's better suited to advanced users and specific scenarios.
2. Embedded Analytics Tools: Semantic Layer
A semantic layer is middleware that sits between your data source and your presentation layers. It helps you manage all of the information you feed into every downstream data app in your stack. And it’s critical for building embedded analytics.
Incorporating a semantic layer into your embedded analytics stack means you’re able to efficiently build native applications and ensure that the data that’s in them is consistent, secure, and fast.
We happen to know a lot about semantic layers because we’re Cube: the universal semantic layer. Cube enables you to access data from modern data stores, organize it into consistent metrics definitions, and then deliver them to every downstream application—while also enforcing uniform data governance and ensuring consistent performance and data freshness across every app and every team.
Cube also exposes REST, GraphQL, and SQL APIs for universal compatibility and streamlined data provisioning with any downstream tool. So whether you’re building internal BI applications based on out-of-the-box iframes, developing custom, beautifully intricate data experiences for customers, or both, you can do it all based on Cube while orchestrating data modeling, caching, and access controls in just one place.
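Here’s a hedged sketch of what asking the REST API for data could look like from Python. The host name, the token value, and the `orders` cube with its `count`, `status`, and `created_at` members are hypothetical. The `/cubejs-api/v1/load` path and the `measures` / `dimensions` / `timeDimensions` query shape follow the REST API’s query format as we understand it, so confirm them against the reference for your Cube version. The snippet only builds the request; it doesn’t send it.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://analytics.example.com"  # hypothetical deployment
API_TOKEN = "HYPOTHETICAL_JWT"              # placeholder, not a real token

# Hypothetical query: daily order counts by status for January 2024.
query = {
    "measures": ["orders.count"],
    "dimensions": ["orders.status"],
    "timeDimensions": [
        {
            "dimension": "orders.created_at",
            "dateRange": ["2024-01-01", "2024-01-31"],
            "granularity": "day",
        }
    ],
}

# The query travels as a URL-encoded JSON string in the `query` parameter.
url = f"{BASE_URL}/cubejs-api/v1/load?{urlencode({'query': json.dumps(query)})}"

# Build (but do not send) the request; the token goes in the Authorization header.
request = Request(url, headers={"Authorization": API_TOKEN})
```

Any presentation layer that can make an HTTP request can issue the same query, which is the point of going API-first: the metric definitions stay on the server, and the client only names them.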
3. Embedded Analytics Tools: Presentation Layers
For internal-facing embedded analytics:
Let’s start with what we’ll call “Traditional BI” software. These platforms have been around for a while and, with their long tenure, have become the dominant players in the space. Again, we’ll preface this rundown with a disclaimer that this is not a complete list.
Traditional BIs are often popular with later-stage companies that have already defined their data needs. Ergo, rather than needing tools for data exploration (more on those later), these data consumers know what questions they need to ask every day; they just need set dashboards with which to continuously monitor the answers (or metrics, KPIs, etc.)
So, in one corner, we have Tableau: an established, heavyweight player in the game. We say ‘heavyweight’ because its very large feature set allows for comprehensive, flexible analytics, which makes it a great tool for technically skilled analysts.
In another corner, we have Microsoft Power BI, which is similar to Tableau in its popularity for building internal dashboards but is possibly even more ubiquitous due to its frequent packaging with other Microsoft Office products. This bundling can make Power BI more cost-accessible than Tableau, although its advanced analytics capabilities are gated behind higher tiers.
Lastly (although absolutely not least), we have ThoughtSpot. ThoughtSpot is a BI platform that, like Power BI and Tableau, offers data visualization. However, it differs in its search-based analytics: you can query it in natural language (rather than SQL) and receive a visualization or answer. This capability helps make analytics accessible to a much wider pool of data consumers, including both a company’s data practitioners and its general business users.
And then there’s another category of potential embedded analytics tools for internal use cases: ‘Interactive Notebooks.’
Interactive notebooks are powerful analytics tools that allow greater flexibility than OOB, self-serve, ‘traditional’ BI. These platforms enable technical data practitioners with more advanced analytics skills to:
- Connect and query multiple data sources
- Analyze information in a flexible and collaborative environment
- Build data apps and documents that are useful to both technical and non-technical data consumers
- Efficiently distribute the artifacts to those who need them
This class of tools is popular among early-stage companies, for which the questions stakeholders are asking aren’t yet set in historical stone. Analysts there are therefore more inclined to explore operational data to find the metrics that are worth tracking continuously.
Let’s talk examples.
Hex is one of the most popular data notebooks out there. It offers streamlined connectivity to data sources and SQL- and Python-based analytics workspaces, making it a fantastic tool for data scientists or analysts to build easily shareable, interactive data apps (here’s our rundown about it.)
Another great notebook is Deepnote, which enables and streamlines real-time collaboration among the data engineers, data scientists, and analysts on your team; consider it the “Google Docs” of data. It’s Jupyter-compatible and runs in the cloud, making it easy to run and integrate (here’s our take on it.)
And as a final example, we have Count. Count allows data consumers to collaborate on building everything from reports and dashboards to process flows and strategic plans. It’s also accessible to non-technical users, thanks to templates, drag-and-drop functionality, and low-code/no-code interfaces.
For customer-facing embedded analytics:
And lastly, it’s time to tackle customer-facing embedded analytics, in which the flexibility to build tailor-made, native UIs reigns supreme.
There are many, many frameworks and libraries to choose from, but we’ll highlight a few.
First, let’s talk about React. React is a widely used front-end library with a rich ecosystem of charting libraries like Recharts and Victory. React's component-based architecture simplifies building reusable charts and integrates seamlessly with other React-based components. React's virtual DOM also enables efficient rendering, resulting in smooth and performant data visualizations. So with all of this, it’s commonly used for building interactive UIs and single-page applications (example).
There’s also D3.js. A visualization library known for its flexibility and data-driven approach, D3.js is a popular choice for building customized and advanced visualizations. It provides a low-level API that gives developers complete control over the visualization process. While it has a steep learning curve, D3.js empowers engineers to create highly expressive and interactive charts (example).
And finally, there’s Highcharts: a powerful charting library with a wide range of chart types and customization options. Highcharts provides interactive and visually appealing charts that can be seamlessly integrated into applications, and it offers extensive documentation and a supportive community. Because it ships with a wide range of pre-built chart types, Highcharts is ideal for quickly creating interactive charts and graphs (example).
Best of both worlds: flexibility and time-to-value.
The important thing to note here is that today's embedded analytics tools stack is not what it was; it’s much better.
Gone are the days of having to choose between inflexible, ‘force-fit’ traditional BI or the resource drains of building custom data experiences from scratch. With a semantic layer, you can simply have both.
You can build native embedded analytics with minimal engineering resources for development and maintenance. And you can ensure that your internal BI dashboards (in all 76 BI apps your company uses) work off the same metrics definitions, caching, and data access controls.
Curious about how others have done it? Check out our case study with RamSoft, which delivered native embedded analytics to users in two weeks with Cube.
And if you want to learn more about how to do that, too, reach out to us. Let’s talk.
Onward and upwards,