Unlocking Real-time Analytics: Exploring Drizzly's Streaming Platform

Table of Contents

  1. Introduction
  2. Analytics Organization at Drizzly
  3. Batch Infrastructure
  4. Real-time Infrastructure and the Need for Real-time
  5. Challenges in Implementing Real-time Infrastructure
  6. The Role of KSQL and DBT in Solving Real-time Challenges
  7. Overview of Drizzly's Infrastructure
  8. Materialize: The Real-time SQL Solution
  9. Integration of DBT with Real-time Analytics
  10. Next Steps: Bringing Tecton into the Stack
  11. Future Possibilities with Real-time Analytics

📈 Introduction

In today's article, we will explore the architecture of Drizzly's streaming platform and delve into the various tools and technologies used. The focus will be on the analytics, batch, and real-time infrastructures, with an emphasis on the challenges faced and the solutions implemented. Let's dive in and uncover the intricacies of Drizzly's data platform!

🔬 Analytics Organization at Drizzly

Drizzly's analytics organization is structured into various teams, including data infra, business intelligence, marketing analytics, and data science. While these teams work closely together, it is important to note that the data engineering team, responsible for setting up the infrastructure, operates independently. This approach allows anyone at Drizzly to access the data platform, which includes data pipelines, business intelligence tools, batch infrastructure, data science, machine learning, and real-time infrastructure.

🚀 Batch Infrastructure

In line with modern data stack principles, Drizzly has been actively building its batch infrastructure over the past two years. The data warehouse of choice is Snowflake, known for its scalability and performance. To manage all data transformations within Snowflake, Drizzly utilizes DBT (Data Build Tool). DBT enables the team to write SQL queries and apply logic to transform and manipulate the data effectively. With the batch infrastructure in place, Drizzly can leverage historical data for various analytical purposes.

⏲️ Real-time Infrastructure and the Need for Real-time

The true power of real-time infrastructure lies in its ability to provide timely and relevant information to users. Consider a scenario where a Drizzly user adds items to their cart but forgets to check out. Currently, Drizzly relies on a 24-hour cadence for reminders, which may not be effective because users' needs can change within that time frame. With the new real-time infrastructure, Drizzly aims to send notifications to users who haven't checked out within 30 minutes of adding items to their cart, with the goal of a higher conversion rate. However, implementing real-time functionality poses complex challenges.

❓ Challenges in Implementing Real-time Infrastructure

One of the main challenges in real-time infrastructure is triggering actions based on the absence of an event. For instance, Drizzly needs to identify cases where a user adds items to their cart but does not proceed to checkout within 30 minutes. Initially, the team explored using KSQL, a streaming SQL engine, along with Confluent Cloud, but limitations with joins made it unsuitable. Fortunately, Drizzly was already utilizing DBT and was introduced to Materialize, a real-time SQL solution. This combination offered a more feasible approach to solving complex real-time challenges.
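The absence-of-event condition can be illustrated outside any streaming engine. The following is a minimal Python sketch, not Drizzly's implementation: the event shape (`user_id`, `kind`, `ts`), the event names, and the function `find_abandoned_carts` are all assumptions made for illustration. It treats a cart as abandoned when a user's latest "add to cart" event is at least 30 minutes old and no "checkout" event follows it.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Event(NamedTuple):
    user_id: str
    kind: str       # "add_to_cart" or "checkout" (hypothetical event names)
    ts: datetime


def find_abandoned_carts(events, now, window=timedelta(minutes=30)):
    """Return sorted user_ids whose latest add_to_cart is at least `window`
    old and is not followed by a checkout."""
    last_add = {}
    last_checkout = {}
    for e in events:
        # Track the most recent timestamp per user for each event kind.
        if e.kind == "add_to_cart":
            last_add[e.user_id] = max(e.ts, last_add.get(e.user_id, e.ts))
        elif e.kind == "checkout":
            last_checkout[e.user_id] = max(e.ts, last_checkout.get(e.user_id, e.ts))

    abandoned = []
    for user_id, added_at in last_add.items():
        checked_out_at = last_checkout.get(user_id)
        has_checkout = checked_out_at is not None and checked_out_at >= added_at
        if not has_checkout and now - added_at >= window:
            abandoned.append(user_id)
    return sorted(abandoned)


t0 = datetime(2024, 1, 1, 12, 0)
events = [
    Event("alice", "add_to_cart", t0),
    Event("bob", "add_to_cart", t0),
    Event("bob", "checkout", t0 + timedelta(minutes=10)),
    Event("carol", "add_to_cart", t0 + timedelta(minutes=20)),
]
print(find_abandoned_carts(events, now=t0 + timedelta(minutes=35)))  # ['alice']
```

Note that the result depends on `now` as well as on the events: a cart becomes abandoned simply because time passes, with no new event arriving. That is the part a streaming engine has to handle, and it is what makes this kind of query harder than an ordinary join.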

🏗️ The Role of KSQL and DBT in Solving Real-time Challenges

Drizzly's infrastructure consists of Confluent Cloud for managing schema registries and Kafka topics. For the abandoned cart use case, events such as "add to cart" and "checkout" are sent to Kafka topics. These topics are then read into source objects and materialized through Materialize. DBT is used to manage materialized views, which involve SQL queries to transform and manipulate the data, enabling the identification of abandoned carts. The processed data is then sent to outbound services for notification purposes.
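The flow described above (events arrive on topics, a view is kept up to date, results go to an outbound service) can be mimicked in plain Python. This is a toy sketch under stated assumptions: `CartView`, `apply`, and `due_notifications` are hypothetical names, an in-memory dictionary stands in for the materialized view, events are assumed to arrive in timestamp order, and nothing here uses the Kafka, Materialize, or DBT APIs.

```python
from datetime import datetime, timedelta


class CartView:
    """Toy stand-in for a continuously maintained view of open carts."""

    def __init__(self, window=timedelta(minutes=30)):
        self.window = window
        self.open_carts = {}    # user_id -> time of latest add_to_cart
        self.notified = set()   # users already handed to the outbound service

    def apply(self, user_id, kind, ts):
        """Update the view incrementally as each event arrives."""
        if kind == "add_to_cart":
            self.open_carts[user_id] = ts
            self.notified.discard(user_id)  # a new add restarts the timer
        elif kind == "checkout":
            self.open_carts.pop(user_id, None)
            self.notified.discard(user_id)

    def due_notifications(self, now):
        """Return users whose cart has crossed the window, each at most once."""
        due = sorted(
            user_id
            for user_id, added_at in self.open_carts.items()
            if now - added_at >= self.window and user_id not in self.notified
        )
        self.notified.update(due)
        return due


view = CartView()
t0 = datetime(2024, 1, 1, 12, 0)
view.apply("alice", "add_to_cart", t0)
view.apply("bob", "add_to_cart", t0)
view.apply("bob", "checkout", t0 + timedelta(minutes=5))
print(view.due_notifications(t0 + timedelta(minutes=31)))  # ['alice']
print(view.due_notifications(t0 + timedelta(minutes=40)))  # []
```

The design choice worth noticing is that state is updated per event rather than recomputed from history, and that each abandoned cart is emitted only once. A real deployment would delegate both concerns to the streaming SQL layer instead of application code.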

🌐 Overview of Drizzly's Infrastructure

Drizzly's infrastructure comprises Snowflake, Confluent Cloud, Kafka topics, Materialize, and DBT. This combination allows the team to seamlessly handle batch and real-time processing, ensuring a holistic data platform. Additionally, Drizzly appreciates the compatibility between DBT and Materialize, as it allows for centralized logic and the use of familiar tools for both batch and real-time data processing requirements.

🚀 Materialize: The Real-time SQL Solution

Materialize empowers Drizzly's real-time infrastructure by enabling the creation of real-time SQL queries. The familiarity of SQL, coupled with Materialize's efficiency and compatibility with Snowflake, simplifies the transition to real-time analytics. Leveraging DBT's capabilities, analysts and data scientists at Drizzly can easily adopt real-time analytics, as they can continue working with the same standards and practices already established in the batch environment. The deployment of real-time models becomes as simple as writing SQL queries, further streamlining the process.

🤝 Integration of DBT with Real-time Analytics

One of the key benefits of Materialize is its compatibility with DBT, which allows for a unified approach to analytics across both batch and real-time processing. Analysts and data scientists who are already proficient with DBT can seamlessly transition to real-time analytics, utilizing their existing skills and knowledge. This integration also facilitates event-driven workflows, either by connecting applications directly to Materialize or by using sink objects to send processed data to outbound services.

📝 Next Steps: Bringing Tecton into the Stack

Drizzly is excited to incorporate Tecton, a feature store platform, into its stack. While the abandoned cart use case does not require Tecton currently, there are other use cases on the horizon. For example, Drizzly aims to leverage machine learning models to predict the top items a user may be interested in based on their browsing behavior. These predictions can then be used to send personalized notifications and recommendations, enhancing the user experience and driving conversions. Tecton's integration will enable Drizzly to scale and operationalize these data science models effectively.

🚀 Future Possibilities with Real-time Analytics

Looking ahead, Drizzly envisions a V2 environment where real-time analytics plays a more prominent role. While V1 focuses on limited personalization, V2 aims to introduce predictive capabilities for browsing behavior and personalized recommendations. Drizzly believes that real-time analytics can directly influence dynamic experiences on its platform, such as triggering quizzes to gather user preferences, delivering tailored recommendations, and ultimately enhancing the overall user journey. The synergy between batch and real-time analytics, along with the adoption of advanced technologies, opens up a world of possibilities for Drizzly and its users.

🌟 Highlights

  • Drizzly's analytics organization encompasses various teams, including data infra, business intelligence, marketing analytics, and data science.
  • The batch infrastructure at Drizzly leverages Snowflake as the data warehouse and utilizes DBT for managing data transformations.
  • Real-time infrastructure at Drizzly aims to provide timely and relevant information to users, driving higher conversion rates.
  • Implementing real-time functionalities poses challenges, such as triggering actions based on the absence of events.
  • Materialize, in conjunction with DBT, enables Drizzly to overcome these challenges and effectively implement real-time analytics.
  • Drizzly's infrastructure comprises Snowflake, Confluent Cloud, Kafka topics, Materialize, and DBT, allowing for seamless batch and real-time processing.
  • Materialize empowers Drizzly's real-time infrastructure by leveraging SQL queries and providing compatibility with Snowflake and DBT.
  • The integration of DBT with real-time analytics facilitates a unified approach and streamlines the adoption of real-time capabilities.
  • Tecton, a feature store platform, will be incorporated into Drizzly's stack to enable scalability and operationalization of machine learning models.
  • The future of real-time analytics at Drizzly involves enhanced personalization, predictive capabilities, and dynamic user experiences.
