Case Study

Bluesky Helps Notion Improve Snowflake ROI by 25% without Compromising Team Productivity

Summary

Notion is a customizable platform for collaboration, documentation, and project management. Teams love Notion because it is highly flexible and easy to learn. It allows users to build custom pages using a rich library of tools and templates.  

As a result of this flexibility, Notion generates a massive amount of product usage data, which is stored and analyzed in the Snowflake data warehouse. Snowflake gives Notion tremendous flexibility to manage concurrent workloads and deliver timely insights. As the business grows, Notion can easily provision additional Snowflake resources. To identify opportunities for workload optimization and ongoing cost governance, Notion has also partnered with Bluesky.

Challenges

  1. Concurrent Workloads: Notion’s analytics workloads are distributed across several Snowflake warehouses to reduce contention. This distribution helps query performance, but some large warehouses are underutilized.
  2. Business Priorities: The Notion team is small and agile, and the business has grown quickly, so the team needs to focus on building rather than on optimization.
  3. Risk Management: Notion runs a combination of scheduled ELT jobs, reporting, and ad hoc queries. These analytics help Notion understand how users interact with their product, which is crucial to their continued success. Workload optimization efforts cannot reduce the availability of key business metrics.
  4. Team Productivity: Because code changes take engineering time, it’s important to prioritize queries that can be updated quickly and still deliver meaningful cost and performance gains.

Solution

In only two weeks, Bluesky delivered an immediate cost reduction of 15% by tuning Notion’s warehouse settings. By analyzing the frequency and patterns of Snowflake activity, Bluesky identified the optimum warehouse size for each workload and recommended efficient auto-suspend and auto-clustering settings.

These changes were implemented without significant time from the Notion team.
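To illustrate why the auto-suspend setting matters, here is a minimal sketch that estimates billed warehouse time for a given idle timeout. It is not Bluesky's method; the function name `billed_seconds` and the input format are hypothetical. It relies on one known Snowflake billing rule (per-second billing with a 60-second minimum each time a warehouse resumes) and assumes the warehouse suspends exactly when the idle timeout elapses.

```python
from typing import List, Tuple

# Snowflake bills at least 60 seconds each time a warehouse resumes.
MIN_BILL_SECONDS = 60


def billed_seconds(queries: List[Tuple[int, int]], auto_suspend: int) -> int:
    """Estimate billed seconds for one warehouse (hypothetical helper).

    queries: (start, end) times in seconds, assumed sorted by start time.
    auto_suspend: idle seconds before the warehouse suspends.
    """
    total = 0
    session_start = None  # when the warehouse last resumed
    busy_until = 0        # end of the latest query in the current session
    for start, end in queries:
        if session_start is None:
            session_start, busy_until = start, end
        elif start <= busy_until + auto_suspend:
            # The warehouse is still running or idling when this query arrives.
            busy_until = max(busy_until, end)
        else:
            # The warehouse suspended after the idle timeout; bill that session.
            total += max(busy_until + auto_suspend - session_start, MIN_BILL_SECONDS)
            session_start, busy_until = start, end
    if session_start is not None:
        total += max(busy_until + auto_suspend - session_start, MIN_BILL_SECONDS)
    return total


# Illustrative workload: three 30-second queries with uneven gaps.
workload = [(0, 30), (100, 130), (1000, 1030)]
for timeout in (10, 60, 600):
    print(timeout, billed_seconds(workload, timeout))
```

For sparse workloads like this illustrative one, a long timeout keeps the warehouse billing through idle gaps, while a very short one can trigger the 60-second minimum on every resume and lose warm caches, so the right value depends on the observed query pattern.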

Bluesky also analyzed storage costs to find unused and redundant data tables, plus opportunities to reduce the cost of table backups.

Over the next four weeks, Bluesky monitored query activity to uncover an additional ~10% of savings. Bluesky found the biggest opportunities for cost reduction in redundant query patterns and inefficient joins. As a result of the changes, many of Notion’s long-running queries were optimized to run ~300% faster.

The Bluesky recommendations included:

  • Materializing subqueries used by multiple long-running queries.
  • Reducing spillage by adjusting the run schedule and/or warehouse size for the most expensive queries.
  • Determining which warehouse tables could be updated incrementally to reduce the daily processing time.
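The incremental-update idea in the last recommendation can be sketched as follows. This is an illustration only, not Notion's pipeline: the event schema, the `incremental_update` function, and the use of an increasing event ID as a watermark are all assumptions. The point is that each run folds in only rows newer than the last processed position instead of rebuilding the whole table.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical schema: (event_id, page_type), with event_id always increasing.
Event = Tuple[int, str]


def incremental_update(counts: Counter, watermark: int, events: Iterable[Event]) -> int:
    """Fold only events newer than the watermark into counts.

    Returns the new watermark to store for the next run.
    """
    new_watermark = watermark
    for event_id, page_type in events:
        if event_id > watermark:
            counts[page_type] += 1
            new_watermark = max(new_watermark, event_id)
    return new_watermark


# First run processes everything; the second run only picks up event 3.
counts: Counter = Counter()
wm = incremental_update(counts, 0, [(1, "doc"), (2, "wiki")])
wm = incremental_update(counts, wm, [(1, "doc"), (2, "wiki"), (3, "doc")])
print(wm, dict(counts))
```

In a real warehouse the `event_id > watermark` filter would be applied in the query itself so that older rows are never scanned; the loop here only demonstrates the bookkeeping.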

Bluesky provided not only the findings but also rankings to determine which recommendations would have the biggest impact with the lowest effort and risk. Many changes were automatically applied, but more complex query changes were tested and coordinated to avoid potential risks.
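One simple way to think about ranking by impact, effort, and risk is a score that rewards savings and penalizes effort and risk. The sketch below is a hypothetical scoring scheme, not Bluesky's actual ranking; the `Recommendation` class, the 1 to 5 scales, and all numbers are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Recommendation:
    name: str
    monthly_savings: float  # estimated savings, arbitrary currency unit
    effort: int             # 1 (low) to 5 (high), assumed scale
    risk: int               # 1 (low) to 5 (high), assumed scale

    @property
    def score(self) -> float:
        # Higher savings raise the score; effort and risk lower it.
        return self.monthly_savings / (self.effort * self.risk)


def rank(recs: List[Recommendation]) -> List[Recommendation]:
    """Return recommendations ordered from best to worst score."""
    return sorted(recs, key=lambda r: r.score, reverse=True)


# Illustrative values only.
recs = [
    Recommendation("rewrite inefficient join", 900, 3, 3),
    Recommendation("shorten auto-suspend", 400, 1, 1),
    Recommendation("materialize shared subquery", 600, 2, 2),
]
for rec in rank(recs):
    print(f"{rec.name}: {rec.score:.0f}")
```

Under this scheme a low-effort, low-risk settings change can outrank a larger but riskier query rewrite, which matches the order in which the changes described above were rolled out.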

“Bluesky not only helps us get significantly more value out of Snowflake but also allows our engineering team to stay focused on business-critical projects. Our team doesn’t have the bandwidth to dedicate resources to cost and efficiency initiatives. Instead, we rely on Bluesky to optimize our workloads and provide continuous workload governance. Bluesky’s Snowflake Copilot product leads to an even better Snowflake user experience.” – XZ Tie, Data Platform Eng Lead at Notion

The Future

Snowflake provides a dynamic data warehouse platform to support Notion’s growing business. With unlimited and flexible scaling, Notion’s team can quickly provide actionable insights from the data warehouse. 

By adding Bluesky to the stack, Notion can focus on building new pipelines and dashboards, with confidence that warehouse and query performance are being monitored and optimized.

As Notion’s data volume increases and analytic requirements evolve, Bluesky will provide ongoing cost governance and continue to evaluate storage, warehouse settings, and query performance. Bluesky will monitor Snowflake workloads, and Notion can configure notifications to alert on potential concerns in the warehouse.

Bluesky functions as a Snowflake co-pilot. As conditions in the workload change, Notion can count on Bluesky to discover and resolve inefficiencies before they incur unnecessary costs.
