Introducing Bluesky: Intelligent Workload Optimization and Cost Governance for the Cloud

Mingsheng Hong

Intelligent Workload Optimization and Cost Governance for Your Data Cloud

Today, I’m excited to share what we have been building at Bluesky, the company I co-founded with Zheng Shao six months ago. We raised $8.8M in seed funding led by Greylock Ventures to accelerate the development of our core technology, deepen our product offerings for Snowflake and other cloud-based data systems, and build out a world-class team. Below is the story of our journey and our vision for the future of intelligent workload optimization and cost governance on modern data clouds.

Bluesky

Our first SaaS offering provides unprecedented visibility into Snowflake workload usage and cost, and delivers actionable insights and workload-specific recommendations for optimization. With Bluesky, Snowflake customers can focus on understanding and deriving value from their data, rather than spending time refining, managing and optimizing their environment. FinOps and data engineering teams get peace of mind from knowing that data teams are following cloud financial best practices and tuning workload performance to get the most from their data cloud environment.

How it Started

Zheng and I have been friends for over 18 years. We have both built careers building best-in-class data and ML infrastructure to support some of the world’s most challenging workloads. We were also both data nerds before data was cool. Most recently I’ve been enjoying learning about and practicing ER modeling, 3NF and cost-based query optimization. 

One of my early career highlights, at Vertica, was demonstrating to Prof. Mike Stonebraker that SQL query auto-tuning could outperform human experts. This was an important source of my technical confidence and optimism behind founding Bluesky. Similarly, Zheng led Uber’s data platform team to save the company $#@! (read: a jaw-dropping) amount of money every year. You don’t get that level of impact without tons of automation.

Towards the end of 2021 we felt ready to do something entrepreneurial with our shared interest in big data, SQL workload optimization and machine learning. We identified an emerging market for cost visibility and workload optimization, applicable to any organization that needs to maximize its investment in platforms such as Snowflake and other modern data clouds. We are excited to be applying our learnings from web-scale companies like Dropbox, Facebook, Google and Uber to the thousands of data-driven organizations out there. We incorporated Bluesky in early 2022, raised a seed round, assembled a world-class founding team and built out our first product. A mere six months later, we have onboarded over a dozen companies that use Snowflake, including fast-growing SMBs and well-known enterprise brands.

Market Opportunity  

Snowflake is famous for pioneering the cloud data warehouse, a scalable “data cloud” with consumption-based pricing. Snowflake has delivered significant business value in large part because of its familiar SQL interface and how easy it is to get a project up and running. However, these same traits also make it hard to manage from a financial perspective. As companies increasingly run workloads in the cloud, it has become significantly more challenging to attribute cost properly and to monitor and optimize those workloads. This is where Bluesky comes in. Our first SaaS product provides organizations with better visibility into their workload changes and automatically recommends ways to optimize performance. In turn, Snowflake customers can concentrate on deriving value from their data rather than spending time managing and optimizing their data cloud environment.

Unique Approach to Optimization 

Our unique approach to workload optimization enables data teams to skip the tedious manual trial and error process of tuning and leverage an intelligent and automated solution to quickly find optimal data layouts and warehouse settings. The combination of three key principles differentiates our approach:

  1. Observe: Give you the visibility to break down cost across individual queries, users and other key dimensions, along with a ranked list of expensive query groups. This is delivered through our key technologies such as profile-driven query cost attribution and pattern-based query grouping.
  2. Optimize: Provide customized recommendations, prioritized by business impact and based on how you use data, along with alerts that have manually configurable thresholds. This is similar to receiving a personalized prescription from a doctor’s visit.
  3. Operationalize: As you gain trust in our technology and also become tired of implementing our tuning ideas, you can onboard our second product (currently under development), where we provide auto-tuning capabilities to continuously improve your workload and business efficiency.

You can learn more about our approach and technology at www.getbluesky.io/technology.
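
To make the Observe principle concrete, here is a minimal sketch of what pattern-based query grouping and cost breakdown could look like. It is not Bluesky's implementation: the log records, the `fingerprint` and `cost_by` helpers, and the credit numbers are all hypothetical and chosen only for illustration. The idea is to mask literals so that queries differing only in their constants fall into one group, then sum cost per group or per user and rank the results.

```python
import re
from collections import defaultdict

# Hypothetical query-log records: (user, sql_text, credits_used).
# Field layout and values are illustrative assumptions, not real Snowflake data.
QUERY_LOG = [
    ("alice", "SELECT * FROM orders WHERE id = 42", 1.5),
    ("bob", "SELECT * FROM orders WHERE id = 7", 2.0),
    ("alice", "SELECT name FROM users WHERE city = 'Paris'", 0.5),
    ("carol", "select name from users where city = 'Rome'", 0.25),
]


def fingerprint(sql: str) -> str:
    """Reduce a SQL statement to a pattern by masking its literals."""
    pattern = sql.strip().lower()
    pattern = re.sub(r"'[^']*'", "?", pattern)          # string literals
    pattern = re.sub(r"\b\d+(\.\d+)?\b", "?", pattern)  # numeric literals
    pattern = re.sub(r"\s+", " ", pattern)              # collapse whitespace
    return pattern


def cost_by(records, key):
    """Sum credits per key and return (key, total) pairs, most expensive first."""
    totals = defaultdict(float)
    for record in records:
        totals[key(record)] += record[2]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Ranked list of expensive query groups, and the same cost broken down by user.
ranked_groups = cost_by(QUERY_LOG, key=lambda r: fingerprint(r[1]))
ranked_users = cost_by(QUERY_LOG, key=lambda r: r[0])

for pattern, credits in ranked_groups:
    print(f"{credits:6.2f}  {pattern}")
```

A production system would more likely parse the SQL than rely on regular expressions, and would attribute cost from execution profiles rather than a single per-query number. The sketch only shows the grouping and ranking idea.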

Bluesky in Action 

Our mission is to give organizations better visibility into their workloads while also optimizing their queries for cost savings and speed. We have already helped more than a dozen companies achieve cost transparency and remove low ROI workloads. Here are three examples of the power of Bluesky. 

  • For one user, we managed to reduce their Snowflake spend by 20% while improving the latency of their query workloads by up to 500x - massive financial and operational improvements.
  • At another company, we observed that a single data cloud user had accidentally spent $10,000 worth of credits running queries that repeatedly timed out, thereby generating zero business value. This was at a company known to have fairly strict cost governance policies otherwise, such as requiring a manager’s approval for a business meal of $100+. The CIOs and CDOs had started losing sleep at night, worrying about escalating costs without a tool to provide sufficient visibility into their workloads. Bluesky quickly brought visibility and intelligence to their data cloud workloads for ongoing query, performance and cost optimization.
  • For a third user, a few weeks into using our SaaS product, one team accidentally launched expensive testing traffic (with low business value) with an annualized cost of ~$300K. Fortunately, this was caught by Bluesky, which led to an immediate takedown of the workload, thereby minimizing the waste.
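
Incidents like the timed-out queries above are the kind of waste that an alert with a configurable threshold can surface. The following is a minimal sketch under stated assumptions, not a description of how Bluesky detects them: the `QueryRun` record, its `status` values, and the `wasted_spend_alerts` helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class QueryRun:
    user: str
    status: str      # "success" or "timeout" (hypothetical status values)
    cost_usd: float


def wasted_spend_alerts(runs, threshold_usd):
    """Return users whose spend on timed-out queries meets the threshold."""
    wasted = {}
    for run in runs:
        if run.status == "timeout":
            wasted[run.user] = wasted.get(run.user, 0.0) + run.cost_usd
    return {user: cost for user, cost in wasted.items() if cost >= threshold_usd}


# Illustrative data only: one user repeatedly times out, another mostly succeeds.
runs = [
    QueryRun("dana", "timeout", 6000.0),
    QueryRun("dana", "timeout", 4000.0),
    QueryRun("eli", "timeout", 50.0),
    QueryRun("eli", "success", 900.0),
]

alerts = wasted_spend_alerts(runs, threshold_usd=1000.0)
for user, cost in alerts.items():
    print(f"ALERT: {user} spent ${cost:,.0f} on queries that timed out")
```

The threshold is passed in as a parameter so that each team can set it according to its own tolerance for wasted spend.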

What’s next

I’m extremely proud of what we’ve accomplished at Bluesky so far. If you are a Snowflake customer, we would love to partner and help you get even more value out of your data cloud. Engage with us on one of these next steps:

  1. Start a product trial / POV. This program is for Snowflake customers with an annual spend of more than $50,000. Contact us for more information. 
  2. Get a data cloud health check. For Snowflake users with an annual spend of no more than $50,000, we offer a free “cost efficiency check” and best practices.
  3. Learn more about our technology

Join our team

Bluesky is a remote-first company with HQ in the Bay Area. We are hiring a diverse team of people passionate about big data, machine learning, modern data clouds and workload optimization. You can learn more about open opportunities here.

Finally, I want to thank our investors and founding team members for their unwavering support. Special thanks to our VCs for their company and support along the way: Greylock, DCF, Foundation Capital, Foothill Ventures, Fellows Fund, Firsthand Ventures, Harpoon, Operator Collective, and Wilson Sonsini. I’m truly grateful to be on this journey with you.

The market for a solution to optimize workloads and govern costs on modern data clouds is growing fast and this is just the start for Bluesky. We are currently working with Snowflake and endeavor to form partnerships with other leading modern data cloud vendors. Stay tuned for future announcements.

The sky’s the limit as we say at Bluesky!