Single-Cluster vs Multi-Cluster Warehouse in Snowflake

Kajal Raut
Feb 13, 2022


In this post, we will look at:

  1. What are the different types of warehouse infrastructures in Snowflake?
  2. How are they different?

Let's start.

You can think of a warehouse as a server that does some work for you; in Snowflake, that work is executing queries.

There are two types of warehouse infrastructures.

(Note: We are not talking about warehouse sizes such as XS, S, M, L, XL, etc.)

  1. Single-Cluster warehouses

By default, a virtual warehouse consists of a single cluster of compute resources. Only that one cluster is available to execute the queries submitted to the warehouse.

Because there is only one cluster, queries can start to queue during peak hours when the cluster does not have enough resources to execute everything submitted to the warehouse.

To solve this concurrency problem, we can use a multi-cluster warehouse.

  2. Multi-Cluster warehouses

A multi-cluster warehouse, as the name suggests, has multiple clusters executing the queries. It is made up of one or more clusters, all of the same size.

Two important properties of a multi-cluster warehouse are:

  • Maximum number of clusters, greater than 1 (up to 10).
  • Minimum number of clusters, equal to or less than the maximum (up to 10).
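As a rough sketch, the two properties above correspond to the `MIN_CLUSTER_COUNT` and `MAX_CLUSTER_COUNT` parameters of Snowflake's `CREATE WAREHOUSE` command. The Python helper below only builds the SQL text and does not connect to Snowflake. The function name, the warehouse name `reporting_wh`, and the chosen size are made up for illustration, and the `limit` default simply mirrors the "up to 10" figure quoted above.

```python
def build_create_warehouse_sql(name, size, min_clusters, max_clusters, limit=10):
    """Return a CREATE WAREHOUSE statement for a multi-cluster warehouse.

    `limit` mirrors the "up to 10" figure quoted in the text; it is an
    assumption of this sketch, not a value read from Snowflake.
    """
    # The minimum must be at least 1 and cannot exceed the maximum.
    if not 1 <= min_clusters <= max_clusters <= limit:
        raise ValueError(
            f"need 1 <= min ({min_clusters}) <= max ({max_clusters}) <= {limit}"
        )
    return (
        f"CREATE WAREHOUSE IF NOT EXISTS {name} "
        f"WITH WAREHOUSE_SIZE = '{size}' "
        f"MIN_CLUSTER_COUNT = {min_clusters} "
        f"MAX_CLUSTER_COUNT = {max_clusters}"
    )


# Hypothetical warehouse that may grow from 1 to 3 clusters.
sql = build_create_warehouse_sql("reporting_wh", "MEDIUM", 1, 3)
print(sql)
```

Validating the pair before building the statement catches a minimum larger than the maximum early, rather than leaving it to the database to reject.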

There are two modes to run a multi-cluster warehouse:

  1. Maximized

This mode is enabled when the same value is specified for both the maximum and minimum number of clusters (larger than 1). In this mode, when the multi-cluster warehouse is started, Snowflake starts all the clusters so that maximum resources are available.

It’s useful if you have a large number of concurrent user sessions and/or queries, and the numbers do not fluctuate significantly.

  2. Auto-scale

This mode is enabled when different values are specified for the maximum and minimum number of clusters. In this mode, Snowflake starts and stops clusters as needed to dynamically manage the load on the multi-cluster warehouse:

  • Snowflake automatically starts additional clusters as queries start to queue due to insufficient resources, up to the maximum number defined for the multi-cluster warehouse.
  • Similarly, as the load on the multi-cluster warehouse decreases, Snowflake automatically shuts down clusters to reduce the number of running clusters.
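The rule that decides which mode a warehouse runs in depends only on the minimum and maximum values, so it can be written down in a few lines. The function below is a hypothetical helper for illustration, not part of any Snowflake client library. It assumes the minimum and maximum are the two properties described earlier.

```python
def warehouse_mode(min_clusters, max_clusters):
    """Classify a warehouse by its minimum and maximum counts."""
    if min_clusters < 1 or min_clusters > max_clusters:
        raise ValueError("need 1 <= min_clusters <= max_clusters")
    if max_clusters == 1:
        # Only one cluster can ever run: the default warehouse type.
        return "single-cluster"
    if min_clusters == max_clusters:
        # Same value for both (larger than 1): everything starts at once.
        return "maximized"
    # Different values: Snowflake starts and stops clusters with the load.
    return "auto-scale"


modes = [warehouse_mode(1, 1), warehouse_mode(3, 3), warehouse_mode(1, 4)]
print(modes)
```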

Two scaling policies are available in Auto-scale mode for multi-cluster warehouses.

These policies help control the credits consumed by a multi-cluster warehouse running in Auto-scale mode. They do not apply in Maximized mode, because all clusters run concurrently and there is no need to start or shut down individual clusters.

A brief introduction to the scaling policies:

  1. Standard (default): Prevents/minimizes queuing by favoring starting additional clusters over conserving credits.
  2. Economy: Conserves credits by favoring keeping running clusters fully loaded rather than starting additional clusters, which may result in queries being queued and taking longer to complete.
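The policy is set through the `SCALING_POLICY` parameter of the warehouse. As a minimal sketch, the helper below builds an `ALTER WAREHOUSE` statement that switches policies. The function name and the warehouse name `reporting_wh` are assumptions for illustration, and the code only produces SQL text.

```python
VALID_POLICIES = ("STANDARD", "ECONOMY")


def build_set_scaling_policy_sql(name, policy):
    """Return an ALTER WAREHOUSE statement that sets the scaling policy."""
    policy = policy.upper()
    if policy not in VALID_POLICIES:
        raise ValueError(f"policy must be one of {VALID_POLICIES}, got {policy!r}")
    return f"ALTER WAREHOUSE {name} SET SCALING_POLICY = '{policy}'"


# Hypothetical warehouse switched to the credit-conserving policy.
policy_sql = build_set_scaling_policy_sql("reporting_wh", "economy")
print(policy_sql)
```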

Read more about scaling policies in the Snowflake documentation.

Note: Multi-cluster warehouses are best utilized for scaling resources to improve concurrency for users/queries. They are not as beneficial for improving the performance of slow-running queries or data loading. For these types of operations, resizing the warehouse provides more benefits.
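The distinction in the note can be sketched as a choice between two `ALTER WAREHOUSE` statements: raise `MAX_CLUSTER_COUNT` when queries are queuing, or change `WAREHOUSE_SIZE` when individual queries are slow. The helper, the symptom labels, and the warehouse name below are hypothetical and only illustrate that choice.

```python
def tuning_statement(name, symptom, target):
    """Return a SQL statement matching the symptom.

    symptom == "queued_queries": target is the new maximum count (int).
    symptom == "slow_queries":   target is the new size, e.g. "LARGE".
    """
    if symptom == "queued_queries":
        # Concurrency problem: allow more clusters to run.
        return f"ALTER WAREHOUSE {name} SET MAX_CLUSTER_COUNT = {int(target)}"
    if symptom == "slow_queries":
        # Performance problem: resize the warehouse instead.
        return f"ALTER WAREHOUSE {name} SET WAREHOUSE_SIZE = '{target}'"
    raise ValueError(f"unknown symptom: {symptom!r}")


scale_out = tuning_statement("reporting_wh", "queued_queries", 4)
scale_up = tuning_statement("reporting_wh", "slow_queries", "LARGE")
print(scale_out)
print(scale_up)
```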

That’s all for this post.

