Plan a CockroachDB Advanced Cluster

This page describes how to plan your CockroachDB Advanced cluster.

Planning your cluster

Before making any changes to your cluster's configuration, review the requirements and recommendations for CockroachDB Advanced clusters.

Cluster configuration

For single-region production deployments, Cockroach Labs recommends a minimum of 3 nodes. The number of nodes you choose also affects your storage capacity and performance. See the Example for more information.

Some of a CockroachDB Advanced cluster's provisioned memory capacity is used for system overhead factors such as filesystem cache and sidecars, so the full amount of memory may not be available to the cluster's workloads.

CockroachDB Advanced clusters use three Availability Zones (AZs). For balanced data distribution and best performance, we recommend using a number of nodes that is a multiple of 3 (for example, 3, 6, or 9 nodes per region).

Note:

You cannot scale a multi-node cluster down to a single-node cluster. If you need to scale down to a single-node cluster, back up your cluster and restore it into a new single-node cluster.

Advanced security features

If your cluster needs access to all features required for PCI DSS readiness, you should configure advanced security features while creating your CockroachDB Advanced cluster.

Multi-region clusters

Multi-region CockroachDB Advanced clusters must contain at least three regions to ensure that data replicated across regions can survive the loss of one region. This applies, for example, to internal system data that is important for overall cluster operations, as well as to tables with the GLOBAL table locality or with the REGIONAL BY TABLE table locality and REGION survival goal.

Each region of a multi-region cluster must contain at least 3 nodes to ensure that data located entirely in a region can survive the loss of one node in that region. This applies, for example, to tables with the REGIONAL BY ROW table locality. For best performance and stability, we recommend using the same number of nodes in each region of your cluster.

You can configure a maximum of 9 regions per cluster through the Console. If you need to add more regions, contact your Cockroach Labs account team.
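
The table localities and survival goal mentioned above are configured with SQL once the cluster's regions exist. The following is a minimal sketch, assuming a hypothetical multi-region database named movr; the database, table, and region names are illustrative, and you can run SHOW REGIONS to list the regions actually available to your cluster.

```sql
-- Define the database's regions (names are illustrative; use SHOW REGIONS
-- to see the regions available to your cluster).
ALTER DATABASE movr SET PRIMARY REGION "gcp-us-east1";
ALTER DATABASE movr ADD REGION "gcp-us-west1";
ALTER DATABASE movr ADD REGION "gcp-europe-west1";

-- With at least three regions, the database can tolerate losing one region.
ALTER DATABASE movr SURVIVE REGION FAILURE;

-- Table localities referenced above (table names are hypothetical).
ALTER TABLE movr.promo_codes SET LOCALITY GLOBAL;
ALTER TABLE movr.users SET LOCALITY REGIONAL BY ROW;
```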

Cluster scaling

When scaling up your cluster, it is generally more effective to increase node size up to 16 vCPUs before adding more nodes. For example, if you have a 3-node cluster with 4 vCPUs per node, consider scaling up to 8 vCPUs per node before adding a fourth node. For most production applications, we recommend a minimum of 8 vCPUs per node. New CockroachDB Cloud organizations created after September 26, 2024 can no longer scale a cluster down to fewer than 4 vCPUs per node.

We recommend adding or removing nodes during a time when the cluster isn't experiencing heavy traffic. Adding or removing nodes incurs a non-trivial amount of load on the cluster and takes about 30 minutes per region, so changing the cluster configuration during times of heavy traffic can result in degraded application performance or longer times to apply node modifications.

Before removing nodes from a cluster, ensure that the reduced disk space will be sufficient for the existing and anticipated data. If you remove regions from a cluster, clients in those regions will experience higher latency when accessing the cluster.

If you have changed the replication factor for a cluster, you might not be able to remove nodes from the cluster. For example, suppose you have a 5-node cluster and you previously changed the replication factor from its default value of 3 to 5. If you then attempt to scale the cluster down to 3 nodes, the decommission operation might fail. To remove nodes successfully in this example, change the replication factor back to 3 before removing the nodes.
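
As a sketch of the recovery path described above, you can inspect and reset the default replication factor in SQL before decommissioning nodes. The statements below assume a cluster whose default zone configuration was previously changed to 5 replicas.

```sql
-- Inspect the current default zone configuration, including num_replicas.
SHOW ZONE CONFIGURATION FROM RANGE default;

-- Restore the default replication factor of 3 before removing nodes.
ALTER RANGE default CONFIGURE ZONE USING num_replicas = 3;
```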

Example

Let's say you want to create a cluster to connect with an application that runs on Google Cloud Platform (GCP) in the us-east1 region and requires 2000 transactions per second (TPS).

Suppose the raw amount of data you expect to store, before replication, is 500 GiB. At 40% compression, you can expect savings of 200 GiB, leaving 300 GiB of data to store.

Assume a storage buffer of 50% to account for overhead and data growth. The amount of data you need to provision for is now 450 GiB.

With the default replication factor of 3, the total amount of data stored is (3 * 450 GiB) = 1350 GiB.

To determine the number of nodes and the hardware configuration to store 1350 GiB of data, refer to the table in Create Your Cluster. One way to reach a 1350 GiB storage capacity is 3 nodes with 480 GiB per node, which gives you a capacity of (3 * 480 GiB) = 1440 GiB.

To meet your performance requirement of 2000 TPS, consider a configuration of 3 nodes with 4 vCPUs per node. This configuration has (3 * 4 vCPUs) = 12 vCPUs. Each vCPU can handle around 1000 TPS, so this configuration provides around 12000 TPS, which exceeds your performance requirement.
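
The sizing arithmetic above can be reproduced in any SQL shell, including cockroach sql. The figures used here (40% compression, a 50% storage buffer, and roughly 1000 TPS per vCPU) are planning assumptions from this example, not guarantees.

```sql
-- Worked sizing example: 500 GiB raw data, 40% compression, 50% buffer,
-- replication factor 3, and 3 nodes with 480 GiB and 4 vCPUs each.
SELECT
  500 * (1 - 0.40)           AS compressed_gib,   -- 300 GiB after compression
  500 * (1 - 0.40) * 1.5     AS buffered_gib,     -- 450 GiB with 50% buffer
  500 * (1 - 0.40) * 1.5 * 3 AS replicated_gib,   -- 1350 GiB at replication factor 3
  3 * 480                    AS provisioned_gib,  -- 1440 GiB across 3 nodes
  3 * 4 * 1000               AS approximate_tps;  -- ~12000 TPS from 12 vCPUs
```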

Your final configuration is as follows:

Component          Selection
Cloud provider     GCP
Region             us-east1
Number of nodes    3
Compute            4 vCPUs per node
Storage            480 GiB per node
