Physical cluster replication is only supported in CockroachDB self-hosted clusters.
CockroachDB physical cluster replication (PCR) continuously sends all data at the byte level from a primary cluster to an independent standby cluster. Existing data and ongoing changes on the active primary cluster, which is serving application data, replicate asynchronously to the passive standby cluster.
In a disaster recovery scenario, you can fail over from the unavailable primary cluster to the standby cluster. This will stop the replication stream, reset the standby cluster to a point in time where all ingested data is consistent, and mark the standby as ready to accept application traffic.
For a list of requirements for PCR, refer to the Before you begin section of the setup tutorial.
Use cases
You can use PCR in a disaster recovery plan to:
- Meet your RTO (Recovery Time Objective) and RPO (Recovery Point Objective) requirements. PCR provides lower RTO and RPO than backup and restore.
- Automatically replicate everything in your primary cluster to recover quickly from a control plane or full cluster failure.
- Protect against region failure when you cannot use individual multi-region clusters — for example, if you have a two-datacenter architecture and do not have access to three regions; or, you need low-write latency in a single region. PCR allows for an active-passive (primary-standby) structure across two clusters with the passive cluster in a different region.
- Avoid conflicts in data after recovery; the replication completes to a transactionally consistent state as of a certain point in time.
Features
- Asynchronous byte-level replication: When you initiate a replication stream, it will replicate byte-for-byte all of the primary cluster's existing user data and associated metadata to the standby cluster asynchronously. From then on, it will continuously replicate the primary cluster's data and metadata to the standby cluster. PCR will automatically replicate changes related to operations such as schema changes, user and privilege modifications, and zone configuration updates without any manual work.
- Transactional consistency: You can fail over to the standby cluster at the `LATEST` timestamp or at a point in time in the past or the future. When the failover process completes, the standby cluster will be in a transactionally consistent state as of the point in time you specified.
- Maintained/improved RPO and RTO: Depending on workload and deployment configuration, replication lag between the primary and standby is generally in the tens-of-seconds range. The failover process from the primary cluster to the standby should typically complete within five minutes when failing over to the latest replicated time using `LATEST`.
- Failover to a timestamp in the past or the future: In the case of logical disasters or mistakes, you can fail over from the primary to the standby cluster at a timestamp in the past. This means that you can return the standby to a timestamp before the mistake was replicated to the standby. You can also configure the `WITH RETENTION` option to control how far in the past you can fail over to. Furthermore, you can plan a failover by specifying a timestamp in the future.
- Monitoring: To monitor the replication's initial progress, current status, and performance, you can use metrics available in the DB Console and Prometheus. For more detail, refer to Physical Cluster Replication Monitoring.
Failing over to a timestamp in the past involves reverting data on the standby cluster. As a result, this type of failover takes longer to complete than failover to the latest replicated time. The increase in failover time correlates with how much data must be reverted on the standby. For more detail, refer to the Technical Overview page for PCR.
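For example, the failover timestamp is supplied when you complete the replication stream from the standby's system virtual cluster. A minimal sketch, assuming the standby's virtual cluster is named `main` (as it is when PCR is enabled):

```sql
-- Fail over to the latest replicated time.
ALTER VIRTUAL CLUSTER main COMPLETE REPLICATION TO LATEST;

-- Or fail over to a timestamp in the past, within the configured retention window.
ALTER VIRTUAL CLUSTER main COMPLETE REPLICATION TO SYSTEM TIME '-1h';
```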
Known limitations
- Physical cluster replication is supported only on CockroachDB self-hosted clusters created on v23.2 or later. PCR cannot be enabled on clusters that have been upgraded from a version of CockroachDB earlier than v23.2.
- Read queries are not supported on the standby cluster before failover.
- The primary and standby clusters must have the same region topology. For example, replicating a multi-region primary cluster to a single-region standby cluster is not supported, and neither is mismatching the regions of a multi-region primary and standby cluster.
- Failing back to the primary cluster after a failover is a manual process. Refer to Fail back to the primary cluster. In addition, after failover, to continue using physical cluster replication, you must configure it again.
- Before failover to the standby, the standby cluster does not support running backups or changefeeds.
- Large data imports, such as those produced by `RESTORE` or `IMPORT INTO`, may dramatically increase replication lag.
- After the failover process for physical cluster replication, scheduled changefeeds will continue on the promoted cluster. You will need to manage pausing or canceling the schedule on the promoted standby cluster to avoid two clusters running the same changefeed to one sink (see the sketch after this list). #123776
- After a failover, there is no mechanism to stop applications from connecting to the original primary cluster. It is necessary to redirect application traffic manually, such as by using a network load balancer or adjusting DNS records.
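As an illustration of the changefeed schedule caveat above, a sketch for pausing a duplicate schedule on the promoted standby; the schedule label is a placeholder for your own:

```sql
-- Find the changefeed schedule's ID on the promoted standby.
SHOW SCHEDULES;

-- Pause it so that only one cluster emits to the sink; the label here is a placeholder.
PAUSE SCHEDULES SELECT id FROM [SHOW SCHEDULES] WHERE label = '{your changefeed schedule label}';
```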
Performance
Cockroach Labs testing has demonstrated the following results for workloads up to the outlined scale:
- Initial data load: 30TB
- 100,000 writes per second
- Replication lag (steady state, no bulk changes): 20–45 seconds
- Failover: 2–5 minutes
Frequent large schema changes or imports may cause a significant spike in replication lag.
Get started
This section is a quick overview of the initial requirements to start a replication stream.
For more comprehensive guides, refer to:
- Technical Overview: to understand PCR in more depth before setup.
- Set Up Physical Cluster Replication: for a tutorial on how to start a replication stream.
- Physical Cluster Replication Monitoring: for detail on metrics and observability into a replication stream.
- Fail Over from a Primary Cluster to a Standby Cluster: for a guide on how to complete a replication stream and fail over to the standby cluster.
Start clusters
Before starting PCR, ensure that the standby cluster is at the same version as, or one version ahead of, the primary cluster. For more details, refer to Cluster versions and upgrades.
To use PCR on clusters, you must initialize the primary and standby CockroachDB clusters with the `--virtualized` and `--virtualized-empty` flags, respectively. This enables cluster virtualization and prepares each cluster for replication.
The active primary cluster that serves application traffic:

`cockroach init ... --virtualized`

The passive standby cluster that will ingest the replicated data:

`cockroach init ... --virtualized-empty`
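For reference, a fuller invocation might look like the following sketch; the host addresses and certificate directory are placeholders for your own deployment:

```shell
# Initialize the primary cluster as a virtualized cluster.
cockroach init --certs-dir=certs --host={address of a primary node} --virtualized

# Initialize the standby cluster as an empty virtualized cluster ready to ingest replicated data.
cockroach init --certs-dir=certs --host={address of a standby node} --virtualized-empty
```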
The node topology of the two clusters does not need to be the same. For example, you can provision the standby cluster with fewer nodes. However, consider that:
- The standby cluster requires enough storage to contain the primary cluster's data.
- If you fail over, the standby will need to handle the full production load. Note, however, that the clusters cannot have different region topologies (refer to Known limitations).
Every node in the standby cluster must be able to make a network connection to every node in the primary cluster to start a replication stream successfully. Refer to Manage the cluster certificates for details.
Connect to the system virtual cluster and virtual cluster
A cluster with PCR enabled is a virtualized cluster; the primary and standby clusters each contain:
- A system virtual cluster, which manages the cluster's control plane and the replication of the cluster's data. Admins connect to the system virtual cluster to configure and manage the underlying CockroachDB cluster, set up PCR, create and manage a virtual cluster, and observe metrics and logs for the CockroachDB cluster and each virtual cluster.
- One or more other virtual clusters, each of which manages its own data plane. Users connect to a virtual cluster by default, rather than to the system virtual cluster; to connect to the system virtual cluster, the connection string must be modified. Virtual clusters contain user data and run application workloads. When PCR is enabled, the non-system virtual cluster on both the primary and standby clusters is named `main`.
To connect to a virtualized cluster using the SQL shell:
For the system virtual cluster, include the `options=-ccluster=system` parameter in the `postgresql` connection URL:

`cockroach sql --url "postgresql://root@{your IP or hostname}:26257?options=-ccluster=system&sslmode=verify-full" --certs-dir "certs"`
For the virtual cluster, include the `options=-ccluster=main` parameter in the `postgresql` connection URL:

`cockroach sql --url "postgresql://root@{your IP or hostname}:26257?options=-ccluster=main&sslmode=verify-full" --certs-dir "certs"`
PCR requires an Enterprise license on the primary and standby clusters. You must set Enterprise licenses from the system virtual cluster.
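For example, connected to the system virtual cluster on each cluster, the license is applied with cluster settings; the organization and key values below are placeholders:

```sql
-- Run from the system virtual cluster on both the primary and the standby.
SET CLUSTER SETTING cluster.organization = '{your organization}';
SET CLUSTER SETTING enterprise.license = '{your license key}';
```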
To connect to the DB Console and view the Physical Cluster Replication dashboard, the user must have the correct privileges. Refer to Create a user for the standby cluster.
Manage replication in the SQL shell
To start, manage, and observe PCR, you can use the following SQL statements:
| Statement | Action |
|---|---|
| `CREATE VIRTUAL CLUSTER ... FROM REPLICATION OF ...` | Start a replication stream. |
| `ALTER VIRTUAL CLUSTER ... PAUSE REPLICATION` | Pause a running replication stream. |
| `ALTER VIRTUAL CLUSTER ... RESUME REPLICATION` | Resume a paused replication stream. |
| `ALTER VIRTUAL CLUSTER ... COMPLETE REPLICATION TO ...` | Initiate a failover. |
| `ALTER VIRTUAL CLUSTER ... START SERVICE SHARED` | Make the standby's virtual cluster ready to serve application traffic after failover. |
| `SHOW VIRTUAL CLUSTER` | Show all virtual clusters. |
| `DROP VIRTUAL CLUSTER` | Remove a virtual cluster. |
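For example, a replication stream is typically started from the standby's system virtual cluster; a sketch, with the primary's connection string left as a placeholder:

```sql
-- Create the standby's virtual cluster as a replica of the primary's virtual cluster.
CREATE VIRTUAL CLUSTER main FROM REPLICATION OF main ON 'postgresql://{connection string to the primary}';

-- Check the stream's status and replicated time.
SHOW VIRTUAL CLUSTER main WITH REPLICATION STATUS;
```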
Cluster versions and upgrades
The standby cluster must be at the same version as, or one version ahead of, the primary's virtual cluster.
When PCR is enabled, upgrade with the following procedure, which upgrades the standby cluster before the primary cluster. Within both the primary and standby CockroachDB clusters, the system virtual cluster must be at a cluster version greater than or equal to that of the virtual cluster:

1. Upgrade the binaries on the primary and standby clusters. Replace the binary on each node of the cluster and restart the node.
2. Finalize the upgrade on the standby's system virtual cluster if auto-finalization is disabled.
3. Finalize the upgrade on the primary's system virtual cluster if auto-finalization is disabled.
4. Finalize the upgrade on the standby's virtual cluster.
5. Finalize the upgrade on the primary's virtual cluster.
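When auto-finalization is disabled, each finalization step above amounts to setting the `version` cluster setting from the relevant virtual cluster, in the order listed; a sketch, with the target version left as a placeholder:

```sql
-- Run from each system virtual cluster first, then from each virtual cluster.
SET CLUSTER SETTING version = '{target version, e.g. 23.2}';
```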
The standby cluster must be at the same version as, or one version ahead of, the primary's virtual cluster at the time of failover.
Demo video
Learn how to use PCR to meet your RTO and RPO requirements with the following demo: