Scale to Multiple Regions

On this page Carat arrow pointing down

This page provides guidance for scaling a single-region application to multiple regions.

Before reading this page, we recommend reviewing CockroachDB's multi-region capabilities.

Overview

Scaling an application from a single region to multiple regions consists of:

  • Scaling the database, which includes adding new nodes to a CockroachDB cluster in different regions, adding regions to the database schema, and optionally transforming the database schema to leverage multi-region table localities.

  • Scaling the application, which includes deploying the application in new regions and, if necessary, updating the application code to work with the multi-region database schema.

Scale the database

Step 1. Prep the database

Use an ALTER DATABASE ... SET PRIMARY REGION statement to set the database's primary region to a region in which the cluster is deployed. This region must have been specified as a regional locality at cluster startup.

Setting the primary region before adding new regional nodes to the cluster prevents CockroachDB from rebalancing row replications across all regions each time a node is added in a new region.

Note:

Executing ALTER statements performs a schema migration on the cluster. If you are using a schema migration tool, you will need to execute these statements as raw SQL, as the multi-region SQL syntax is specific to CockroachDB.

Here are some simple tutorials on executing schema migrations against CockroachDB clusters:

Step 2. Scale the cluster deployment

Scale the cluster by adding nodes to the cluster in new regions.

For instructions on adding nodes to an existing cluster, see one of the following pages:

Note:

For orchestrated and manual deployments, you must specify a regional locality for each node at startup. These regional localities are represented as cluster regions in the cluster.

Step 3. Scale the database schema

Use an ALTER DATABASE ... ADD REGIONS statement to add the new regions to your database. Only cluster regions (i.e., regional localities specified at cluster startup) can be added as database regions.

After you add new regions to the database schema, you can optionally configure the survival goals and table localities of the multi-region database:

Scale the application

Step 1. Scale application deployments

Scaling application deployments in multiple regions can greatly improve latency for the end-user of the application.

For guidance on connecting to CockroachDB from an application deployment, see one of the following pages:

To limit the latency between the application and the database, each deployment of the application should communicate with the closest database deployment. For details on configuring database connections for individual application deployments, consult your cloud provider's documentation. For an example using Google Cloud services, see Multi-region Application Deployment.

Note:

A multi-region application deployment does not require a multi-region database deployment. Deploying a global application in multiple regions can yield significant latency benefits for the end user, even if you have not yet scaled your database in multiple regions. For an example, see Reducing Multi-Region Latency with Follower Reads.

If you do scale the application first, make sure that you reconfigure each application deployment to communicate with the closest database deployment after deploying the database in multiple regions.

Step 2. (Optional) Update the application code for multi-region

For most table localities, including the default locality LOCALITY REGIONAL BY TABLE IN PRIMARY REGION, you do not need to update your application code after migrating your database schema for multi-region. CockroachDB automatically optimizes queries against multi-region databases, based on the regional locality of the node executing the query, and on the multi-region configuration of the database. For more details, see Regional Tables. For an extended example, see Develop and Deploy a Global Application: Create a Multi-region Database Schema.

However, there are some scenarios in which you might need to update the SQL operations in your application. For example:

In all of these scenarios, statements reference the column that tracks the region for each row in a REGIONAL BY ROW locality. This column can be a custom column of the built-in ENUM type crdb_internal_region, or it can be the default, hidden crdb_region column.

If you need to explicitly reference the region-tracking column in a SQL operation in your application code, you should do the following:

  • Verify that the region-tracking column is visible to the ORM.

    To make a hidden column visible, use an ALTER TABLE ... ALTER COLUMN ... SET VISIBLE statement. By default, the crdb_region column created by CockroachDB is hidden.

  • Using your ORM framework, sync the mapping objects in your application to reflect the latest database schema with the region-tracking column(s).

  • Reference the region-tracking column in read/write operations as needed.

For example, suppose that you have a single-region table called users that has just been transformed into a multi-region table with a REGIONAL BY ROW locality. When the application was first deployed, this table had no region-tracking column. During the multi-region database schema transformation, CockroachDB automatically created a hidden crdb_region column to track the region of each row.

In the absence of an explicit, back-filling computed column for the hidden crdb_region column, there is no way for CockroachDB to determine the region for old rows of data. The following steps update the crdb_region values in rows that were inserted before the multi-region transformation, based on the values of a city column:

  1. Make crdb_region visible in the relevant REGIONAL BY ROW table(s):

    icon/buttons/copy
    ALTER TABLE users ALTER COLUMN crdb_region SET VISIBLE;
    
  2. Update the table mappings in the application code (written in Python, with SQLAlchemy):

    icon/buttons/copy
    from models import Base
    
    ...
    
    Base.metadata.reflect(bind=self.engine, extend_existing=True, autoload_replace=True)
    
    Note:

    SQLAlchemy allows you to update all table mappings to reflect the database with the sqlalchemy.schema.MetaData class method reflect(). If your ORM framework does not support updating mapping objects dynamically, you might need to add the column to the table-mapping class definition as a String-typed column and reinstantiate the object.

  3. Reference the column value as needed.

    Here is an example function that updates the region value in a given table, using the values of the city column:

    icon/buttons/copy
    from sqlalchemy_cockroachdb import run_transaction
    ...
    
    def update_region(engine, table, region, cities):
    
        def update_region_helper(session, table, region, cities):
            query = table.update().where(column('city').in_(cities)).values({'crdb_region': region})
            session.execute(query)
    
        run_transaction(sessionmaker(bind=engine),
                        lambda session: update_region_helper(session, table, region, cities))
    

See also


Yes No
On this page

Yes No