A phased migration to CockroachDB uses the MOLT tools to convert your source schema, incrementally load source data and verify the results, and finally replicate ongoing changes before performing cutover.
Before you begin
- Review the Migration Overview.
- Install the MOLT (Migrate Off Legacy Technology) tools.
- Review the MOLT Fetch setup and best practices.
When exporting data to cloud storage, ensure that access control is properly configured:
If you are using Amazon S3 for cloud storage:
- Ensure that the following environment variables are set appropriately in the terminal running molt fetch:
  export AWS_REGION='us-east-1'
  export AWS_SECRET_ACCESS_KEY='key'
  export AWS_ACCESS_KEY_ID='id'
  Alternatively, set the --use-implicit-auth flag to use implicit authentication.
- Ensure the S3 bucket is created and accessible by authorized roles and users only.
If you are using Google Cloud Storage for cloud storage:
- Ensure that your local environment is authenticated using Application Default Credentials, either with gcloud:
  gcloud init
  gcloud auth application-default login
  or with the environment variable:
  export GOOGLE_APPLICATION_CREDENTIALS={path_to_cred_json}
  Alternatively, set the --use-implicit-auth flag to use implicit authentication.
- Ensure the Google Cloud Storage bucket is created and accessible by authorized roles and users only.
Select the source dialect you will migrate to CockroachDB:
Step 1. Prepare the source database
Ensure that the PostgreSQL database is configured for replication. Enable logical replication by setting wal_level to logical in postgresql.conf or in the SQL shell. For example:
ALTER SYSTEM SET wal_level = 'logical';
Ensure that the MySQL database is configured for replication.
For MySQL 8.0 and later sources, enable global transaction identifier (GTID) consistency. Set the following values in mysql.cnf, in the SQL shell, or as flags in the mysql start command:
--enforce-gtid-consistency=ON
--gtid-mode=ON
--binlog-row-metadata=full
For MySQL 5.7 sources, set the following values. Note that binlog-row-image is used instead of binlog-row-metadata. Set server-id to a unique integer that differs from any other MySQL server in your cluster (e.g., 3):
--enforce-gtid-consistency=ON
--gtid-mode=ON
--binlog-row-image=full
--server-id={ID}
--log-bin=log-bin
Step 2. Prepare the source schema
CockroachDB supports the PostgreSQL wire protocol and is largely compatible with PostgreSQL syntax. For syntax differences, refer to Features that differ from PostgreSQL.
Convert your database schema to an equivalent CockroachDB schema.
The simplest method is to use the MOLT Schema Conversion Tool to convert your schema line-by-line. The tool accepts .sql files and will convert the syntax, identify unimplemented features and syntax incompatibilities in the schema, and suggest edits according to CockroachDB best practices.
The Schema Conversion Tool requires a free CockroachDB Cloud account. If this is not an option for you, do one of the following:
- Enable automatic schema creation when loading data with MOLT Fetch. The --table-handling drop-on-target-and-recreate option creates one-to-one type mappings between the source database and CockroachDB and works well when the source schema is well-defined.
- Manually convert the schema according to the schema design best practices. You can also export a partially converted schema from the Schema Conversion Tool to finish the conversion manually.
For additional help, contact your account team.
Import the converted schema to a CockroachDB cluster.
- When migrating to CockroachDB Cloud, use the Schema Conversion Tool to migrate the converted schema to a new Cloud database.
- When migrating to a self-hosted CockroachDB cluster, pipe the data definition language (DDL) directly into cockroach sql. You can export a converted schema file from the Schema Conversion Tool.
Tip: For the fastest performance, you can use a local, single-node CockroachDB cluster to convert your schema.
When using the Schema Conversion Tool, syntax that cannot be automatically converted will be displayed in the Summary Report. Such syntax may include the following:
String case sensitivity
Strings are case-insensitive in MySQL and case-sensitive in CockroachDB. You may need to edit your MySQL data to get the results you expect from CockroachDB. For example, you may have been doing string comparisons in MySQL that will need to be changed to work with CockroachDB.
For more information about the case sensitivity of strings in MySQL, refer to Case Sensitivity in String Searches in the MySQL documentation. For more information about CockroachDB strings, refer to STRING.
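To make the behavioral difference concrete, here is a small Python sketch (not part of MOLT) contrasting the case-insensitive comparison of MySQL's default collations with the case-sensitive comparison CockroachDB performs on STRING values:

```python
# MySQL's default collations compare strings case-insensitively,
# while CockroachDB STRING comparisons are case-sensitive.
a, b = "Alice", "alice"

# Case-insensitive comparison, as under a MySQL default collation:
mysql_equal = a.lower() == b.lower()

# Case-sensitive comparison, as in CockroachDB:
crdb_equal = a == b

print(mysql_equal, crdb_equal)  # True False
```

A WHERE clause that matched rows in MySQL may therefore return fewer rows in CockroachDB unless you normalize case explicitly (for example, with lower()).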
Identifier case sensitivity
Identifiers are case-sensitive in MySQL and case-insensitive in CockroachDB. When using the Schema Conversion Tool, you can either keep case sensitivity by enclosing identifiers in double quotes, or make identifiers case-insensitive by converting them to lowercase.
AUTO_INCREMENT attribute
The MySQL AUTO_INCREMENT attribute, which creates sequential column values, is not supported in CockroachDB. When using the Schema Conversion Tool, columns with AUTO_INCREMENT can be converted to use sequences, UUID values with gen_random_uuid(), or unique INT8 values using unique_rowid(). Cockroach Labs does not recommend using a sequence to define a primary key column. For more information, refer to Unique ID best practices.
Changing a column type during schema conversion will cause MOLT Verify to identify a type mismatch during data validation. This is expected behavior.
ENUM type
MySQL ENUM types are defined in table columns. On CockroachDB, ENUM is a standalone type. When using the Schema Conversion Tool, you can either deduplicate the ENUM definitions or create a separate type for each column.
TINYINT type
TINYINT data types are not supported in CockroachDB. The Schema Conversion Tool automatically converts TINYINT columns to INT2 (SMALLINT).
Geospatial types
MySQL geometry types are not converted to CockroachDB geospatial types by the Schema Conversion Tool. They should be manually converted to the corresponding types in CockroachDB.
FIELD function
The MySQL FIELD function is not supported in CockroachDB. Instead, you can use the array_position function, which returns the index of the first occurrence of an element in the array.
Example usage:
SELECT array_position(ARRAY[4,1,3,2],1);
  array_position
------------------
               2
(1 row)
While MySQL returns 0 when the element is not found, CockroachDB returns NULL. As a result, if you use the array_position function in an ORDER BY clause, rows whose element is not found still participate in the sort, with a NULL sort key. As a workaround, you can use the COALESCE function to substitute a fallback position:
SELECT * FROM table_a ORDER BY COALESCE(array_position(ARRAY[4,1,3,2],5),999);
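The difference in not-found semantics can be illustrated with a short Python sketch (not MOLT code) that mimics MySQL FIELD, CockroachDB array_position, and the COALESCE fallback used for ordering; the fallback value 999 mirrors the query above:

```python
def mysql_field(needle, haystack):
    # MySQL FIELD(): 1-based index of needle, or 0 when absent.
    return haystack.index(needle) + 1 if needle in haystack else 0

def crdb_array_position(haystack, needle):
    # CockroachDB array_position(): 1-based index, or NULL (None) when absent.
    return haystack.index(needle) + 1 if needle in haystack else None

def sort_key(needle, haystack, fallback=999):
    # COALESCE(array_position(...), 999): absent elements sort last.
    pos = crdb_array_position(haystack, needle)
    return pos if pos is not None else fallback

arr = [4, 1, 3, 2]
print(mysql_field(5, arr))          # 0
print(crdb_array_position(arr, 5))  # None
print(sorted([1, 5, 3], key=lambda v: sort_key(v, arr)))  # [1, 3, 5]
```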
Step 3. Load data into CockroachDB
To optimize the performance of the data load, Cockroach Labs recommends dropping any constraints and indexes on the target CockroachDB database. You can recreate them after the data is loaded.
Perform an initial load of data into the target database. This can be a subset of the source data that you wish to verify, or it can be the entire dataset.
The following example migrates a single employees table. The table is exported to an Amazon S3 bucket and imported to CockroachDB using the IMPORT INTO statement, which is the default MOLT Fetch mode.
IMPORT INTO takes the target CockroachDB tables offline to maximize throughput. The tables come back online once the import job completes successfully. If you need to keep the target tables online, add the --use-copy flag to export data with COPY FROM instead. For more details, refer to Data movement.
If you cannot move data to a public cloud, specify --direct-copy instead of --bucket-path in the molt fetch command. This flag instructs MOLT Fetch to use COPY FROM to move the source data directly to CockroachDB without an intermediate store. For more information, refer to Direct copy.
Issue the MOLT Fetch command to move the source data to CockroachDB, specifying --mode data-load to perform a one-time data load. For details on this mode, refer to the MOLT Fetch page.
Note: Ensure that the --source and --target connection strings are URL-encoded.
Be sure to specify --pglogical-replication-slot-name, which is required for replication in Step 6.
molt fetch \
  --source 'postgres://postgres:postgres@localhost:5432/molt?sslmode=verify-full' \
  --target 'postgres://root@localhost:26257/defaultdb?sslmode=verify-full' \
  --table-filter 'employees' \
  --bucket-path 's3://molt-test' \
  --table-handling truncate-if-exists \
  --non-interactive \
  --pglogical-replication-slot-name cdc_slot \
  --mode data-load
molt fetch \
  --source 'mysql://user:password@localhost/molt?sslcert=.%2fsource_certs%2fclient.root.crt&sslkey=.%2fsource_certs%2fclient.root.key&sslmode=verify-full&sslrootcert=.%2fsource_certs%2fca.crt' \
  --target 'postgres://root@localhost:26257/defaultdb?sslmode=verify-full' \
  --table-filter 'employees' \
  --bucket-path 's3://molt-test' \
  --table-handling truncate-if-exists \
  --non-interactive \
  --mode data-load
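Because the --source and --target connection strings must be URL-encoded, passwords or file paths containing special characters need percent-encoding before they are placed in the string. A minimal Python sketch using the standard library (the credentials shown are hypothetical):

```python
from urllib.parse import quote

# Hypothetical credentials containing characters that must be percent-encoded.
user = "app_user"
password = "p@ss:w/rd"

# safe='' forces '/' and other reserved characters to be encoded as well.
dsn = (
    f"postgres://{quote(user, safe='')}:{quote(password, safe='')}"
    "@localhost:5432/molt?sslmode=verify-full"
)
print(dsn)  # postgres://app_user:p%40ss%3Aw%2Frd@localhost:5432/molt?sslmode=verify-full
```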
Check the output to observe fetch progress.
A starting fetch message indicates that the task has started:
{"level":"info","type":"summary","num_tables":1,"cdc_cursor":"0/43A1960","time":"2025-02-10T14:28:11-05:00","message":"starting fetch"}
{"level":"info","type":"summary","num_tables":1,"cdc_cursor":"4c658ae6-e8ad-11ef-8449-0242ac140006:1-28","time":"2025-02-10T14:28:11-05:00","message":"starting fetch"}
data extraction messages are written for each table that is exported to the location in --bucket-path:
{"level":"info","table":"public.employees","time":"2025-02-10T14:28:11-05:00","message":"data extraction phase starting"}
{"level":"info","table":"public.employees","type":"summary","num_rows":200000,"export_duration_ms":1000,"export_duration":"000h 00m 01s","time":"2025-02-10T14:28:12-05:00","message":"data extraction from source complete"}
data import messages are written for each table that is loaded into CockroachDB:
{"level":"info","table":"public.employees","time":"2025-02-10T14:28:12-05:00","message":"starting data import on target"}
{"level":"info","table":"public.employees","type":"summary","net_duration_ms":1899.748333,"net_duration":"000h 00m 01s","import_duration_ms":1160.523875,"import_duration":"000h 00m 01s","export_duration_ms":1000,"export_duration":"000h 00m 01s","num_rows":200000,"cdc_cursor":"0/43A1960","time":"2025-02-10T14:28:13-05:00","message":"data import on target for table complete"}
{"level":"info","table":"public.employees","type":"summary","net_duration_ms":1899.748333,"net_duration":"000h 00m 01s","import_duration_ms":1160.523875,"import_duration":"000h 00m 01s","export_duration_ms":1000,"export_duration":"000h 00m 01s","num_rows":200000,"cdc_cursor":"4c658ae6-e8ad-11ef-8449-0242ac140006:1-29","time":"2025-02-10T14:28:13-05:00","message":"data import on target for table complete"}
A fetch complete message is written when the fetch task succeeds:
{"level":"info","type":"summary","fetch_id":"f5cb422f-4bb4-4bbd-b2ae-08c4d00d1e7c","num_tables":1,"tables":["public.employees"],"cdc_cursor":"0/3F41E40","net_duration_ms":6752.847625,"net_duration":"000h 00m 06s","time":"2024-03-18T12:30:37-04:00","message":"fetch complete"}
{"level":"info","type":"summary","fetch_id":"f5cb422f-4bb4-4bbd-b2ae-08c4d00d1e7c","num_tables":1,"tables":["public.employees"],"cdc_cursor":"4c658ae6-e8ad-11ef-8449-0242ac140006:1-29","net_duration_ms":6752.847625,"net_duration":"000h 00m 06s","time":"2024-03-18T12:30:37-04:00","message":"fetch complete"}
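Because the output is structured JSON, one line per event, it is easy to check for success in automation. A sketch (not part of MOLT) that scans captured log lines for the fetch complete message:

```python
import json

# Sample log lines in the structured format shown above (abridged).
log_lines = [
    '{"level":"info","type":"summary","num_tables":1,"message":"starting fetch"}',
    '{"level":"info","table":"public.employees","message":"data extraction phase starting"}',
    '{"level":"info","type":"summary","tables":["public.employees"],"message":"fetch complete"}',
]

def fetch_succeeded(lines):
    # The fetch task succeeded if any line carries the "fetch complete" message.
    return any(json.loads(line).get("message") == "fetch complete" for line in lines)

print(fetch_succeeded(log_lines))  # True
```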
Step 4. Verify the data load
Use MOLT Verify to validate the consistency of the data between the source database and CockroachDB.
molt verify \
  --source 'postgres://postgres:postgres@localhost:5432/molt?sslmode=verify-full' \
  --target 'postgres://root@localhost:26257/defaultdb?sslmode=verify-full' \
  --table-filter 'employees'
molt verify \
  --source 'mysql://user:password@localhost/molt?sslcert=.%2fsource_certs%2fclient.root.crt&sslkey=.%2fsource_certs%2fclient.root.key&sslmode=verify-full&sslrootcert=.%2fsource_certs%2fca.crt' \
  --target 'postgres://root@localhost:26257/defaultdb?sslmode=verify-full' \
  --table-filter 'employees'
Check the output to observe verify progress.
A verification in progress message indicates that the task has started:
{"level":"info","time":"2025-02-10T15:35:04-05:00","message":"verification in progress"}
starting verify messages are written for each specified table:
{"level":"info","time":"2025-02-10T15:35:04-05:00","message":"starting verify on public.employees, shard 1/1"}
A finished row verification message containing a summary is written after each table is compared. For details on the summary fields, refer to the MOLT Verify page.
{"level":"info","type":"summary","table_schema":"public","table_name":"employees","num_truth_rows":200004,"num_success":200004,"num_conditional_success":0,"num_missing":0,"num_mismatch":0,"num_extraneous":0,"num_live_retry":0,"num_column_mismatch":0,"time":"2025-02-10T15:35:05-05:00","message":"finished row verification on public.employees (shard 1/1)"}
A verification complete message is written when the verify task succeeds:
{"level":"info","net_duration_ms":699.804875,"net_duration":"000h 00m 00s","time":"2025-02-10T15:35:05-05:00","message":"verification complete"}
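When scripting the verification step, the summary fields can be checked directly. A sketch (not part of MOLT, using an abridged version of the summary above) of what "consistent" means in terms of those fields:

```python
import json

# Abridged "finished row verification" summary from the MOLT Verify output above.
summary = json.loads(
    '{"table_schema":"public","table_name":"employees",'
    '"num_truth_rows":200004,"num_success":200004,'
    '"num_missing":0,"num_mismatch":0,"num_extraneous":0}'
)

def is_consistent(s):
    # Tables match when no rows are missing, mismatched, or extraneous,
    # and every source ("truth") row was verified successfully.
    return (
        s["num_missing"] == 0
        and s["num_mismatch"] == 0
        and s["num_extraneous"] == 0
        and s["num_success"] == s["num_truth_rows"]
    )

print(is_consistent(summary))  # True
```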
Repeat Step 3 and Step 4 to migrate any remaining tables.
Step 5. Modify the CockroachDB schema
If you need the best possible replication performance, you can perform this step right before cutover.
You can now add any constraints or indexes that you previously removed from the CockroachDB schema.
For the appropriate SQL syntax, refer to ALTER TABLE ... ADD CONSTRAINT and CREATE INDEX. Review the best practices for creating secondary indexes on CockroachDB.
Step 6. Replicate changes to CockroachDB
With initial load complete, start replication of ongoing changes on the source to CockroachDB.
The following example specifies that the employees table should be watched for change events.
Issue the MOLT Fetch command to start replication on CockroachDB, specifying --mode replication-only to replicate ongoing changes on the source to CockroachDB. For details on this mode, refer to the MOLT Fetch page.
Be sure to specify the same --pglogical-replication-slot-name value that you provided in Step 3.
molt fetch \
  --source 'postgres://postgres:postgres@localhost:5432/molt?sslmode=verify-full' \
  --target 'postgres://root@localhost:26257/defaultdb?sslmode=verify-full' \
  --table-filter 'employees' \
  --non-interactive \
  --mode replication-only \
  --pglogical-replication-slot-name cdc_slot
Use the --defaultGTIDSet replication flag to specify the GTID set. To find your GTID record, run the following query on MySQL:
SELECT source_uuid, min(interval_start), max(interval_end) FROM mysql.gtid_executed GROUP BY source_uuid;
molt fetch \
  --source 'mysql://user:password@localhost/molt?sslcert=.%2fsource_certs%2fclient.root.crt&sslkey=.%2fsource_certs%2fclient.root.key&sslmode=verify-full&sslrootcert=.%2fsource_certs%2fca.crt' \
  --target 'postgres://root@localhost:26257/defaultdb?sslmode=verify-full' \
  --table-filter 'employees' \
  --non-interactive \
  --mode replication-only \
  --replicator-flags '--defaultGTIDSet 4c658ae6-e8ad-11ef-8449-0242ac140006:1-29'
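The flag value can be assembled from the query result programmatically. A small Python sketch (not part of MOLT) that formats each (source_uuid, min, max) row into GTID-set notation, where multiple source UUIDs are comma-separated:

```python
# Rows as returned by the GTID query above:
# (source_uuid, min(interval_start), max(interval_end)).
rows = [("4c658ae6-e8ad-11ef-8449-0242ac140006", 1, 29)]

def format_gtid_set(rows):
    # Build a MySQL GTID-set string such as "uuid:1-29";
    # entries for multiple source UUIDs are joined with commas.
    return ",".join(f"{uuid}:{start}-{end}" for uuid, start, end in rows)

print(format_gtid_set(rows))  # 4c658ae6-e8ad-11ef-8449-0242ac140006:1-29
```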
Check the output to observe replicator progress.
A starting replicator message indicates that the task has started:
{"level":"info","time":"2025-02-10T14:28:13-05:00","message":"starting replicator"}
The staging database name message contains the name of the staging schema:
{"level":"info","time":"2025-02-10T14:28:13-05:00","message":"staging database name: _replicator_1739215693817700000"}
The staging schema provides a replication marker for streaming changes. You will need the staging schema name if replication fails and must be resumed, or if you perform a failback to the source database.
upserted rows log messages indicate that changes were replicated to CockroachDB:
DEBUG [Jan 22 13:52:40] upserted rows conflicts=0 duration=7.620208ms proposed=1 target="\"molt\".\"public\".\"employees\"" upserted=1
Step 7. Stop replication and verify data
Stop application traffic to your source database. This begins downtime.
Wait for replication to drain, meaning that all transactions that occurred on the source database have been fully processed and replicated to CockroachDB. When replication has fully drained, you will no longer see new upserted rows log messages.
Cancel replication to CockroachDB by entering ctrl-c to send the process an interrupt (SIGINT) signal. The process exits with code 0.
Repeat Step 4 to verify the updated data.
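The "replication has drained" condition, i.e., no new upserted rows log entries for some quiet period, can be sketched in Python. This is an illustration only; the 30-second quiet period is an assumption for the example, not a MOLT setting:

```python
def is_drained(last_upsert_time, now, quiet_period=30.0):
    # Consider replication drained when no "upserted rows" log entry
    # has appeared for a full quiet period (seconds). The 30s threshold
    # is an assumption for illustration, not a MOLT setting.
    return (now - last_upsert_time) >= quiet_period

print(is_drained(100.0, 140.0))  # True: 40s of quiet
print(is_drained(100.0, 110.0))  # False: an upsert was seen 10s ago
```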
If you encountered issues with replication, you can now use failback mode to replicate changes on CockroachDB back to the initial source database. If you need to roll back the migration, this ensures that data is consistent on the initial source database.
Step 8. Cutover
Perform a cutover by resuming application traffic, now to CockroachDB.