BACKUP

The BACKUP feature is only available to enterprise users. For non-enterprise backups, see cockroach dump.

CockroachDB's BACKUP statement allows you to create full or incremental backups of your cluster's schema and data that are consistent as of a given timestamp. Backups can be taken with or without revision history.

Because CockroachDB is designed with high fault tolerance, these backups are designed primarily for disaster recovery (i.e., if your cluster loses a majority of its nodes) through RESTORE. Isolated issues (such as small-scale node outages) do not require any intervention.
Functional Details
Backup Targets
You can back up entire tables (which automatically includes their indexes) or views. Backing up a database simply backs up all of its tables and views.

BACKUP only offers table-level granularity; it does not support backing up subsets of a table.

Object Dependencies

Dependent objects must be backed up at the same time as the objects they depend on.
Object | Depends On |
---|---|
Table with foreign key constraints | The table it REFERENCES ; however, this dependency can be removed during the restore. |
Table with a sequence | New in v2.0: The sequence it uses; however, this dependency can be removed during the restore. |
Views | The tables used in the view's SELECT statement. |
Interleaved tables | The parent table in the interleaved hierarchy. |
Users and Privileges
Every backup you create includes system.users, which stores your users and their passwords. To restore your users, you must use this procedure.
Restored tables inherit privilege grants from the target database; they do not preserve privilege grants from the backed up table because the restoring cluster may have different users.
Table-level privileges must be granted to users after the restore is complete.
Backup Types
CockroachDB offers two types of backups: full and incremental.
Full Backups
Full backups contain an unreplicated copy of your data and can always be used to restore your cluster. These files are roughly the size of your data and require greater resources to produce than incremental backups. You can take full backups as of a given timestamp and (optionally) include the available revision history.
Incremental Backups
Incremental backups are smaller and faster to produce than full backups because they contain only the data that has changed since a base set of backups you specify (which must include one full backup, and can include many incremental backups). You can take incremental backups either as of a given timestamp or with full revision history.
Note the following restrictions:
Incremental backups can only be created within the garbage collection period of the base backup's most recent timestamp. This is because incremental backups are created by finding which data has been created or modified since the most recent timestamp in the base backup––that timestamp data, though, is deleted by the garbage collection process.
You can configure garbage collection periods using the ttlseconds replication zone setting.

It is not possible to create an incremental backup if one or more tables were created, dropped, or truncated after the full backup. In this case, you must create a new full backup.
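For example, in v2.0 a table's garbage collection period can be adjusted with the cockroach zone set command. This is an illustrative sketch: the table name and the 90000-second TTL below are placeholder values, not recommendations for your workload.

```shell
$ echo 'gc: {ttlseconds: 90000}' | cockroach zone set bank.customers --insecure -f -
```

A longer ttlseconds value gives you a wider window for taking incremental backups, at the cost of retaining more old data on disk.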
Backups with Revision History New in v2.0
You can create full or incremental backups with revision history:
- Taking full backups with revision history allows you to back up every change made within the garbage collection period leading up to and including the given timestamp.
- Taking incremental backups with revision history allows you to back up every change made since the last backup and within the garbage collection period leading up to and including the given timestamp. You can take incremental backups with revision history even when your previous full or incremental backups were taken without revision history.
You can configure garbage collection periods using the ttlseconds replication zone setting. Taking backups with revision history allows for point-in-time restores within the revision history.
Performance
The BACKUP process minimizes its impact on the cluster's performance by distributing work to all nodes. Each node backs up only a specific subset of the data it stores (the portions for which it serves writes; more details about this architectural concept forthcoming), with no two nodes backing up the same data.
For best performance, we also recommend always starting backups with a specific timestamp at least 10 seconds in the past. For example:
> BACKUP...AS OF SYSTEM TIME '2017-06-09 16:13:55.571516+00:00';
This improves performance by decreasing the likelihood that the BACKUP will be retried because it contends with other statements/transactions. However, because AS OF SYSTEM TIME returns historical data, your reads might be stale.
Automating Backups
We recommend automating daily backups of your cluster.
To automate backups, you must have a client send the BACKUP statement to the cluster.

Once the backup is complete, your client will receive a BACKUP response.
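For example, a nightly cron entry could act as that client. This is an illustrative sketch, not an officially supported tool: the host, certificate directory, destination URL, and schedule are all assumptions you would adapt to your deployment, and per the Performance section you would also add an AS OF SYSTEM TIME clause with a timestamp slightly in the past.

```
# Illustrative crontab entry: back up the bank database every night at 02:00.
# Note that % must be escaped as \% inside a crontab command field.
0 2 * * * cockroach sql --certs-dir=/path/to/certs --host=cockroach-host \
  --execute="BACKUP DATABASE bank TO 'gs://acme-co-backup/nightly-$(date +\%Y-\%m-\%d)';"
```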
Viewing and Controlling Backup Jobs
After CockroachDB successfully initiates a backup, it registers the backup as a job, which you can view with SHOW JOBS.

After the backup has been initiated, you can control it with PAUSE JOB, RESUME JOB, and CANCEL JOB.
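For example, you could look up the backup's job ID and then pause and resume it; the job ID below is a placeholder:

> SHOW JOBS;

> PAUSE JOB 27536791415282;

> RESUME JOB 27536791415282;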
Synopsis
Required Privileges
Only the root user can run BACKUP.
Parameters
Parameter | Description |
---|---|
table_pattern | The table or view you want to back up. |
name | The name of the database you want to back up (i.e., create backups of all tables and views in the database). |
destination | The URL where you want to store the backup. For information about this URL structure, see Backup File URLs. |
AS OF SYSTEM TIME timestamp | Back up data as it existed as of timestamp. The timestamp must be more recent than your cluster's last garbage collection (which defaults to occur every 25 hours, but is configurable per table). |
WITH revision_history | New in v2.0: Create a backup with full revision history that records every change made to the cluster within the garbage collection period leading up to and including the given timestamp. |
INCREMENTAL FROM full_backup_location | Create an incremental backup using the full backup stored at the URL full_backup_location as its base. For information about this URL structure, see Backup File URLs. Note: It is not possible to create an incremental backup if one or more tables were created, dropped, or truncated after the full backup. In this case, you must create a new full backup. |
incremental_backup_location | Create an incremental backup that includes all backups listed at the provided URLs. Lists of incremental backups must be sorted from oldest to newest. The newest incremental backup's timestamp must be within the table's garbage collection period. For information about this URL structure, see Backup File URLs. For more information about garbage collection, see Configure Replication Zones. |
Backup File URLs
We will use the URL provided to construct a secure API call to the service you specify. The path to each backup must be unique, and the URL for your backup's destination/locations must use the following format:
[scheme]://[host]/[path]?[parameters]
Location | scheme | host | parameters |
---|---|---|---|
Amazon S3 | s3 | Bucket name | AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY |
Azure | azure | Container name | AZURE_ACCOUNT_KEY, AZURE_ACCOUNT_NAME |
Google Cloud 1 | gs | Bucket name | AUTH (optional): can be default or implicit |
HTTP 2 | http | Remote host | N/A |
NFS/Local 3 | nodelocal | File system location | N/A |
S3-compatible services 4 | s3 | Bucket name | AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION, AWS_ENDPOINT |
Considerations
1 If the AUTH parameter is implicit, all GCS connections use Google's default authentication strategy. If the AUTH parameter is default, the cloudstorage.gs.default.key cluster setting must be set to the contents of a service account file, which will be used during authentication. If the AUTH parameter is not specified, the cloudstorage.gs.default.key setting will be used if it is non-empty; otherwise, the implicit behavior is used.

2 You can easily create your own HTTP server with Caddy or nginx. A custom root CA can be appended to the system's default CAs by setting the cloudstorage.http.custom_ca cluster setting, which will be used when verifying certificates from HTTPS URLs.

3 The file system backup location on the NFS drive is relative to the path specified by the --external-io-dir flag set while starting the node. If the flag is set to disabled, then imports from local directories and NFS drives are disabled.

4 A custom root CA can be appended to the system's default CAs by setting the cloudstorage.http.custom_ca cluster setting, which will be used when verifying certificates from an S3-compatible service.

The location parameters often contain special characters that need to be URI-encoded. Use JavaScript's encodeURIComponent function or Go's url.QueryEscape function to URI-encode the parameters. Other languages provide similar functions to URI-encode special characters.
Examples
Per our guidance in the Performance section, we recommend starting backups from a time at least 10 seconds in the past using AS OF SYSTEM TIME.
Back Up a Single Table or View
> BACKUP bank.customers \
TO 'gs://acme-co-backup/database-bank-2017-03-27-weekly' \
AS OF SYSTEM TIME '2017-03-26 23:59:00';
Back Up Multiple Tables
> BACKUP bank.customers, bank.accounts \
TO 'gs://acme-co-backup/database-bank-2017-03-27-weekly' \
AS OF SYSTEM TIME '2017-03-26 23:59:00';
Back Up an Entire Database
> BACKUP DATABASE bank \
TO 'gs://acme-co-backup/database-bank-2017-03-27-weekly' \
AS OF SYSTEM TIME '2017-03-26 23:59:00';
Backup with Revision History New in v2.0
> BACKUP DATABASE bank \
TO 'gs://acme-co-backup/database-bank-2017-03-27-weekly' \
AS OF SYSTEM TIME '2017-03-26 23:59:00' WITH revision_history;
Create Incremental Backups
Incremental backups must be based on full backups you've already created.
> BACKUP DATABASE bank \
TO 'gs://acme-co-backup/db/bank/2017-03-29-nightly' \
AS OF SYSTEM TIME '2017-03-28 23:59:00' \
INCREMENTAL FROM 'gs://acme-co-backup/database-bank-2017-03-27-weekly', 'gs://acme-co-backup/database-bank-2017-03-28-nightly';
Create Incremental Backups with Revision History New in v2.0
> BACKUP DATABASE bank \
TO 'gs://acme-co-backup/database-bank-2017-03-29-nightly' \
AS OF SYSTEM TIME '2017-03-28 23:59:00' \
INCREMENTAL FROM 'gs://acme-co-backup/database-bank-2017-03-27-weekly', 'gs://acme-co-backup/database-bank-2017-03-28-nightly' WITH revision_history;