The CREATE SCHEDULE FOR BACKUP
statement creates a schedule for periodic backups.
For more information about creating, managing, monitoring, and restoring from a scheduled backup, see Manage a Backup Schedule.
Required privileges
Starting in v22.2, CockroachDB introduces a new system-level privilege model that provides finer control over a user's privilege to work with the database, including taking backups.
There is continued support for the legacy privilege model for backups in v22.2, however it will be removed in a future release of CockroachDB. We recommend implementing the new privilege model that follows in this section for all new and existing backups.
You can grant the BACKUP
privilege to a user or role depending on the type of backup:
Backup | Privilege |
---|---|
Cluster | Grant a user the BACKUP system-level privilege. For example, GRANT SYSTEM BACKUP TO user; . |
Database | Grant a user the BACKUP privilege on the target database. For example, GRANT BACKUP ON DATABASE test_db TO user; . |
Table | Grant a user the BACKUP privilege at the table level. This gives the user the privilege to back up the schema and all user-defined types that are associated with the table. For example, GRANT BACKUP ON TABLE test_db.table TO user; . |
The listed privileges do not cascade to objects lower in the schema tree. For example, if you are granted database-level BACKUP
privileges, this does not give you the privilege to back up a table. If you need the BACKUP
privilege on a database to apply to all newly created tables in that database, use DEFAULT PRIVILEGES
. You can add BACKUP
to the user or role's default privileges with ALTER DEFAULT PRIVILEGES
.
You can grant the BACKUP
privilege to a user or role without the SELECT
privilege on a table. As a result, these users will be able to take backups, but they will not be able to run a SELECT
query on that data directly. However, these users could still read this data indirectly, by restoring it from any backups they produce.
Members of the admin
role can run all three types of backups (cluster, database, and table) without the need to grant a specific BACKUP
privilege. However, we recommend using the BACKUP
privilege model to create users or roles and grant them BACKUP
privileges as necessary for stronger access control.
Privileges for managing a backup job
To manage a backup job with PAUSE JOB
, RESUME JOB
, or CANCEL JOB
, users must have at least one of the following:
- Be a member of the
admin
role. - The
CONTROLJOB
role option.
To view a backup job with SHOW JOB
, users must have at least one of the following:
- New in v23.1:
The
VIEWJOB
privilege, which allows you to view all jobs (includingadmin
-owned jobs). - Be a member of the
admin
role. - The
CONTROLJOB
role option.
See GRANT
for detail on granting privileges to a role or user.
Required privileges using the legacy privilege model
The following details the legacy privilege model that CockroachDB supports in v22.2 and earlier. Support for this privilege model will be removed in a future release of CockroachDB:
- Full cluster backups can only be run by members of the
admin
role. By default, theroot
user belongs to theadmin
role. - For all other backups, the user must have read access on all objects being backed up. Database backups require
CONNECT
privileges, and table backups requireSELECT
privileges. Backups of user-defined schemas, or backups containing user-defined types, requireUSAGE
privileges.
See the Required privileges section for the updated privilege model.
Destination privileges
You can grant a user the EXTERNALIOIMPLICITACCESS
system-level privilege.
Either the EXTERNALIOIMPLICITACCESS
system-level privilege or the admin
role is required for the following scenarios:
- Interacting with a cloud storage resource using
IMPLICIT
authentication. - Using a custom endpoint on S3.
- Using the
cockroach nodelocal upload
command.
No special privilege is required for:
- Interacting with an Amazon S3 and Google Cloud Storage resource using
SPECIFIED
credentials. Azure Storage is alwaysSPECIFIED
by default. - Using Userfile storage.
We recommend using cloud storage. You also need to ensure that the permissions at your storage destination are configured for the operation. See Storage Permissions for a list of the necessary permissions that each bulk operation requires.
Synopsis
CREATE SCHEDULE [IF NOT EXISTS] <label>
FOR BACKUP [<targets>] INTO <location>
[WITH <backup_options>[=<value>] [, ...]]
RECURRING [crontab] [FULL BACKUP <crontab|ALWAYS>]
[WITH SCHEDULE OPTIONS <schedule_option>[= <value>] [, ...] ]
Targets:
Empty targets list: backup full cluster.
TABLE <table_pattern> [, ...]
DATABASE <database_name> [, ...]
Parameters
Parameter | Description |
---|---|
IF NOT EXISTS |
Use to specify that a scheduled backup should not be created if the label already exists. Produces an error if the schedule label already exists, or if the label is not specified. |
label |
The name used to identify the backup schedule. This is optional and does not need to be unique. If not provided, the schedule will be assigned the name BACKUP . |
targets |
The targets you want to back up: |
table_pattern |
The table(s) or view(s) you want to back up. |
database_name |
The name of the database(s) you want to back up (i.e., create backups of all tables and views in the database). |
location |
The URI where you want to store the backup. The backup files will be stored in year > month > day subdirectories. The location can be cloud storage, or nodelocal .Note: If you want to schedule a backup using temporary credentials, we recommend that you use implicit authentication; otherwise, you'll need to drop and then recreate schedules each time you need to update the credentials. |
backup_options |
Control the backup behavior with a comma-separated list of options. |
RECURRING crontab |
Specifies when the backup should be taken. A separate schedule may be created automatically to write full backups at a regular cadence, depending on the frequency of the incremental backups. You can likewise modify this separate schedule with ALTER BACKUP SCHEDULE . The schedule is specified as a STRING in crontab format. All times in UTC. Example: '@daily' (run daily at midnight) |
FULL BACKUP crontab |
Specifies when to take a new full backup. The schedule is specified as a STRING in crontab format or as ALWAYS . If FULL BACKUP ALWAYS is specified, then the backups triggered by the RECURRING clause will always be full backups. ALWAYS is the only accepted value of FULL BACKUP .If the FULL BACKUP clause is omitted, CockroachDB will default to the following full backup schedule:
|
WITH SCHEDULE OPTIONS schedule_option |
Control the schedule behavior with a comma-separated list of these options. |
For schedules that include both full and incremental backups, CockroachDB will create two schedules (one for each type). See Incremental backup schedules for more information.
Backup options
Option | Value | Description |
---|---|---|
revision_history |
BOOL / None |
Create a backup with full revision history, which records every change made to the cluster within the garbage collection period leading up to and including the given timestamp. You can specify a backup with revision history without any value e.g., WITH revision_history . Or, you can explicitly define WITH revision_history = 'true' / 'false' . The revision_history option defaults to true when used with BACKUP or CREATE SCHEDULE FOR BACKUP . A value is required when using ALTER BACKUP SCHEDULE to alter a backup schedule. |
encryption_passphrase |
STRING |
The passphrase used to encrypt the files (BACKUP manifest and data files) that the BACKUP statement generates. This same passphrase is needed to decrypt the file when it is used to restore and to list the contents of the backup when using SHOW BACKUP . There is no practical limit on the length of the passphrase. |
detached |
BOOL / None |
Note: Backups running on a schedule have the detached option applied implicitly. Therefore, you cannot modify this option for scheduled backups. When a backup runs in detached mode, it will execute asynchronously. The job ID will be returned after the backup job creation completes. Note that with detached specified, further job information and the job completion status will not be returned. For more on the differences between the returned job data, see the example. To check on the job status, use the SHOW JOBS statement. |
New in v23.1: EXECUTION LOCALITY |
Key-value pairs | Restricts the execution of the backup to nodes that match the defined locality filter requirements. For example, WITH EXECUTION LOCALITY = 'region=us-west-1a,cloud=aws' . Refer to Take Locality-restricted backups for usage and reference detail. |
kms |
STRING |
The URI of the cryptographic key stored in a key management service (KMS), or a comma-separated list of key URIs, used to take and restore encrypted backups. Refer to URI Formats. The key or keys are used to encrypt the manifest and data files that the BACKUP statement generates and to decrypt them during a restore operation, and to list the contents of the backup when using SHOW BACKUP . AWS KMS, Google Cloud KMS, and Azure Key Vault are supported. |
incremental_location |
STRING |
Create an incremental backup in a different location than the default incremental backup location. WITH incremental_location = 'explicit_incrementals_URI' See Incremental backups with explicitly specified destinations for usage. |
Schedule options
Option | Value | Description |
---|---|---|
first_run |
TIMESTAMPTZ / now |
Execute the schedule at the specified time in the future. If not specified, the default behavior is to execute the schedule based on its next RECURRING time. |
on_execution_failure |
retry / reschedule / pause |
If an error occurs during the backup execution, do the following:
Default: reschedule |
on_previous_running |
start / skip / wait |
If the previous backup started by the schedule is still running, do the following:
Default: wait . The option affects backups started by the full backup schedule only. Incremental backups are always set to wait . |
ignore_existing_backups |
N/A | If backups were already created in the destination that the new schedule references, this option must be passed to acknowledge that the new schedule may be backing up different objects. |
updates_cluster_last_backup_time_metric |
N/A | (admin privileges required) When set during backup schedule creation, this option updates the schedules_backup_last_completed_time metric for the scheduled backup. |
Considerations
- We recommend that you schedule your backups at a cadence that your cluster can keep up with; for example, if a previous backup is still running when it is time to start the next one, adjust the schedule so the backups do not end up falling behind or update the
on_previous_running
option. - To prevent scheduled backups from falling behind, first determine how long a single backup takes and use that as your starting point for the schedule's cadence.
- Ensure you are monitoring your backup schedule (e.g., Prometheus) and alerting metrics that will confirm that your backups are completing, but also that they're not running more concurrently than you expect.
- The
AS OF SYSTEM TIME
clause cannot be set on scheduled backups. Scheduled backups are started shortly after the scheduled time has passed by an internal polling mechanism and are automatically run withAS OF SYSTEM TIME
set to the time at which the backup was scheduled to run. - If you want to schedule a backup using temporary credentials, we recommend that you use
implicit
authentication; otherwise, you'll need to drop and then recreate schedules each time you need to update the credentials.
Protected timestamps and scheduled backups
Scheduled backups ensure that the data to be backed up is protected from garbage collection until it has been successfully backed up. This active management of protected timestamps means that you can run scheduled backups at a cadence independent from the GC TTL of the data. This is unlike non-scheduled backups that are tightly coupled to the GC TTL. See Garbage collection and backups for more detail.
The data being backed up will not be eligible for garbage collection until a successful backup completes. At this point, the schedule will release the existing protected timestamp record and write a new one to protect data for the next backup that is scheduled to run. It is important to consider that when a scheduled backup fails there will be an accumulation of data until the next successful backup. Resolving the backup failure or dropping the backup schedule will make the data eligible for garbage collection once again.
You can also use the exclude_data_from_backup
option with a scheduled backup as a way to prevent protected timestamps from prolonging garbage collection on a table. See the example Exclude a table's data from backups for usage information.
We recommend monitoring your backup schedule to alert for failed backups:
- See the Backup and Restore Monitoring page for a general overview and list of metrics available for backup, scheduled backup, and restore jobs.
- See Set up monitoring for the backup schedule for metrics and monitoring backup schedules specifically.
Incremental backup schedules
The incremental backup schedule is created in a paused state, and is only un-paused on completion of the first, scheduled full backup. This ensures that the first incremental backup is only executed once it has a full backup to build a chain from. Thereafter, the incremental backups are scheduled to run at its specified cadence.
Incremental backups always append to the latest, complete full backup. An incremental backup can run concurrently with a full backup, but in such a situation it will continue to append to the previous full backup that has already completed.
An incremental backup will always wait for another incremental backup started by the same schedule to complete before running. This prevents incremental backups from backing up overlapping spans of time in the same backup chain. To enforce this, backup schedules created or altered using the on_previous_running
option will have the full backup schedule created with the user specified option, but will always default the incremental backup schedule option to on_previous_running = wait
.
View and control backup schedules
Once a backup schedule is successfully created, you can do the following:
Action | SQL Statement |
---|---|
View the schedule | SHOW SCHEDULES |
Pause the schedule | PAUSE SCHEDULES |
Resume the schedule | RESUME SCHEDULES |
Drop the schedule | DROP SCHEDULES |
Alter the schedule | ALTER BACKUP SCHEDULE |
View and control a backup initiated by a schedule
After CockroachDB successfully initiates a scheduled backup, it registers the backup as a job. You can do the following with each individual backup job:
Action | SQL Statement |
---|---|
View the backup status | SHOW JOBS |
Pause the backup | PAUSE JOB |
Resume the backup | RESUME JOB |
Cancel the backup | CANCEL JOB |
You can also visit the Jobs page of the DB Console to view job details. The BACKUP
statement will return when the backup is finished or if it encounters an error.
Examples
Create a schedule for full backups only
To schedule full backups of clusters, databases, or tables, use the FULL BACKUP ALWAYS
clause, for example:
> CREATE SCHEDULE core_schedule_label
FOR BACKUP INTO 's3://test/schedule-test-core?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x'
RECURRING '@daily'
FULL BACKUP ALWAYS
WITH SCHEDULE OPTIONS first_run = 'now';
schedule_id | name | status | first_run | schedule | backup_stmt
---------------------+---------------------+--------+---------------------------+----------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
588799238330220545 | core_schedule_label | ACTIVE | 2020-09-11 00:00:00+00:00 | @daily | BACKUP INTO 's3://test/schedule-test-core?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x' WITH detached
(1 row)
Create a scheduled backup for a cluster
This example creates a schedule for a cluster backup with revision history that's taken every day at midnight:
> CREATE SCHEDULE schedule_label
FOR BACKUP INTO 's3://test/backups/schedule_test?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x'
WITH revision_history
RECURRING '@daily';
schedule_id | name | status | first_run | schedule | backup_stmt
---------------------+----------------+------------------------------------------------+----------------------------------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------------
588796190000218113 | schedule_label | PAUSED: Waiting for initial backup to complete | NULL | @daily | BACKUP INTO LATEST IN 's3://test/schedule-test?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x' WITH revision_history, detached
588796190012702721 | schedule_label | ACTIVE | 2020-09-10 16:52:17.280821+00:00 | @weekly | BACKUP INTO 's3://test/schedule-test?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x' WITH revision_history, detached
(2 rows)
Because the FULL BACKUP
clause is not included, CockroachDB also scheduled a full backup to run @weekly
. This is the default cadence for incremental backups RECURRING
> 1 hour but <= 1 day.
When you run a cluster backup, it will include any backup schedules that are running on the cluster. After a cluster restore, you must manually pause backup schedules either on the restored cluster or the original backed-up cluster (if both clusters are still active) to avoid a backup collision in the same storage bucket.
Create a scheduled backup for a database
This example creates a schedule for a backup of the database movr
with revision history that's taken every day 1 minute past midnight (00:00:01
):
> CREATE SCHEDULE schedule_database
FOR BACKUP DATABASE movr INTO 's3://test/schedule-database?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x'
WITH revision_history
RECURRING '1 0 * * *';
schedule_id | name | status | first_run | schedule | backup_stmt
---------------------+-------------------+------------------------------------------------+----------------------------------+-----------+-----------------------------------------------------------------------------------------------------------------------------------------------------
588819866656997377 | schedule_database | PAUSED: Waiting for initial backup to complete | NULL | 1 0 * * * | BACKUP DATABASE movr INTO LATEST IN 's3://test/schedule-database?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x' WITH revision_history, detached
588819866674233345 | schedule_database | ACTIVE | 2020-09-10 18:52:42.823003+00:00 | @weekly | BACKUP DATABASE movr INTO 's3://test/schedule-database?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x' WITH revision_history, detached
(2 rows)
Because the FULL BACKUP
clause is not included, CockroachDB also scheduled a full backup to run @weekly
. This is the default cadence for incremental backups RECURRING
> 1 hour but <= 1 day.
Create a scheduled backup for a table
This example creates a schedule for a backup of the table movr.vehicles
with revision history that's taken every hour:
> CREATE SCHEDULE schedule_table
FOR BACKUP TABLE movr.vehicles INTO 's3://test/schedule-table?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x'
WITH revision_history
RECURRING '@hourly';
schedule_id | name | status | first_run | schedule | backup_stmt
---------------------+----------------+------------------------------------------------+----------------------------------+----------+------------------------------------------------------------------------------------------------------------------------------------------------------
588820615348027393 | schedule_table | PAUSED: Waiting for initial backup to complete | NULL | @hourly | BACKUP TABLE movr.vehicles INTO LATEST IN 's3://test/schedule-table?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x' WITH revision_history, detached
588820615382302721 | schedule_table | ACTIVE | 2020-09-10 18:56:31.305782+00:00 | @daily | BACKUP TABLE movr.vehicles INTO 's3://test/schedule-table?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x' WITH revision_history, detached
(2 rows)
Because the FULL BACKUP
clause is not included, CockroachDB also scheduled a full backup to run @daily
. This is the default cadence for incremental backups RECURRING
<= 1 hour.
Create a scheduled backup with a scheduled first run
This example creates a schedule for a backup of the table movr.vehicles
with revision history that's taken every hour, with its first run scheduled for 2020-09-15 00:00:00.00
(UTC):
> CREATE SCHEDULE scheduled_first_run
FOR BACKUP TABLE movr.vehicles INTO 's3://test/schedule-table?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x'
WITH revision_history
RECURRING '@hourly'
WITH SCHEDULE OPTIONS first_run = '2020-09-15 00:00:00.00';
schedule_id | name | status | first_run | schedule | backup_stmt
---------------------+---------------------+------------------------------------------------+---------------------------+----------+----------------------------------------------------------------------------------------------------------------------------------------------------------
589963390457741313 | scheduled_first_run | PAUSED: Waiting for initial backup to complete | NULL | @hourly | BACKUP TABLE movr.vehicles INTO LATEST IN 's3://test/scheduled-first-run?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x' WITH revision_history, detached
589963390487363585 | scheduled_first_run | ACTIVE | 2020-09-15 00:00:00+00:00 | @daily | BACKUP TABLE movr.vehicles INTO 's3://test/scheduled-first-run?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x' WITH revision_history, detached
(2 rows)
Because the FULL BACKUP
clause is not included, CockroachDB also scheduled a full backup to run @daily
. This is the default cadence for incremental backups RECURRING
<= 1 hour.
Create a scheduled backup with schedule options
This example creates a schedule for a cluster backup with the on_previous_running
option:
> CREATE SCHEDULE schedule_option
FOR BACKUP INTO 's3://test/backups/schedule_test?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x'
RECURRING '@daily'
WITH SCHEDULE OPTIONS on_previous_running = 'start';
schedule_id | label | status | first_run | schedule | backup_stmt
---------------------+-----------------+------------------------------------------------+------------------------------+----------+------------------------------------------------------------------------------------------------------------------------------
866226603264475137 | schedule_option | PAUSED: Waiting for initial backup to complete | NULL | @daily | BACKUP INTO LATEST IN 's3://test/backups/schedule_test?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x' WITH detached
866226603270635521 | schedule_option | ACTIVE | 2023-05-18 14:56:20.39198+00 | @weekly | BACKUP INTO 's3://test/backups/schedule_test?AWS_ACCESS_KEY_ID=x&AWS_SECRET_ACCESS_KEY=x' WITH detached
(2 rows)
The schedule starts a new backup, even if the previous one is still running because the user specified option for on_previous_running = 'start'
. The incremental backup remains PAUSED
until the initial full backup is complete.
Because the FULL BACKUP
clause is not included, CockroachDB also schedules a full backup to run @daily
. This is the default cadence for incremental backups RECURRING
<= 1 hour.
View scheduled backup details
When a backup is created by a schedule, it is stored within a collection of backups in the given location. To view details for a backup created by a schedule, you can use the following:
SHOW BACKUPS IN collectionURI
statement to view a list of the full backup's subdirectories.SHOW BACKUP FROM subdirectory IN collectionURI
statement to view a list of the full and incremental backups that are stored in a specific full backup's subdirectory.- Use the Schedules page in the DB Console to view a list of created backup schedules and their individual details.
For more details, see SHOW BACKUP
.