Changefeed metrics


The Changefeed metrics let you monitor the performance of your changefeeds.

To view these graphs, select a cluster from the Clusters page, and click Metrics in the Monitoring section of the left side navigation. On the Metrics page, click the Changefeeds tab.

Time interval selection

The time interval selector at the top of each tab allows you to filter the view for a predefined or custom time interval. Use the navigation buttons to move to the previous, next, or current time interval. When you select a time interval, the same interval is selected for all charts on the Metrics page.

Changefeed Status

Short Name: Running
CockroachDB Metric Name: changefeed.running
Description: Number of currently running changefeeds, including sinkless.
Usage: This metric tracks the total number of all running changefeeds.

Short Name: Paused
CockroachDB Metric Name: jobs.changefeed.currently_paused
Description: Number of changefeed jobs currently considered Paused.
Usage: Monitor and alert on this metric to safeguard against the operational error of inadvertently leaving a changefeed job paused for an extended period of time. Changefeed jobs should not be paused for a long time because the protected timestamp prevents garbage collection.

Short Name: Failures
CockroachDB Metric Name: changefeed.failures
Description: Total number of changefeed jobs which have failed.
Usage: This metric tracks permanent changefeed job failures that the jobs system will not try to restart. Any increase in this counter should be investigated. An alert on this metric is recommended.
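The two alerting recommendations above can be sketched as simple checks over scraped samples. This is a minimal illustration, not part of CockroachDB or its Cloud Console: the `Sample` type, both function names, and the `max_paused_seconds` threshold are hypothetical, and the page does not specify how long "an extended period" is, so that value is left to the caller.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Sample:
    """One scrape of a metric: Unix time in seconds and the observed value."""
    timestamp: float
    value: float


def failures_increased(previous: Sample, current: Sample) -> bool:
    """True if changefeed.failures rose between two consecutive scrapes."""
    return current.value > previous.value


def paused_too_long(samples: Sequence[Sample], max_paused_seconds: float) -> bool:
    """True if jobs.changefeed.currently_paused has been above zero for an
    unbroken trailing streak of at least max_paused_seconds.

    samples must be ordered oldest to newest. max_paused_seconds is an
    assumed, operator-chosen threshold.
    """
    if not samples or samples[-1].value <= 0:
        return False
    # Walk backward to find where the current unbroken paused streak began.
    streak_start = samples[-1].timestamp
    for sample in reversed(samples):
        if sample.value <= 0:
            break
        streak_start = sample.timestamp
    return samples[-1].timestamp - streak_start >= max_paused_seconds
```

Because changefeed.failures counts only permanent failures, the first check fires on any increase, while the paused check only looks at how long the gauge has stayed above zero.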

Retryable Errors

Short Name: Errors
CockroachDB Metric Name: changefeed.error_retries
Description: Total retryable errors encountered by all changefeeds.
Usage: This metric tracks transient changefeed errors. Alert on "too many" errors, such as 50 retries in 15 minutes. Some increase is expected: during a rolling upgrade, for example, this counter rises because changefeed jobs restart following node restarts. Retries use an exponential backoff of up to 10 minutes. If no rolling upgrade or other cluster maintenance is in progress and the error rate is high, investigate the changefeed job.
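The "50 retries in 15 minutes" example can be expressed as the increase of the counter over a trailing window. The sketch below assumes the metric is scraped as (Unix time, counter value) pairs; the function names and the `maintenance_in_progress` flag are hypothetical, and the threshold is only the example figure given above.

```python
from typing import Sequence, Tuple

RETRY_THRESHOLD = 50       # example from the text: 50 retries ...
WINDOW_SECONDS = 15 * 60   # ... within 15 minutes


def retries_in_window(
    samples: Sequence[Tuple[float, float]],
    now: float,
    window_seconds: float = WINDOW_SECONDS,
) -> float:
    """Increase of changefeed.error_retries over the trailing window.

    samples holds (unix_time, counter_value) pairs ordered oldest to newest.
    """
    in_window = [value for ts, value in samples if now - window_seconds <= ts <= now]
    if len(in_window) < 2:
        return 0.0
    # Clamp at zero so a counter reset is not reported as a negative increase.
    return max(0.0, in_window[-1] - in_window[0])


def should_alert(
    samples: Sequence[Tuple[float, float]],
    now: float,
    maintenance_in_progress: bool = False,
) -> bool:
    """Alert on a high retry count unless known maintenance explains it."""
    if maintenance_in_progress:
        return False
    return retries_in_window(samples, now) >= RETRY_THRESHOLD
```

Suppressing the alert during maintenance mirrors the guidance that retries are expected while nodes restart in a rolling upgrade.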

Emitted Messages

Short Name: Emitted messages
CockroachDB Metric Name: changefeed.emitted_messages
Description: Messages emitted by all feeds.
Usage: This metric provides useful context when assessing the state of changefeeds. It characterizes the rate of changes being streamed from the CockroachDB cluster.

Emitted Bytes

Short Name: Emitted bytes
CockroachDB Metric Name: changefeed.emitted_bytes
Description: Bytes emitted by all feeds.
Usage: This metric provides useful context when assessing the state of changefeeds. It characterizes the throughput, in bytes, of the data being streamed from the CockroachDB cluster.
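To read either emitted metric as a rate, take the difference between two scrapes and divide by the elapsed time. The helper below is a sketch that assumes both metrics are cumulative counters; `per_second_rate` is a hypothetical name, not a CockroachDB API.

```python
def per_second_rate(earlier_value: float, later_value: float, elapsed_seconds: float) -> float:
    """Average per-second rate of a cumulative counter between two scrapes."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    # Clamp at zero so a counter reset does not yield a negative rate.
    return max(0.0, later_value - earlier_value) / elapsed_seconds


# Example: two scrapes of each counter taken 60 seconds apart.
messages_per_second = per_second_rate(1_000, 1_600, 60)
bytes_per_second = per_second_rate(2_000_000, 2_600_000, 60)
```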

Commit Latency

Short Name: P99, P90
CockroachDB Metric Name: changefeed.commit_latency
Description: Event commit latency: the difference between an event's MVCC timestamp and the time it was acknowledged by the downstream sink. If the sink batches events, the difference between the oldest event in the batch and the acknowledgement is recorded. Excludes latency during backfill.
Usage: This metric provides useful context when assessing the state of changefeeds. It characterizes the end-to-end lag between a committed change and that change being applied at the destination.
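The definition above, including the batching rule, can be illustrated with a small function. This only restates the description in code and is not CockroachDB's implementation; the function name and the use of plain Unix-second floats for timestamps are assumptions.

```python
from typing import Iterable


def commit_latency_seconds(event_mvcc_timestamps: Iterable[float], ack_time: float) -> float:
    """Commit latency for one acknowledged batch of events.

    For a batch, latency is measured from the oldest event's MVCC timestamp
    to the sink's acknowledgement; a single event is a batch of one.
    """
    oldest_event = min(event_mvcc_timestamps)
    return ack_time - oldest_event
```

For a batch with MVCC timestamps 100.0, 101.5, and 102.0 acknowledged at 103.0, the recorded latency is 3.0 seconds, because the oldest event sets the measurement.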

Oldest Protected Timestamp

Short Name: Protected Timestamp Age
CockroachDB Metric Name: jobs.changefeed.protected_age_sec
Description: The age of the oldest protected timestamp (PTS) record protected by changefeed jobs.
Usage: Changefeeds use protected timestamps to keep data from being garbage collected. Ensure the protected timestamp age does not significantly exceed the GC TTL zone configuration. Alert on this metric if the protected timestamp age is greater than 3 times the GC TTL.
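The alerting rule above reduces to one comparison. In this sketch the function name is hypothetical, and the GC TTL must be supplied by the caller from the relevant zone configuration; the values in the usage lines are illustrative only.

```python
ALERT_MULTIPLIER = 3  # from the text: alert above 3 times the GC TTL


def protected_age_exceeds_limit(
    protected_age_seconds: float,
    gc_ttl_seconds: float,
    multiplier: float = ALERT_MULTIPLIER,
) -> bool:
    """True if jobs.changefeed.protected_age_sec is above multiplier * GC TTL."""
    return protected_age_seconds > multiplier * gc_ttl_seconds


# Example with an assumed GC TTL of 4 hours (14400 s): the limit is 43200 s.
over_limit = protected_age_exceeds_limit(50_000, 14_400)
at_limit = protected_age_exceeds_limit(43_200, 14_400)
```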
