Monitor CockroachDB Self-Hosted Clusters with Datadog

Datadog is a monitoring and security platform for cloud applications. The integration of your CockroachDB self-hosted cluster with Datadog enables data collection and alerting on selected CockroachDB metrics using the Datadog platform.

Tip:

This tutorial covers the CockroachDB self-hosted integration with Datadog. For the CockroachDB Cloud integration with Datadog, refer to Monitor CockroachDB Dedicated with Datadog instead.

The integration of your CockroachDB self-hosted cluster with Datadog is powered by the Datadog Agent and supported directly by Datadog.

In this tutorial, you will enable the CockroachDB integration in Datadog, configure logging and alerting, and visualize data.

Before you begin

Before you can follow the steps presented in this tutorial, you must have:

Step 1. Enable CockroachDB integration

To enable the CockroachDB check for your installed Datadog Agent, navigate to the Integrations page and find CockroachDB in the list of available integrations. Hover over the icon and click + Install.

CockroachDB integration for Datadog installation

Note that you must restart the Datadog Agent for the change to take effect. CockroachDB will then be listed as an installed integration.

Step 2. Configure Datadog Agent for CockroachDB

Follow the steps in the Datadog documentation to access and edit your cockroachdb.d/conf.yaml file in your Datadog conf.d/ directory. For all available options, see the sample cockroachdb.d/conf.yaml.

After making the following changes, restart the Datadog Agent to apply them.

Enable metrics collection

Uncomment the following line in cockroachdb.d/conf.yaml:

- prometheus_url: http://localhost:8080/_status/vars

This enables metrics collection via CockroachDB's Prometheus endpoint.
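Before restarting the Agent, it can help to confirm that the endpoint actually responds. The following is a minimal sketch, assuming a node listening on the default HTTP port shown above; count_metric_lines is a hypothetical helper name, not part of CockroachDB or Datadog.

```shell
# Hypothetical helper: fetch the Prometheus endpoint and count the
# metric lines (every line that is not a "#" comment).
# Usage: count_metric_lines [URL [extra curl options...]]
count_metric_lines() {
  url="${1:-http://localhost:8080/_status/vars}"
  if [ "$#" -gt 0 ]; then shift; fi
  # --fail makes curl return an error on HTTP failures instead of
  # printing the error page; grep -vc '^#' counts non-comment lines.
  # If nothing comes back, this prints 0 and returns non-zero.
  curl --silent --fail "$@" "$url" | grep -vc '^#'
}

# Example invocations (not run here):
#   count_metric_lines
#   count_metric_lines https://localhost:8080/_status/vars --cacert certs/ca.crt
```

A non-zero count suggests the Agent will be able to scrape the same URL. The https form with --cacert is an assumption for clusters that serve the endpoint over TLS; adjust it to match your deployment.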

Configure security certificates

Uncomment the lines that begin with tls_private_key and tls_ca_cert. These should specify the full file paths to your client private key and CA certificate, respectively.

For example, if you used cockroach cert to secure your cluster, the paths would look something like:

tls_private_key: /custom/dir/path/client.root.key
tls_ca_cert: /custom/dir/path/ca.crt
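A common cause of a failing check is a key or certificate that the Agent cannot read. The sketch below is one way to check the configured paths; check_tls_files is a hypothetical helper, and the paths in the usage comment are the illustrative ones from the example above.

```shell
# Hypothetical helper: report every given file that is missing or
# unreadable by the current user, and return non-zero if any is.
check_tls_files() {
  status=0
  for f in "$@"; do
    if [ ! -r "$f" ]; then
      echo "not readable: $f" >&2
      status=1
    fi
  done
  return "$status"
}

# Example invocation (not run here):
#   check_tls_files /custom/dir/path/client.root.key /custom/dir/path/ca.crt
```

The result reflects the permissions of whoever runs it, so it is most informative when run as the same user that the Datadog Agent runs as.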

Enable log collection

Optionally enable Datadog to collect CockroachDB logs by adding the following block to cockroachdb.d/conf.yaml:

logs:
 - type: file
   path: /cockroach-data/logs/cockroach.log
   source: cockroachdb
   service: cockroachdb
   log_processing_rules:
   - type: multi_line
     name: new_log_start_with_status_and_date
     pattern: '[A-Z]\d{6}\s\d+\:\d+\:\d+\.\d+'

The path value specifies the default CockroachDB log file and location.
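To see what the multi_line rule treats as the start of a new log entry, you can try the pattern against a few lines. This is a rough sketch: the pattern is translated to POSIX extended syntax for grep (\d becomes [0-9], \s becomes [[:space:]]) and anchored to the start of the line, which is an assumption about how the rule is applied. The sample lines are illustrative, not taken from a real cluster.

```shell
# POSIX ERE translation of the multi_line pattern from conf.yaml.
pattern='^[A-Z][0-9]{6}[[:space:]][0-9]+:[0-9]+:[0-9]+\.[0-9]+'

# Hypothetical helper: succeeds if the line looks like the start of a
# new log entry (severity letter, six-digit date, then a timestamp).
starts_new_entry() {
  printf '%s\n' "$1" | grep -Eq "$pattern"
}

starts_new_entry 'I210714 16:34:56.123456 1 example message' && echo "new entry"
starts_new_entry '    continuation of a stack trace' || echo "continuation"
# prints:
#   new entry
#   continuation
```

Lines that do not match are appended to the preceding entry, which keeps multi-line messages such as stack traces together as a single log event.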

Note:

You can configure both the CockroachDB logging directory and log files.

Log collection is disabled by default in the Datadog Agent. Follow the steps in the Datadog documentation to activate log collection.

Step 3. Verify Datadog connection to CockroachDB

To apply your configuration changes, ensure that you have restarted the Datadog Agent.

Run the Datadog Agent status subcommand for your platform to verify that the CockroachDB integration is properly connected.

Your output should contain a block like this:

cockroachdb (1.6.0)
-------------------
  Instance ID: cockroachdb:42170ecfbfb171c5 [OK]
  Configuration Source: file:/opt/datadog-agent/etc/conf.d/cockroachdb.d/conf.yaml
  Total Runs: 4
  Metric Samples: Last Run: 548, Total: 2,154
  Events: Last Run: 0, Total: 0
  Service Checks: Last Run: 1, Total: 4
  Average Execution Time : 143ms
  Last Execution Date : 2021-07-14 12:34:56 EDT / 2021-07-14 16:34:56 UTC (1626280496000)
  Last Successful Execution Date : 2021-07-14 12:34:56 EDT / 2021-07-14 16:34:56 UTC (1626280496000)
  metadata:
    version.major: 24
    version.minor: 2
    version.patch: 0
    version.raw: v24.2.0-alpha.00000000-1724-gc5c74249f7
    version.release: alpha.0
    version.scheme: semver
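If you want to script this verification, one option is to search the status output for the instance line. The sketch below assumes the output format shown above; cockroachdb_check_ok is a hypothetical helper that reads the status text on standard input.

```shell
# Hypothetical helper: read Agent status output on stdin and succeed
# only if a cockroachdb instance is reported as [OK].
cockroachdb_check_ok() {
  grep -Eq 'Instance ID: cockroachdb:[0-9a-f]+ \[OK\]'
}

# Example invocation on Linux (not run here; the status subcommand
# differs by platform):
#   sudo datadog-agent status | cockroachdb_check_ok && echo "check is healthy"
```

The helper returns a non-zero exit status if the instance is missing or reported in any state other than [OK].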

Step 4. View CockroachDB dashboards on Datadog

Open your Datadog Dashboard List and click on CockroachDB Overview:

CockroachDB Overview dashboard in Datadog Dashboard List

This sample dashboard presents metrics on cluster availability, query performance, and resource usage:

CockroachDB Overview dashboard for Datadog

Note:

If you wish to customize your CockroachDB dashboard, it's recommended that you clone the default CockroachDB Overview dashboard before adding and removing widgets. If you leave the default dashboard intact, Datadog will update it when new versions of the integration's dashboard are released.

Step 5. Run a sample workload

To test the dashboard functionality, use cockroach workload to run a sample workload on the cluster.

Initialize the workload for MovR, a fictional vehicle-sharing company:

cockroach workload init movr 'postgresql://root@localhost:26257?sslcert=certs%2Fclient.root.crt&sslkey=certs%2Fclient.root.key&sslmode=verify-full&sslrootcert=certs%2Fca.crt'

Run the MovR workload for 5 minutes:

cockroach workload run movr --duration=5m 'postgresql://root@localhost:26257?sslcert=certs%2Fclient.root.crt&sslkey=certs%2Fclient.root.key&sslmode=verify-full&sslrootcert=certs%2Fca.crt'

The query metrics will appear on the dashboard:

CockroachDB Overview dashboard for Datadog with SQL metrics

Step 6. Add monitoring and alerting

Follow the steps in the Datadog documentation to create a new Monitor.

Select Threshold Alert as the detection method. You can use this option to configure an alert that is sent when a supported metric reaches a given threshold. For descriptions of some useful CockroachDB alerts, see Monitoring and Alerting.

The example alert below will trigger when a node has less than 15% of storage capacity remaining:

CockroachDB Threshold Alert in Datadog

  • cockroachdb.capacity.available is divided by cockroachdb.capacity.total to determine the fraction of available capacity on the node's store (the directory on each node where CockroachDB reads and writes its data).
  • The alert threshold is set to 0.15.
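The arithmetic behind this alert can be sketched outside Datadog as follows. below_threshold is a hypothetical helper, and the capacity values are made-up numbers for illustration; in practice Datadog evaluates the two metrics for you.

```shell
# Hypothetical helper: succeed when available / total is below the
# threshold (default 0.15, matching the alert above).
# Usage: below_threshold AVAILABLE TOTAL [THRESHOLD]
below_threshold() {
  awk -v avail="$1" -v total="$2" -v thr="${3:-0.15}" \
    'BEGIN { exit !((avail / total) < thr) }'
}

# 12 of 100 units available: 0.12 is below 0.15, so the alert would fire.
below_threshold 12 100 && echo "alert"
# 40 of 100 units available: 0.40 is above 0.15, so it would not.
below_threshold 40 100 || echo "ok"
# prints:
#   alert
#   ok
```

The helper does not guard against a total of zero; a store that reports no capacity would need separate handling.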

The timeseries graph at the top of the page indicates the configured metric and threshold:

CockroachDB Threshold Alert in Datadog

Step 7. Disable DB Console's local storage of metrics (optional)

If you rely on external tools such as Datadog for storing and visualizing your cluster's time-series metrics, Cockroach Labs recommends that you disable the DB Console's storage of time-series metrics.

When storage of time-series metrics is disabled, the cluster continues to expose its metrics via the Prometheus endpoint. The DB Console stops storing new time-series cluster metrics and eventually deletes historical data. The Metrics dashboards in the DB Console are still available, but their visualizations are blank. This is because the dashboards rely on data that is no longer available. You can create queries, visualizations, and alerts in Datadog based on the data it is collecting from your cluster's Prometheus endpoint.

Known limitations

  • The integration of your CockroachDB self-hosted cluster with Datadog only supports displaying cluster-wide averages of reported metrics. Filtering by a specific node is unsupported.
