The SQL dashboard in the DB Console lets you monitor the performance of your SQL queries.
To view this dashboard, access the DB Console, click Metrics in the left-hand navigation, and then select Dashboard > SQL.
Dashboard navigation
Use the Graph menu to display metrics for your entire cluster or for a specific node.
To the right of the Graph and Dashboard menus, a time interval selector allows you to filter the view for a predefined or custom time interval. Use the navigation buttons to move to the previous, next, or current time interval. When you select a time interval, the same interval is selected in the SQL Activity pages. However, if you select 10 or 30 minutes, the interval defaults to 1 hour in SQL Activity pages.
Hovering your mouse pointer over the graph title will display a tooltip with a description and the metrics used to create the graph.
When hovering on graphs, crosshair lines will appear at your mouse pointer. The series' values corresponding to the given time in the cross hairs are displayed in the legend under the graph. Hovering the mouse pointer on a given series displays the corresponding value near the mouse pointer and highlights the series line (graying out other series lines). Click anywhere within the graph to freeze the values in place. Click anywhere within the graph again to cause the values to change with your mouse movements once more.
In the legend, click on an individual series to isolate it on the graph. The other series will be hidden, while the hover will still work. Click the individual series again to make the other series visible. If there are many series, a scrollbar may appear on the right of the legend. This is to limit the size of the legend so that it does not get endlessly large, particularly on clusters with many nodes.
For monitoring CockroachDB, it is sufficient to use the Open SQL Sessions, SQL Byte Traffic, SQL Statements, Service Latency, and Transactions graphs.
The SQL dashboard displays the following time series graphs:
Open SQL Sessions
In the node view, the graph shows the number of connections open between the client and the selected node.
- To control the maximum number of non-superuser (
root
user or otheradmin
role) connections a gateway node can have open at one time, use theserver.max_connections_per_gateway
cluster setting. If a new non-superuser connection would exceed this limit, the error message"sorry, too many clients already"
is returned, along with error code53300
.
- To control the maximum number of non-superuser (
In the cluster view, the graph shows the total number of SQL client connections to all nodes combined, with lines for each node.
SQL Connection Rate
The SQL Connection Rate is an average of the number of connection attempts per second over an aggregation window.
In the node view, the graph shows the rate of SQL connection attempts between clients and the selected node.
In the cluster view, the graph shows the rate of SQL connection attempts to all nodes, with lines for each node.
Upgrades of SQL Transaction Isolation Level
In the node view, the graph shows the total number of times a SQL transaction was upgraded to a stronger isolation level on the selected node.
In the cluster view, the graph shows the total number of times a SQL transaction was upgraded to a stronger isolation level across all nodes.
If this metric is non-zero, then transactions at weaker isolation levels (such as READ COMMITTED
) are being upgraded to SERIALIZABLE
instead. To ensure that READ COMMITTED
transactions run as READ COMMITTED
, see Enable READ COMMITTED
isolation.
Open SQL Transactions
In the node view, the graph shows the total number of open SQL transactions on the node.
In the cluster view, the graph shows the total number of open SQL transactions across all nodes in the cluster.
See the Transactions page for more details on the transactions.
Active SQL Statements
In the node view, the graph shows the total number of SQL statements running on that node.
In the cluster view, the graph shows the total number of SQL statements running across all nodes in the cluster.
See the Statements page for more details on the cluster's SQL statements.
SQL Byte Traffic
The SQL Byte Traffic graph helps you correlate SQL query count to byte traffic, especially in bulk data inserts or analytic queries that return data in bulk.
In the node view, the graph shows the current byte throughput (bytes/second) between all the connected SQL clients and the node. There are lines for bytes in and bytes out.
In the cluster view, the graph shows the aggregate client throughput across all nodes. There are lines for bytes in and bytes out.
SQL Statements
In the node view, the graph shows the 10-second average of the number of
SELECT
/INSERT
/UPDATE
/DELETE
statements per second issued by SQL clients on the node.In the cluster view, the graph shows the sum of the per-node averages, that is, an aggregate estimation of the current statement load over the cluster, assuming the last 10 seconds of activity per node are representative of this load.
See the Statements page for more details on the cluster's SQL statements.
Metrics: sql.select.count
, sql.update.count
, sql.insert.count
, sql.delete.count
The following SQL statements update the INSERT
metric (sql.insert.count
):
INSERT ... ON CONFLICT DO UPDATE ...
: Even when theDO UPDATE
clause is actually executed, the root of the abstract syntax tree (AST) is used to increment the metric, rather than the actual execution details.
Data manipulation statements other than SELECT
/INSERT
/UPDATE
/DELETE
/UPSERT
update the sql.misc.count
metric, which is not displayed on this graph.
SQL Statement Errors
In the node view, the graph shows the 10-second average of the number of SQL statements issued to the node that returned a planning, runtime, or retry error.
In the cluster view, the graph shows the 10-second average of the number of SQL statements that returned a planning, runtime, or retry error across all nodes.
See the Statements page for more details on the cluster's SQL statements.
SQL Statement Contention
The statement contention metric is a counter that represents the number of statements that have experienced contention. If a statement experiences at least one contention "event" (i.e., the statement is forced to wait for another transaction), the counter is incremented at most once.
In the node view, the graph shows the total number of SQL statements that experienced contention on that node.
In the cluster view, the graph shows the total number of SQL statements that experienced contention across all nodes in the cluster.
See the Statements page for more details on the cluster's SQL statements.
Full Table/Index Scans
In the node view, the graph shows the total number of full table and index scans per second on that node. This is a non-negative rate.
In the cluster view, the graph shows the total number of full table and index scans per second across all nodes in the cluster. This is a non-negative rate.
You can identify the specific statements that result in full table scans by filtering for them on the Statements page. Examine these statements that result in full table scans and consider adding secondary indexes.
Active Flows for Distributed SQL Statements
In the node view, the graph shows the number of flows on that node contributing to the running distributed SQL statements.
In the cluster view, the graph shows the number of flows across all nodes in the cluster contributing to the running distributed SQL statements.
Connection Latency: 99th Percentile
Connection latency is calculated as the time in nanoseconds between when the cluster receives a connection request and establishes the connection to the client, including authentication.
In the node view, the graph shows the 99th percentile of connection latency for the node. Over the last minute this node established 99% of connections within this time, not including network latency between the node and the client.
In the cluster view, the graph shows the 99th percentile of connection latency across all nodes in the cluster. There are lines for each node in the cluster. Over the last minute the cluster established 99% of connections within this time, not including network latency between the node and the client.
Connection Latency: 90th Percentile
Connection latency is calculated as the time in nanoseconds between when the cluster receives a connection request and establishes the connection to the client, including authentication.
Service Latency: SQL, 99.99th percentile
Service latency is calculated as the time in nanoseconds between when the cluster receives a query and finishes executing the query. This time does not include returning results to the client. Service latency includes metrics only from DML (SELECT
,INSERT
, UPDATE
, and DELETE
) statements.
In the node view, the graph shows the 99.99th percentile of service latency for the node. Over the last minute this node executed 99.99% of queries within this time, not including network latency between the node and the client.
In the cluster view, the graph shows the 99.99th percentile of service latency across all nodes in the cluster. There are lines for each node in the cluster. Over the last minute the cluster executed 99.99% of queries within this time, not including network latency between the node and the client.
Service Latency: SQL, 99.9th percentile
Service latency is calculated as the time in nanoseconds between when the cluster receives a query and finishes executing the query. This time does not include returning results to the client. Service latency includes metrics only from DML (SELECT
,INSERT
, UPDATE
, and DELETE
) statements.
In the node view, the graph shows the 99.9th percentile of service latency for the node. Over the last minute this node executed 99.9% of queries within this time, not including network latency between the node and the client.
In the cluster view, the graph shows the 99.9th percentile of service latency across all nodes in the cluster. There are lines for each node in the cluster. Over the last minute the cluster executed 99.9% of queries within this time, not including network latency between the node and the client.
Service Latency: SQL, 99th percentile
Service latency is calculated as the time in nanoseconds between when the cluster receives a query and finishes executing the query. This time does not include returning results to the client. Service latency includes metrics only from DML (SELECT
,INSERT
, UPDATE
, and DELETE
) statements.
In the node view, the graph shows the 99th percentile of service latency for the node. Over the last minute this node executed 99% of queries within this time, not including network latency between the node and the client.
In the cluster view, the graph shows the 99th percentile of service latency across all nodes in the cluster. There are lines for each node in the cluster. Over the last minute the node executed 99% of queries within this time, not including network latency between the node and the client.
Service Latency: SQL, 90th percentile
Service latency is calculated as the time in nanoseconds between when the cluster receives a query and finishes executing the query. This time does not include returning results to the client. Service latency includes metrics only from DML (SELECT
,INSERT
, UPDATE
, and DELETE
) statements.
In the node view, the graph shows the 90th percentile of service latency for the node. Over the last minute this node executed 90% of queries within this time, not including network latency between the node and the client.
In the cluster view, the graph shows the 90th percentile of service latency across all nodes in the cluster. There are lines for each node in the cluster. Over the last minute the cluster executed 90% of queries within this time, not including network latency between the node and the client.
KV Execution Latency: 99th percentile
KV execution latency is calculated as the time in milliseconds between when the KV layer receives the request and delivers a response.
In the node view, the graph shows the 99th percentile of KV execution latency for the node. Over the last minute the node executed 99% of requests within this time.
In the cluster view, the graph shows the 99th percentile of KV execution latency for each node in the cluster. There are lines for each node in the cluster. Over the last minute the node executed 99% of requests within this time.
KV Execution Latency: 90th percentile
KV execution latency is calculated as the time in milliseconds between when the KV layer receives the request and delivers a response.
In the node view, the graph shows the 90th percentile of KV execution latency for the node. Over the last minute the node executed 90% of requests within this time.
In the cluster view, the graph shows the 90th percentile of KV execution latency for each node in the cluster. There are lines for each node in the cluster. Over the last minute the node executed 90% of requests within this time.
Transactions
In the node view, the graph shows the 10-second average of the number of opened (
Begin
), committed (Commits
), rolled back (Rollbacks
), and aborted (Aborts
) transactions per second issued by SQL clients on the node.In the cluster view, the graph shows the sum of the per-node averages, that is, an aggregate estimation of the current transactions load over the cluster, assuming the last 10 seconds of activity per node are representative of this load.
If the graph shows excessive aborts or rollbacks, it might indicate issues with the SQL statements. In that case, re-examine statements to lower contention.
See the Transactions page for more details on the transactions.
Transaction Restarts
In the node view, the graph shows the number of transactions restarted on that node broken down by the errors that caused the restart.
In the cluster view, the graph shows the number of transactions restarted across the cluster broken down by the errors that caused the restart.
See the Transaction Retry Error Reference for details on the errors that caused the transaction to restart.
Transaction Latency: 99th percentile
Transaction latency is calculated as the total time in nanoseconds a transaction took to complete.
In the node view, the graph shows the 99th percentile of transaction time over a 1 minute period for the node. Over the last minute the node completed 99% of transactions within this time.
In the cluster view, the graph shows the 99th percentile of transaction time over a 1 minute period for each node in the cluster. Over the last minute the node completed 99% of transactions within this time.
See the Transactions page for more details on the transactions.
Transaction Latency: 90th percentile
Transaction latency is calculated as the total time in nanoseconds a transaction took to complete.
In the node view, the graph shows the 90th percentile of transaction time over a 1 minute period for the node. Over the last minute the node completed 90% of transactions within this time.
In the cluster view, the graph shows the 90th percentile of transaction time over a 1 minute period for each node in the cluster. Over the last minute the node completed 90% of transactions within this time.
See the Transactions page for more details on the transactions.
SQL Memory
In the node view, the graph shows the current amount of memory in KiB allocated to the SQL layer on this node. This amount is what is compared against the node's
--max-sql-memory
flag.In the cluster view, the graph shows the current amount of memory in KiB allocated to the SQL layer on all nodes in the cluster. This amount is what is compared against the node's
--max-sql-memory
flag.
Expected values for a healthy cluster: This value should not exceed the --max-sql-memory
size. A healthy threshold is 75% of allocated --max-sql-memory
.
Schema Changes
In the node view, the graph shows the total number of DDL statements per second on the node.
In the cluster view, the graph shows the total number of DDL statements per second across all nodes in the cluster.
Statement Denials: Cluster Settings
Statement denials are statements that were denied due to a cluster setting with the following format:
{feature}.{statement_type}.enabled = FALSE
In the node view, the graph shows the total number of statements denied per second on this node.
In the cluster view, the graph shows the total number of statements denied per second across all nodes in the cluster.
Summary and events
Summary panel
A Summary panel of key metrics is displayed to the right of the timeseries graphs.
Metric | Description |
---|---|
Total Nodes | The total number of nodes in the cluster. Decommissioned nodes are not included in this count. |
Capacity Used | The storage capacity used as a percentage of usable capacity allocated across all nodes. |
Unavailable Ranges | The number of unavailable ranges in the cluster. A non-zero number indicates an unstable cluster. |
Queries per second | The total number of SELECT , UPDATE , INSERT , and DELETE queries executed per second across the cluster. |
P99 Latency | The 99th percentile of service latency. |
If you are testing your deployment locally with multiple CockroachDB nodes running on a single machine (this is not recommended in production), you must explicitly set the store size per node in order to display the correct capacity. Otherwise, the machine's actual disk capacity will be counted as a separate store for each node, thus inflating the computed capacity.
Events panel
Underneath the Summary panel, the Events panel lists the 5 most recent events logged for all nodes across the cluster. To list all events, click View all events.
The following types of events are listed:
- Database created
- Database dropped
- Table created
- Table dropped
- Table altered
- Index created
- Index dropped
- View created
- View dropped
- Schema change reversed
- Schema change finished
- Node joined
- Node decommissioned
- Node restarted
- Cluster setting changed