I’m often asked by developers how they can squeeze the most performance out of their database. While there are other SQL performance best practices to consider, by far the easiest (and biggest) performance improvement comes from good use of indexes. I’ll break this down into two categories: (1) knowing what’s possible, and (2) knowing what you’re actually getting.
How to get the most out of your indexes
Consider the following query, where we want to know the last 10 messages a user has received:
SELECT title, body, sender, datetime
FROM messages
WHERE email = 'person@example.com'
ORDER BY datetime DESC
LIMIT 10;
How much work does the database have to do to retrieve the result? Without an index, the server will have to load every row in the table (this is called a table scan), throw away the ones that don’t match, then perform a sort operation, just to return the 10 messages we need, in order from newest to oldest. A distributed database can distribute the query and most of the work of the sort operation, but can’t reduce the workload without using an index.
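To make the examples concrete, here’s a plausible shape for the messages table. The post never shows the schema, so the column types below are our assumption; all that matters is that the columns used in the query exist.
-- Hypothetical schema for the examples in this post (types are assumptions).
CREATE TABLE messages (
    id       UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email    STRING NOT NULL,       -- recipient address, used in the WHERE clause
    sender   STRING NOT NULL,
    title    STRING,
    body     STRING,
    datetime TIMESTAMPTZ NOT NULL   -- arrival time, used for ORDER BY
);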
A good index will:
Filter the data efficiently
Eliminate the need to sort the data
Answer your query with data found in the index
We might phrase these as rules for indexes:
Rule #1: Filter your query with your index
Rule #2: Sort your query with your index
Rule #3: Cover your query with your index
We’ll get to these shortly, but in order to see what’s happening, you need visibility into the database’s behavior, so let’s call that Rule #0: observe your query with EXPLAIN ANALYZE.
Rule #0: Use statement plans to see how your database services your query
If you don’t know what the database is actually doing, you can’t be sure you’ve done things right. Every database provides this visibility in a slightly different way, but we’ll be looking at CockroachDB’s physical statement plan. Prefixing a query with EXPLAIN ANALYZE tells the database to run it, break the work down into stages, and report what you need to know about each stage. To see this, run the query with that prefix:
EXPLAIN ANALYZE SELECT title, body, sender, datetime
FROM messages
WHERE email = 'person@example.com'
ORDER BY datetime DESC
LIMIT 10;
The result will look something like this:
info
- - -
planning time: 683µs
execution time: 3s
distribution: local
vectorized: true
rows read from KV: 1,000,000 (226 MiB)
cumulative time spent in KV: 3s
maximum memory usage: 10 MiB
network usage: 0 B (0 messages)
• top-k
│ nodes: n1
│ actual row count: 10
│ estimated max memory allocated: 10 KiB
│ estimated max sql temp disk usage: 0 B
│ estimated row count: 2
│ order: -datetime
│ k: 10
│
└── • filter
│ nodes: n1
│ actual row count: 1,000
│ estimated row count: 2
│ filter: email = 'person@example.com'
│
└── • scan
nodes: n1
actual row count: 1,000,000
KV time: 3s
KV contention time: 0µs
KV rows read: 1,000,000
KV bytes read: 226 MiB
estimated max memory allocated: 10 MiB
estimated row count: 1,000,000 (100% of the table; stats collected 6 seconds ago)
table: messages@primary
spans: FULL SCAN
(35 rows)
Time: 3.106s total (execution 2.991s / network 0.115s)
A statement plan is read from bottom to top. Ignore the Time field at the bottom: it varies a lot depending on factors outside your control, such as whether your cluster is under load, what type of load it is, and how much of the result is already in memory, so don’t put much stock in it.
What you want to look at is everything else. First, note that there are three stages to this query. The scan stage shows that the table had to be scanned and all 1 million rows loaded (see actual row count); the line that says table: messages@primary shows it’s reading the table itself. Those 1,000,000 rows are passed to the filter stage, which trims them down to 1,000 rows. Those, in turn, are passed to the top-k stage, which sorts the 1,000 rows and returns the 10 we actually wanted.
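One more thing worth knowing when reading these plans: the estimated row count lines come from table statistics that CockroachDB collects in the background (the plan even tells you how fresh they are, e.g. “stats collected 6 seconds ago”). If the estimates look wildly different from the actual counts, you can inspect the statistics directly:
SHOW STATISTICS FOR TABLE messages;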
There’s a better way.
Rule #1: Filter your query with your index to avoid a full table scan
The only way to avoid that table scan is to build an index. The simplest index that helps the most is a single-column index on the email field. This index is ordered by email, enabling the server to isolate only the user’s messages and ignore the other 999,000 rows in the table.
CREATE INDEX email_idx ON messages (email);
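If you want to confirm the index exists before re-running the query, you can list the table’s indexes (output omitted here):
SHOW INDEXES FROM messages;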
Now that the index is built, let’s look at the statement plan again.
info
- - -
planning time: 557µs
execution time: 121ms
distribution: local
vectorized: true
rows read from KV: 2,000 (280 KiB)
cumulative time spent in KV: 120ms
maximum memory usage: 1.1 MiB
network usage: 0 B (0 messages)
• top-k
│ nodes: n1
│ actual row count: 10
│ estimated max memory allocated: 10 KiB
│ estimated max sql temp disk usage: 0 B
│ estimated row count: 2
│ order: -datetime
│ k: 10
│
└── • index join
│ nodes: n1
│ actual row count: 1,000
│ KV time: 117ms
│ KV contention time: 0µs
│ KV rows read: 1,000
│ KV bytes read: 211 KiB
│ estimated max memory allocated: 960 KiB
│ estimated max sql temp disk usage: 0 B
│ estimated row count: 2
│ table: messages@primary
│
└── • scan
nodes: n1
actual row count: 1,000
KV time: 3ms
KV contention time: 0µs
KV rows read: 1,000
KV bytes read: 69 KiB
estimated max memory allocated: 120 KiB
estimated row count: 2 (<0.01% of the table; stats collected 2 minutes ago)
table: messages@email_idx
spans: [/'person@example.com' - /'person@example.com']
(41 rows)
Time: 228ms total (execution 122ms / network 107ms)
It still has three stages, but take a closer look: the scan stage now touches only 1,000 rows, rather than 1 million. The line that says table: messages@email_idx shows that CockroachDB is reading the messages table’s email_idx index we just created, and the actual row count confirms it only had to look at the 1,000 matching entries.
The second stage had been filter, but is now index join. The index contains only the email values (plus each row’s primary key), so after scanning the relevant portion of the index, the database still has to go to the table and look up each of those rows to get the datetime it needs to sort by, along with the other columns we asked for.
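If you want to see the scan stage on its own, CockroachDB lets you pin a query to a specific index with the table@index hint. A query that asks only for the email column is fully answered by email_idx, so its plan should show just a scan with no index join (this is purely for exploration; normally the optimizer chooses the index for you):
EXPLAIN ANALYZE SELECT email
FROM messages@email_idx
WHERE email = 'person@example.com';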
Already, this is a big improvement: instead of scanning through 1 million rows and throwing away 99.9% of them, the database is now looking at 1,000 index entries, then looking up only the 1,000 rows it needs. The final stage, top-k (where the database performs the sort operation), is exactly the same as before. To eliminate that stage, we’re going to need an even better index.
Rule #2: Sort your query with your index to speed up the query
The index we created is ordered by email, but it contains no information about the datetime we need to sort by; for that, the database has to pull the relevant rows from the table. A better index would also include the datetime, and ideally would be sorted by it. Since two columns are involved, we’ll use a composite index on both fields: first email, then datetime.
CREATE INDEX email_datetime_idx ON messages (email, datetime);
While we’re doing this, we should drop the old index – we won’t need it anymore.
DROP INDEX email_idx;
With that done, we can now see how our new index performs. Let’s run the query again, and look at the statement plan:
EXPLAIN ANALYZE SELECT title, body, sender, datetime
FROM messages
WHERE email = 'person@example.com'
ORDER BY datetime DESC
LIMIT 10;
info
- - -
planning time: 796µs
execution time: 6ms
distribution: local
vectorized: true
rows read from KV: 20 (2.7 KiB)
cumulative time spent in KV: 6ms
maximum memory usage: 60 KiB
network usage: 0 B (0 messages)
• index join
│ nodes: n1
│ actual row count: 10
│ KV time: 3ms
│ KV contention time: 0µs
│ KV rows read: 10
│ KV bytes read: 1.9 KiB
│ estimated max memory allocated: 30 KiB
│ estimated max sql temp disk usage: 0 B
│ estimated row count: 10
│ table: messages@primary
│
└── • revscan
nodes: n1
actual row count: 10
KV time: 2ms
KV contention time: 0µs
KV rows read: 10
KV bytes read: 780 B
estimated max memory allocated: 20 KiB
estimated row count: 10 (<0.01% of the table; stats collected 43 seconds ago)
table: messages@email_datetime_idx
spans: [/'person@example.com' - /'person@example.com']
limit: 10
(33 rows)
Time: 92ms total (execution 8ms / network 84ms)
The first thing to note is that there are now only two stages. The bottom one (revscan) tells us that the cluster is scanning a region of the index in reverse order, since the index stores datetime in ascending order and we want the newest messages first; the table row shows it’s using the index we just created. The actual row count is now only 10: the database went straight to the index and found exactly the 10 entries this query needs. The final stage is an index join, which looks up just those 10 rows in the table.
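As an aside, the reverse scan is already cheap, so there’s nothing to fix here, but if you’d rather see a forward scan you can declare the sort direction when creating the index. This is a variation we’re mentioning for completeness, not something the example needs:
CREATE INDEX email_datetime_desc_idx ON messages (email, datetime DESC);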
Again, this is a huge improvement. Our first index, on just email, took us from scanning 1 million table rows that had to be filtered by email and then sorted to scanning 1,000 index entries, looking them up in the table individually, and then sorting them. So a scan of 1 million rows became a scan of 1,000 index entries, plus 1,000 lookup operations (the join), eliminating the need to look at the other 99.9% of the table. Now we’re only scanning 10 index entries and looking up 10 rows from the table. So we’re eliminating 99.999% of the table rows we looked at before, and we don’t have to sort anything (because the index is already sorted). Now let’s take it to the next level.
Rule #3: Cover your query with your index to avoid the table entirely
Take another look at the original query:
SELECT title, body, sender, datetime
FROM messages
WHERE email = 'person@example.com'
ORDER BY datetime DESC
LIMIT 10;
It’s asking for the data from four columns, as you can see in the SELECT clause: title, body, sender, and datetime. The indexes we’ve looked at so far, however, are on email and datetime. Both are useful for filtering and sorting, but datetime is only one of the fields we need to display, so we still have to go to the table for the rest of the data in the row.
However, you can see that if you only needed the datetime, the behavior is even simpler:
EXPLAIN ANALYZE SELECT datetime FROM messages WHERE email = 'person@example.com' ORDER BY datetime DESC LIMIT 10;
info
- - -
planning time: 853µs
execution time: 3ms
distribution: local
vectorized: true
rows read from KV: 10 (780 B)
cumulative time spent in KV: 2ms
maximum memory usage: 20 KiB
network usage: 0 B (0 messages)
• revscan
nodes: n1
actual row count: 10
KV time: 2ms
KV contention time: 0µs
KV rows read: 10
KV bytes read: 780 B
estimated max memory allocated: 20 KiB
estimated row count: 2 (<0.01% of the table; stats collected 1 minute ago)
table: messages@email_datetime_idx
spans: [/'person@example.com' - /'person@example.com']
limit: 10
(21 rows)
Time: 85ms total (execution 4ms / network 81ms)
Now there’s only one stage: revscan. The database found the 10 index entries it needed, and because those entries contained all the information needed to answer the query, it was able to do that without going to the table at all. There was no need! This is called a covered query, and it’s the key to really optimizing your reads by eliminating the need to even touch the table to answer a query.
But, of course, we only managed to cover the query because we left out most of the columns we actually need, which isn’t very helpful.
One solution would be to add the remaining columns to the composite index’s key, but that isn’t ideal: those columns would needlessly become part of the index’s sort order and bloat its keys. The better approach is to store the data we need alongside the index keys we use to filter and sort.
CREATE INDEX email_datetime_storing_title_body_sender_idx
ON
messages (email, datetime)
STORING (title, body, sender);
… and drop the old index
DROP INDEX email_datetime_idx;
… and take a look at what the database does to answer the original query:
EXPLAIN ANALYZE SELECT title, body, sender, datetime
FROM messages
WHERE email = 'person@example.com'
ORDER BY datetime DESC
LIMIT 10
;
info
- - -
planning time: 2ms
execution time: 6ms
distribution: local
vectorized: true
rows read from KV: 10 (2.1 KiB)
cumulative time spent in KV: 5ms
maximum memory usage: 20 KiB
network usage: 0 B (0 messages)
• revscan
nodes: n1
actual row count: 10
KV time: 5ms
KV contention time: 0µs
KV rows read: 10
KV bytes read: 2.1 KiB
estimated max memory allocated: 20 KiB
estimated row count: 2 (<0.01% of the table; stats collected 3 minutes ago)
table: messages@email_datetime_storing_title_body_sender_idx
spans: [/'person@example.com' - /'person@example.com']
limit: 10
(21 rows)
Time: 88ms total (execution 9ms / network 79ms)
This is as streamlined as you can get with an index. CockroachDB got all of the information it needed by reading only the 10 relevant index entries (which were located together, in the order needed). There was no point in going to the table, because all four columns are in the index: the datetime is part of the index key, and the other three columns are stored with the index entry.
So now we’ve eliminated not only the need to scan the table or sort rows, but also the need to look up the table data separately. That’s the power of a good index.
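One habit worth forming: you don’t have to execute a statement to see whether it’s covered. A plain EXPLAIN (without ANALYZE) shows the plan without running the query, which is handy when the table is large:
EXPLAIN SELECT title, body, sender, datetime
FROM messages
WHERE email = 'person@example.com'
ORDER BY datetime DESC
LIMIT 10;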
Caveats to these index best practices
There’s a lot we could have said about indexes but this is a blog post, not a novel, so we kept things simple. (We also made an entire course about query performance that you can take for free.) Nevertheless, here are three things to keep in mind.
First, indexes have to be kept in sync with the table rows they point to. That means a write to a row has to go to the table and to the index. And because you don’t want the index to be eventually consistent, this means you need a transaction, along with its associated overhead. So don’t build too many indexes on a table, because a write may have to update all of them (a quick way to check how much use each index actually gets is sketched after these caveats).
Second, if you index (or store) a large field, such as the body field in the messages table above, you end up with a large index, so try to avoid that.
And third, always use your EXPLAIN plans to check that things are working the way you expect.
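As promised above, here’s a quick way to see whether existing indexes are earning their keep before you add more. Recent CockroachDB versions track how often each index is read; the exact crdb_internal table and column names below are our recollection and may differ slightly by version:
-- Read counts per index on the messages table (assumes a CockroachDB
-- version that exposes crdb_internal.index_usage_statistics).
SELECT ti.index_name, us.total_reads, us.last_read
FROM crdb_internal.index_usage_statistics AS us
JOIN crdb_internal.table_indexes AS ti
  ON us.table_id = ti.descriptor_id
 AND us.index_id = ti.index_id
WHERE ti.descriptor_name = 'messages'
ORDER BY us.total_reads;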
If you see something unexpected, ask for help! You can ask questions on the CockroachDB Community Slack; our engineers keep an eye on it and are happy to jump in. And if you’re looking for SQL performance best practices specifically related to CockroachDB, check out this documentation.