Publication date: November 29, 2021
Description
In CockroachDB 21.2.0, the optimizer could plan queries that use semi-joins against multi-region REGIONAL BY ROW
tables incorrectly. Uniqueness constraint checks against REGIONAL BY ROW
tables fall into this category, so inserts into UNIQUE
columns, including PRIMARY KEY
columns, in multi-region tables could fail to uphold the UNIQUE
constraint.
This issue only occurs against REGIONAL BY ROW
tables in CockroachDB 21.2.0.
Statement
This is resolved in CockroachDB by PR 73040, which disables the incorrect planning of locality-optimized semi-joins with ON-conditions.
The fix has been applied to 21.2.1.
This issue is tracked by public issue 73024.
Mitigation
Users of CockroachDB 21.2.0 are encouraged to audit their REGIONAL BY ROW
tables with UNIQUE
constraints for uniqueness and upgrade to 21.2.1. To prevent further unique violations from occurring while still on 21.2.0, you can disable locality optimized search by setting the cluster setting sql.defaults.locality_optimized_partitioned_index_scan.enabled
or session setting locality_optimized_partitioned_index_scan
to false
. This will likely hurt performance, however, so the best mitigation is to upgrade to 21.2.1 as soon as possible. These settings should be reenabled upon upgrading to 21.2.1.
To determine whether your REGIONAL BY ROW
tables have any uniqueness constraint violations, you can either attempt to create a new UNIQUE
index on each set of unique columns, or run a query for each UNIQUE constraint to determine whether the table contains any duplicate keys. For example:
SELECT * FROM t GROUP BY unique_col1 [, unique_col2 ...] HAVING count(*) > 1;
If this query returns any results, that indicates the table contains unique constraint violations. To fix the unique violations, you will need to manually update all but one of the duplicate keys with a different value.
Impact
REGIONAL BY ROW
tables could contain non-unique data in unique columns if insert into from multiple regions in 21.2.0. Additionally, REGIONAL BY ROW
tables might have returned incorrect results for queries that use a semi-join, such as:
SELECT * FROM t1 WHERE EXISTS (SELECT * FROM t2 WHERE t1.x = t2.x AND ...) AND ...;
or
SELECT * FROM t1 WHERE x IN (SELECT x FROM t2 WHERE ...) AND ...;
For queries such as these, CockroachDB might not have returned all results if the relevant data was spread across different regions.
Please reach out to the support team if you need more information or assistance.