Publication date: July 18, 2022
Description
The problem occurs on clusters which were running v21.1 or an earlier version, and are now running versions v22.1.0 to v22.1.2. Adding a column to a table which references a sequence, or creating a table with columns referencing sequences, adds an incomplete back-reference to the sequence metadata. Upgrading to versions v22.1.0 or v22.1.2 then either fails outright, or succeeds but causes the sequences and the tables to become inoperable. The error message in those cases matches the regular expression:
depended-on-by relation .* does not have a column with ID 0
This problem can conclusively be diagnosed by running the following query:
WITH
tables
AS (
SELECT
*
FROM
(
SELECT
id,
crdb_internal.pb_to_json(
'cockroach.sql.sqlbase.Descriptor',
descriptor
)->'table'
AS tab
FROM
system.descriptor
)
WHERE
tab IS NOT NULL
),
columns_using_sequence_ids
AS (
SELECT
table_id,
(c->'id')::INT8 AS column_id,
json_array_elements(c->'usesSequenceIds')::INT8 AS seq_id
FROM
(
SELECT
id AS table_id, c
FROM
tables,
ROWS FROM (json_array_elements(tab->'columns')) AS t (c)
)
WHERE
(c->'usesSequenceIds') IS NOT NULL
AND json_array_length(c->'usesSequenceIds') = 1
),
sequences_with_missing_depended_on_by
AS (
SELECT
seq_id, (dep->>'id')::INT8 AS table_id, ord, dep
FROM
(
SELECT
id AS seq_id, dep, ord - 1 AS ord
FROM
tables,
ROWS FROM (
json_array_elements(tab->'dependedOnBy')
) WITH ORDINALITY
AS t (dep, ord)
WHERE
(tab->'sequenceOpts') IS NOT NULL
AND EXISTS(
SELECT
*
FROM
ROWS FROM (
json_array_elements(tab->'dependedOnBy')
)
AS t (dep)
WHERE
dep->'columnIds' @> '[0]'::JSONB
)
)
WHERE
(dep->>'byId')::BOOL
)
SELECT
count(*)
FROM
sequences_with_missing_depended_on_by;
If you get back a non-zero result, your cluster has the problem described in this technical advisory. If you get back zero, your cluster does not have this problem.
Statement
This is resolved in CockroachDB by PR #82833 which relaxes the validation checks which detect descriptor corruption.
The fix has been applied to maintenance release v22.1.3 of CockroachDB.
This public issue is tracked by #82576.
Mitigation
Users of CockroachDB v22.1.0 to v22.1.2 are encouraged to upgrade to v22.1.3 or later.
Alternatively, users can run the following repair query:
WITH tables AS (
SELECT *
FROM (
SELECT id,
crdb_internal.pb_to_json(
'cockroach.sql.sqlbase.Descriptor',
descriptor
)->'table' AS tab
FROM system.descriptor
)
WHERE tab IS NOT NULL
),
columns_using_sequence_ids AS (
SELECT table_id,
(c->'id')::INT8 AS column_id,
json_array_elements(c->'usesSequenceIds')::INT8 AS seq_id
FROM (
SELECT id AS table_id, c
FROM tables,
ROWS FROM (json_array_elements(tab->'columns')) AS t
(c)
)
WHERE (c->'usesSequenceIds') IS NOT NULL
AND json_array_length(c->'usesSequenceIds') = 1
),
sequences_with_missing_depended_on_by AS (
SELECT seq_id, (dep->>'id')::INT8 AS table_id, ord, dep
FROM (
SELECT id AS seq_id, dep, ord - 1 AS ord
FROM tables,
ROWS FROM (
json_array_elements(tab->'dependedOnBy')
) WITH ORDINALITY AS t (dep, ord)
WHERE (tab->'sequenceOpts') IS NOT NULL
AND EXISTS(
SELECT *
FROM ROWS FROM (
json_array_elements(
tab->'dependedOnBy'
)
) AS t (dep)
WHERE dep->'columnIds' @> '[0]'::JSONB
)
)
WHERE (dep->>'byId')::BOOL
),
depended_on_by_entries AS (
SELECT s.seq_id, ord, json_agg(t.column_id) AS column_ids
FROM columns_using_sequence_ids AS t
JOIN sequences_with_missing_depended_on_by AS s ON t.table_id
= s.table_id
AND t.seq_id
= s.seq_id
GROUP BY s.seq_id, s.table_id, ord
),
updated_entries AS (
SELECT seq_id, ord, json_set(d, ARRAY['columnIds'], column_ids) AS d
FROM depended_on_by_entries
JOIN tables ON seq_id = tables.id
JOIN ROWS FROM (
json_array_elements(tab->'dependedOnBy')
) WITH ORDINALITY AS jae (d, idx) ON ord = idx - 1
),
depended_on_by_arrs AS (
SELECT seq_id, json_agg(d ORDER BY ord ASC) AS depended_on_by
FROM updated_entries
GROUP BY seq_id
)
SELECT crdb_internal.unsafe_upsert_descriptor(
seq_id,
crdb_internal.json_to_pb(
'cockroach.sql.sqlbase.Descriptor',
json_build_object(
'table',
json_remove_path(
json_set(
json_set(tab, ARRAY['dependedOnBy'], depended_on_by),
ARRAY['version'],
((tab->>'version')::INT8 + 1)::STRING::JSONB
),
ARRAY['modificationTime']
)
)
),
true
)
FROM depended_on_by_arrs JOIN tables ON id = seq_id;
If the query succeeds, that means the repair has succeeded. You can re-run the diagnostic query to confirm this. We encourage users to back up their clusters before undertaking such a repair.
Impact
Clusters which have been upgraded from v21.1 to v22.1, and which are currently running versions v22.1.0 to v22.1.2, may be affected.
Questions about any technical alert can be directed to our support team.