System
Writing History: MVCC bulk ingestion and index backfills
Bulk ingestion is used to write large amounts of data with high throughput, for operations such as imports, backup restoration, schema changes, and index backfills. These operations use a separate write path that bypasses transaction processing, instead ingesting data directly into the storage engine with highly amortized write and replication costs. However, because they bypass regular transaction processing, they have also been able to sidestep our normal mechanisms for preserving MVCC history.
Erik Grinaker
January 19, 2023
System
Writing History: MVCC range tombstones
This is part 3 of a 3-part blog series about how we’ve improved the way CockroachDB stores and modifies data in bulk (here is part 1 and here is part 2). We went way down into the deepest layers of our storage system, then up to our SQL schema changes and their transaction timestamps, all without anybody noticing (or at least we hope!).
Erik Grinaker
January 19, 2023