Blog
Engineering
How we built scalable spatial data and spatial indexing in CockroachDB
Support for spatial data and spatial indexing is one of the most requested features in the history of CockroachDB. The first issue requesting spatial data in CockroachDB was opened in October 2017, and closed on November 12, 2020 with the release of spatial data storage in CockroachDB 20.2.
Sumeer Bhola
December 9, 2020
Engineering
Faster bulk-data loading in CockroachDB
Last year the BulkIO team at Cockroach Labs replaced the implementation of our IMPORT bulk-loading feature with a simpler and faster data ingestion pipeline. In most of our tests, it looked like a major improvement: the release notes for CockroachDB v19.2 touted "4x faster" IMPORT. Many a 🎉 reaction was clicked, and the team moved on to new projects. But over the following months, it became clear we had celebrated too soon: we started to get reports of some IMPORTs that, instead of being faster, were much slower or even getting stuck. Armed with a test that could reproduce such a case, we started to dig.
Bilal Akhtar
October 13, 2020
Engineering
Cloud-native Java-persistence layer using CockroachDB and Hibernate
This blog is written by guest authors Robin de Silva Jayasinghe, Thomas Pötzsch, and Joachim Mathes. Robin de Silva Jayasinghe, is a Sr. software engineer based in Germany working at synyx GmbH & Co. KG. synyx is an agile software provider in Karlsruhe, Germany that works together with different companies to find the best possible IT-solutions for their challenges. Thomas and Joachim are working as software and systems engineers at Contargo, one of the leading container hinterland logistics in Europe.
Robin de Silva Jayasinghe
September 18, 2020
Engineering
Introducing Pebble: A RocksDB-inspired key-value store written in Go
Since its inception, CockroachDB has relied on RocksDB as its key-value storage engine. The choice of RocksDB has served us well. RocksDB is battle tested, highly performant, and comes with a rich feature set. We’re big fans of RocksDB and we frequently sing its praises when asked why we didn’t choose another storage engine. Today we’re introducing Pebble.
Peter Mattis
September 15, 2020
Engineering
Alter column types without taking tables offline
There are many reasons you might want to alter the schema of your database but in many databases, this process typically requires downtime. In CockroachDB, we have supported online schema changes since our first stable release, and in v20.1, we added the ability to alter primary keys while in production without downtime.
Richard Cai
August 20, 2020
Engineering
What's new in CockroachDB’s cost-based query optimizer
In 2018, CockroachDB implemented a cost-based query optimizer from scratch, which has been steadily improved in each release. The query optimizer is the part of the system that understands the semantics of SQL queries and decides how to execute them; execution plans can vary wildly in terms of execution time, so choosing a good plan is important. In this post we go over some of the optimizer-related improvements in CockroachDB v20.1.
Radu Berinde
July 16, 2020
Engineering
Improving unordered distinct efficiency in the vectorized SQL engine
For the past four months, I’ve worked as a backend engineering intern with the incredibly talented engineers of the Cockroach Labs SQL Execution team to unlock the full potential of the vectorized engine ahead of the CockroachDB 20.1 release. During this time, I focused on improving the algorithm behind some of CockroachDB’s SQL operators, to reduce memory usage and speed up query execution. In this blog post, I will go over the improvements that I made to the algorithm behind the unordered distinct operator, as well as new algorithms that address some existing inefficiencies.
Archer Zhang
July 9, 2020
Engineering
Disk spilling in a vectorized execution engine
Late last year, we shipped v1 of our vectorized execution engine. It enables column-based query execution and speeds up complex joins and aggregations, improving analytical capabilities in CockroachDB (which is first and foremost optimized for OLTP workloads). v1 of the engine didn’t support disk spilling, which meant it couldn’t execute certain memory-intensive queries if there was not enough memory available. Starting in CockroachDB v20.1, these queries fall back to disk (also known as “spilling” to disk).
Alfonso Subiotto Marques
June 30, 2020
Engineering
When and why to use SELECT FOR UPDATE in CockroachDB
We didn’t implement SELECT FOR UPDATE to ensure consistency. Unlike Amazon Aurora, CockroachDB already guarantees serializable isolation, the highest isolation level provided by the ANSI SQL standard. Contention happens. And application developers shouldn’t need to risk the integrity of their data when transactions contend. To combat this, CockroachDB must occasionally return errors prompting applications to retry transactions that would risk anomalies, such as write skews. This means that just like with PostgreSQL in serializable isolation, developers need to implement retry-loops for contended transactions. For developers accustomed to relational databases with lower isolation levels, this can be an unfamiliar pattern.
John Kendall
June 22, 2020