Blog
Engineering
Parallel Commits: An atomic commit protocol for globally distributed transactions
Distributed ACID transactions form the beating heart of CockroachDB. They allow users to manipulate any and all of their data transactionally, no matter where it physically resides. Distributed transactions are so important to CockroachDB’s goal to “Make Data Easy” that we spend a lot of time thinking about how to make them as fast as possible. Specifically, CockroachDB specializes in globally distributed deployments, so we put a lot of effort into optimizing CockroachDB’s transaction protocol for clusters with high inter-node latencies.
Nathan VanBenschoten
November 7, 2019
Engineering
SQLsmith: Randomized SQL testing in CockroachDB
Randomized testing is a way for programmers to automate the discovery of interesting test cases that would be difficult or overly time-consuming to come up with by hand. CockroachDB uses randomized testing in many parts of its code. I previously wrote about generating random, valid SQL. Since then, we’ve added an improved SQL generator to our suite called SQLsmith, inspired by a C compiler tester called Csmith. It improves on the previous tool by generating type- and column-aware SQL that usually passes semantic checking and tests the execution logic of the database. In just a few months, it has found over 40 new bugs that the previous tool was unable to produce. Here I’ll discuss the evolution of our randomized SQL testing, how the new SQLsmith tool works, and some thoughts on the future of targeted randomized testing.
Matt Jibson
June 27, 2019
Engineering
High availability without giving up consistency
If you’re reading this, you’re surely familiar with the arguments for high availability: services are only useful when they’re online. Unavailable services not only lose money, but also damage your credibility in customers’ eyes. This could lead to immeasurable costs to your company in the future. Given that CockroachDB got its name because of its ability to survive failures, we thought we would cover some architectural considerations when building high availability services on top of Cockroach.
Sean Loiselle
August 23, 2018
Engineering
Kubernetes: The state of stateful apps
Over the past year, Kubernetes (also known as K8s) has become a dominant topic of conversation in the infrastructure world. Given its pedigree of literally working at Google-scale, it makes sense that people want to bring that kind of power to their DevOps stories; container orchestration turns many tedious and complex tasks into something as simple as a declarative config file.
Sean Loiselle
May 1, 2018
Engineering
Data migration made easy: Bulk ingest from CSV
We think CockroachDB is a great database for many people, and want them to try us out, not just for new applications but for existing, large applications as well. The first problem that users with an existing database will hit when trying us out for the first time is getting their data into CockroachDB. For the 1.1 release, we built a new feature that performs high-speed, bulk data import. It works by transforming CSV files into our backup/restore format and then quickly ingesting the results.
Matt Jibson
October 26, 2017
Engineering
Distributed SQL (NewSQL) made easy: How CockroachDB automates operations
A modern distributed database should do more than just split data amongst a number of servers; it should correctly manage partitions (or shards). Moreover, it should automatically detect failures, fix itself without any operator intervention, and completely abstract this management from the end user. This post is the first in a series on how CockroachDB handles its data and discusses the mechanisms it uses to rebalance and repair. These systems make managing a CockroachDB cluster significantly easier than managing other databases.
Bram Gruneir
October 5, 2017
Engineering
CockroachDB on DC/OS: Resilient and hassle-free operations for global services
CockroachDB makes data easier to manage by providing a strongly consistent, highly scalable SQL interface that you can trust to be there when you need it. We’ve designed it to be a truly cloud-native, distributed SQL database that’s easy to operate in any environment you throw at it. One such computing environment that has grown in popularity over the previous few years is Mesosphere’s DC/OS, a datacenter operating system built on top of Apache Mesos. DC/OS is an orchestration system for deploying and managing distributed applications across a cluster of machines as if they were a single pool of resources. DC/OS has both an open source and an enterprise version that gives you the ability to elastically scale your infrastructure on-prem or in the cloud. It provides scheduling, resource allocation, service discovery, automatic recovery from failure, load balancing, and more, all with the goal of making it easier to manage your applications.
Alex Robinson
September 28, 2017
Engineering
Real transactions are serializable
Most databases offer a choice of several transaction isolation levels, each a tradeoff between correctness and performance. However, that performance comes at a price, as developers must study their transactional interactions carefully or risk introducing subtle bugs. CockroachDB provides strong (“SERIALIZABLE”) isolation by default to ensure that your application always sees the data it expects. In this post I'll explain what this means and how insufficient isolation impacts real-world applications.
Ben Darnell
September 21, 2017
Engineering
Efficient documentation using SQL grammar diagrams
As CockroachDB approaches beta, user documentation has become increasingly important, and one of the meatiest requirements is documentation of our SQL implementation. For inspiration, I researched how other databases have documented SQL. The most effective example I found was SQLite’s grammar diagrams.
Matt Jibson
March 16, 2016