The guy who wrote the book on CockroachDB

Last edited on May 10, 2022

    Everyone here at Cockroach Labs is thrilled about the publication of our very own O’Reilly tome, CockroachDB: The Definitive Guide. Seriously delighted, as in just-found-out-the-rest-of-today’s-meetings-are-all-canceled level of happiness. We are beyond proud to have our own hefty tech book with an iconic animal on the cover. (Or, even better, an iconic insect, since the fine folks at O’Reilly Media agreed to let us have a cockroach on ours.) Most of all, we are delighted to have the complete workings of CockroachDB collected in a single, thorough, hands-on resource that helps make distributed SQL accessible to even more users.

    This is thanks in large part to Guy Harrison, who is quite literally The Guy Who Wrote The Book On CockroachDB. Guy co-authored the Definitive Guide with Cockroach Labs co-founder and chief architect, Ben Darnell, and our head of education, Jesse Seldess. “Guy’s deep existing database knowledge, paired with his diligence in learning Cockroach just as deeply, contributed immeasurably to the scope and thoroughness of the book,” Ben commented.

    Just before the publication of CockroachDB: The Definitive Guide, we sat down with Guy, a self-described “database nerd,” to talk about his multi-decade career working with databases and what it was like for him to learn and build with CockroachDB.

    Guy, you just finished writing “CockroachDB: The Definitive Guide” with Ben Darnell and Jesse Seldess. Can you tell us about your experience in the database world?

    Thanks, Michelle!

    So, I’m definitely at the stage of my career where people use words like “veteran” or “old-timer”! I started working with databases in the mid-80s, using pre-relational technologies like Adabas and dBase. In the late 80s, I started working with Oracle and spent about 90% of my time with that RDBMS for the next 10 years. Developers dread Oracle nowadays – at least if you believe the Stack Overflow surveys – but back in the day it was at the cutting edge of database technology.

    In 1997 I joined database tool vendor Quest Software, where I led a lot of the DB tools engineering. There I kept my hands dirty across a wide range of databases, including MySQL, MongoDB, Cassandra, and Hadoop. For a long time I was sure there were few databases that I hadn’t played around with, at least a little bit, but alas, now there are so many that I could never keep up with them all even if I tried.

    What are some of the other writing projects you have done, and what drew you to the CockroachDB Definitive Guide project with O’Reilly?

    I wrote my first book on Oracle performance back in the 90s, and since then, I’ve written about MySQL, MongoDB, and databases more broadly. I’m a DB-technophile: I enjoy working with all the different platforms, whether they are simplistic or sophisticated. But much as I liked the simplicity and – to some extent – ease of use of some of the NoSQL databases, I missed the power of SQL and the transactional guarantees of relational systems.

    So as CockroachDB gained traction, I became excited about the technology and wrote a few articles on CockroachDB for Database Trends and Applications. When Ben and Jesse were looking for a database author to work with them on the book, it was a match made in heaven (I hope Ben and Jesse agree). It was such a great opportunity to work with real experts, and I really enjoyed the process.

    You had to learn CockroachDB in order to write about it deeply. What was that process like? What surprised you most about working with CockroachDB?

    I sure did learn a lot while writing the book. And the biggest surprise was that everything just…worked.

    CockroachDB is still relatively new and I was expecting to find some rough edges and “under construction” features. What I found is that CockroachDB is virtually feature complete. This might sound like I’m buttering you up, but what truly surprised me most was how much you all have achieved in a relatively short time.

    You have great depth of experience with databases. What are your thoughts about this current evolution of the relational database/distributed SQL?

    I’ve seen four big waves of database evolution in my professional life. First came the pre-relational era, with databases such as IMS and IDMS. It was followed by the second wave, the heyday of monolithic relational databases like Oracle and SQL Server. The relational database ruled for more than two decades, which is a huge amount of time in technology. Think about how incredible it is that the first RDBMS was released before the first Personal Computer and is still going strong today!

    In the third wave, we saw a lot of distributed databases like Cassandra and Dynamo that jettisoned transactional consistency to achieve availability and scalability. A lot of significant innovations were incorporated into those systems, but they scrapped some of the best features from the previous decades of database engineering – in particular transactional consistency, the relational data model and the SQL language.

    In what is looking increasingly like the true fourth wave, we see the best ideas from the relational era (SQL, transactional integrity and strong data models) integrated with the best ideas from the distributed database world. CockroachDB is at the forefront of that consolidation – it’s exciting for database nerds like me since it means we can have the best of both worlds in a single platform.

    Any thoughts about the future of data or the database? In your opinion, could there ever be “one database to rule them all” — a single db that can handle every type of workload, and handle it well — or does it just make more sense to use different dbs based on data or workload type?

    In general, I think the “one size fits all” database era is over and has been for a while. There are a lot of different application types out there and room for a lot of specialized database platforms. But it does seem like we see consolidation into a couple of big segments. The two most important are what we sometimes call “operational” databases like CockroachDB and “analytical” databases like Snowflake.

    Graph databases represent a smaller but probably distinct segment. Many databases are trying to combine graph capabilities with non-graph capabilities, but I think there’ll be a place for dedicated graph databases in the foreseeable future.

    There’s theoretically no reason why CockroachDB cannot be used as a data warehouse, but the team has – wisely, I think – concentrated on making it the best possible platform for more transactional workloads.

    What projects are you working on now?

    Well, I’m taking a break from writing for a while. I wrote two books back to back during the pandemic, so I don’t have much enthusiasm for any more books just yet.

    I have started working on a new project in my spare time that uses CockroachDB as the database platform to index data held in IPFS – a distributed file store that’s often used in conjunction with Blockchain. It’s going great – I’ll let you know when I’m ready to talk more about it!

    We can’t wait to see what you build!

    Ready to get your own copy of CockroachDB: The Definitive Guide? Download the full book for free at [https://www.cockroachlabs.com/guides/oreilly-cockroachdb-the-definitive-guide](https://www.cockroachlabs.com/guides/oreilly-cockroachdb-the-definitive-guide)
