[CASE STUDY]
5.1 BILLION ROWS
9 CLOUD REGIONS
~5.5 TB OF STORAGE
Storj started building a prototype of their product on PostgreSQL since it’s a battle-tested relational database that many developers are familiar with.
However, they quickly realized that this new object storage system would need to scale to billions – if not trillions – of objects, and they weren’t sure that PostgreSQL could accommodate this type of growth while maintaining availability.
High availability was also a priority for Storj, in order to meet their customer SLAs around downtime. Additionally, they wanted to ensure that transactions would always be consistent regardless of where they scaled.
In summary, their database requirements included:
Horizontal scale for trillions of objects and exabytes of capacity
Performant and able to retrieve global metadata
Strong transactional consistency guarantees
A reliable foundation with built-in high-availability
Storj also wanted to use a managed solution and a database that would make the transition off of PostgreSQL easy for developers. They did their due diligence, evaluating several databases, and CockroachDB was the only solution that met all their requirements.
"Other databases claimed to deliver high availability, consistency, and great performance. But if you look under the hood and if you really test them, they simply don’t have it. We looked deeply at what CockroachDB was doing and at the documentation on the consistency model. We knew that we could use it and trust it because there were experts behind it building on a scientific foundation."
-Jacob Willoughby, CTO, Storj
In 2019, Storj migrated from PostgreSQL to CockroachDB and began using it as their primary platform. Because CockroachDB is PostgreSQL wire-compatible, the transition was easy, allowing the team to focus on building the product instead of learning a new technology.
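To illustrate what wire compatibility means in practice, here is a minimal sketch in Python. It is not Storj's code: the user, host, and database names are hypothetical placeholders, and the only CockroachDB-specific detail is the default SQL port, 26257. The point is that the resulting connection string is an ordinary `postgresql://` URL of the kind an existing PostgreSQL driver accepts.

```python
from urllib.parse import quote, urlencode

# CockroachDB listens for SQL clients on 26257 by default (PostgreSQL uses 5432).
COCKROACH_DEFAULT_SQL_PORT = 26257


def build_connection_url(user, host, database,
                         port=COCKROACH_DEFAULT_SQL_PORT,
                         sslmode="verify-full"):
    """Build a standard PostgreSQL-style connection URL.

    All argument values used below are hypothetical examples.
    """
    query = urlencode({"sslmode": sslmode})
    return (
        f"postgresql://{quote(user, safe='')}@{host}:{port}"
        f"/{quote(database, safe='')}?{query}"
    )


# Hypothetical user, host, and database names, for illustration only.
url = build_connection_url("app_user", "db.example.internal", "metadata")
print(url)
```

A URL like this would typically be handed unchanged to whichever PostgreSQL client library an application already uses, which is why a migration of this kind can leave most of the data-access code in place.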
Storj uses CockroachDB for several use cases, including billing, account management, and metadata storage. Storj’s object storage metadata is a key component of their system and requires tremendous scale.
The chart below demonstrates how Storj scaled one of their CockroachDB clusters to 5.1 billion rows, using ~5.5 TB of storage. This workload averages around 5K read operations per second and 1.2K write operations per second, with higher rates during peak periods.
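As a rough back-of-the-envelope reading of these figures, the short Python sketch below derives the average storage per row, the read-to-write ratio, and the implied number of operations per day. It assumes decimal terabytes and treats the quoted averages as constant over a day. The case study does not say whether the storage figure includes replication or indexes, so the per-row number is only an estimate.

```python
# Figures quoted in the case study; everything derived from them is an estimate.
ROWS = 5.1e9            # 5.1 billion rows
STORAGE_BYTES = 5.5e12  # ~5.5 TB, assuming decimal terabytes (10**12 bytes)
READS_PER_SEC = 5_000   # average read operations per second
WRITES_PER_SEC = 1_200  # average write operations per second
SECONDS_PER_DAY = 86_400


def bytes_per_row(storage_bytes, rows):
    """Average storage footprint per row."""
    return storage_bytes / rows


def read_write_ratio(reads, writes):
    """How many reads occur for each write, on average."""
    return reads / writes


def daily_operations(reads, writes):
    """Total operations per day if the averages held all day."""
    return (reads + writes) * SECONDS_PER_DAY


avg_row = bytes_per_row(STORAGE_BYTES, ROWS)
ratio = read_write_ratio(READS_PER_SEC, WRITES_PER_SEC)
per_day = daily_operations(READS_PER_SEC, WRITES_PER_SEC)

print(f"average bytes per row: {avg_row:.0f}")
print(f"read/write ratio:      {ratio:.2f}")
print(f"operations per day:    {per_day:,}")
```

Under these assumptions, each row accounts for roughly 1 KB of storage, reads outnumber writes by about 4 to 1, and the cluster handles on the order of half a billion operations per day.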
For deployment, they use a Kubernetes engine on Google Cloud and run CockroachDB in the same regions as their API servers. In total, Storj is running in 9 regions: they have multi-region CockroachDB clusters on 3 continents (North America, Europe, and Asia), with 3 regions on each.
This deployment is solely based on customer demand, as seen in the diagram below.
Since Storj uses a managed version of CockroachDB, spinning up a new region is easy. It’s a “matter of pushing a few buttons in the web UI, and we have a new cluster up and running in minutes.”
Looking forward, Storj plans to build a globally performant CockroachDB cluster that spans multiple continents, knowing they have the option to run across multiple clouds.
There are a few best practices Jacob (CTO) recommends when considering CockroachDB. He says you shouldn’t try to prematurely optimize for scaling because it will create a lot of unnecessary work. This differs from legacy relational systems, where capacity planning is an important factor.
For example, a few years ago, they did not have 5.1 billion rows. But when they were ready to scale the load, they were able to “just make some easy adjustments to the management console to add additional capacity to the database.”
Finally, Jacob says that Cockroach Labs’ premium support is essential for Storj. He says that whenever they have contacted support, the Cockroach Labs team quickly dove in to resolve questions or potential issues.
Having this level of engagement is “very comforting to us and essential to build the availability we need.”
Storj has plans to expand to almost every continent with available clusters. Today, they operate more than 23K active nodes and are helping customers cut storage costs and carbon footprints by over 80%.
If you are interested in checking out Storj’s solutions, visit their website.
"CockroachDB is doing some amazing gymnastics under the bonnet. You might think it's just a simple query, but it's actually going off to different nodes where the data is physically stored to retrieve it. The performance is really good. And then you have this tremendous scaling capability. Using CockroachDB almost feels a bit magic."
-Jacob Willoughby, CTO, Storj