The popularity of sports betting, a form of real-money gaming, is exploding. And at least in the US, there’s no bigger moment for sports betting than this weekend.
The NFL’s championship game – you know, the game with the name we’re not allowed to say – is likely to be watched by around 100 million people. And with the growing legality and popularity of sports betting apps in the US, it is very likely this game will see more money flowing through betting apps than ever before.
That’s great news for sports betting companies, but it presents a real challenge to the architects and engineers who work at those companies. Building a betting app at all is challenging, given the demands of users with money on the line and the complexity of operating in different states with different regulatory requirements. And building a betting app that can scale to handle “Big Game” traffic without batting an eye? That’s even harder – especially if you want to do it without dramatically scaling up your costs.
So how can you build a modern sports betting application that’s efficient, scalable, and resilient? In this article, we’ll take a look at how to approach architecting a real-money gaming application, based primarily on the architecture of Hard Rock Digital, which currently operates a sports betting app across seven different US states (with an eighth coming soon) and handles tens of thousands of transactions per second with ease.
Technical requirements for a modern sports betting app
Architecting a sports betting application comes with many of the same requirements that are present with any sort of transactional application. Among them:
Elastic scalability
Betting applications have to be capable of scaling up to handle huge bursts during major sporting events like the Big Game, and then scale back down during the lulls for maximum efficiency. Scale up too slowly and the user experience will be impacted. Scale down too slowly and you’ll likely be paying for more resources than you actually need.
Generally, this means embracing a distributed, event-driven approach. This allows for fast, smooth horizontal scaling up and down, although fine-tuning that scaling to optimize for both user experience and resource efficiency is likely to take some work.
Consistency
In any transactional application involving money and/or where timing matters, consistency is a table-stakes requirement. And for fast-paced use cases such as sports betting, eventual consistency generally isn’t going to cut it. Every element of your system has to be certain of exactly what bets were placed and exactly when they were placed. There can be no disagreement.
This means the ideal database for a sports betting application’s system-of-record is a relational database that can offer ACID guarantees, and ideally serializable isolation. However, choosing a relational database presents its own set of challenges. We’ll dig into this problem later, but the short answer is to use a distributed SQL database such as CockroachDB, which was built for cloud-native, distributed transactional applications.
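To make the consistency requirement concrete, here’s a minimal sketch of an atomic bet placement. The accounts and bets tables and their columns are hypothetical, not any real schema:

```sql
-- A hypothetical bet placement: debit the user's balance and record
-- the bet in a single atomic transaction. CockroachDB runs at
-- SERIALIZABLE isolation by default, so no concurrent transaction
-- can ever observe a state where the money moved but the bet was
-- never recorded (or vice versa).
BEGIN;

UPDATE accounts
   SET balance = balance - 50.00
 WHERE user_id = 42
   AND balance >= 50.00;   -- the app should verify a row was updated

INSERT INTO bets (user_id, event_id, amount, placed_at)
VALUES (42, 1001, 50.00, now());

COMMIT;
```

If either statement fails, the whole transaction rolls back, so the ledger and the bet history can never disagree about what was placed and when.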
Learn more about why Hard Rock Digital chose CockroachDB.
Resilience
No users of any application like downtime. But going down at the wrong time can be particularly devastating for a sports betting application. Having your application go offline during the Big Game, for example, could cause enough user anger and frustration to permanently tarnish your brand. And beyond the terrible PR, the financial cost of an outage at the wrong time can be enormous: even a few minutes of downtime can cost millions of dollars.
This requirement, too, suggests the need for a distributed application architecture, so that there is no single point of failure and one node going down cannot take the application offline. At sufficient scale, it may even make sense to architect the application to survive availability zone (AZ) or even full region failures.
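In CockroachDB, that fault-tolerance target can be declared per database with a survival goal. A minimal sketch, assuming a multi-region database named betting (the name is illustrative):

```sql
-- Default for multi-region databases: tolerate the loss of an
-- availability zone with no loss of data or availability.
ALTER DATABASE betting SURVIVE ZONE FAILURE;

-- With nodes in three or more regions, tolerate the loss of an
-- entire cloud region.
ALTER DATABASE betting SURVIVE REGION FAILURE;
```

The trade-off is that surviving a region failure requires replicas in more regions, which can add write latency, so the stricter goal only makes sense at sufficient scale.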
Latency
Almost any application aims for low latency because it provides a better user experience, but latency is particularly important in the realm of sports betting. Some types of betting, such as micro betting or live betting, play out in the span of just a few seconds, so any sort of delay is likely to leave users furious.
This means architecting the application carefully to minimize bottlenecks, and potentially taking advantage of features such as user-defined functions (UDFs) to minimize the distance data has to travel.
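For instance, recent CockroachDB versions (v22.2+) support SQL UDFs, which let simple computations run next to the data instead of shipping rows to the application first. A hypothetical sketch (the function and the odds column are illustrative):

```sql
-- Hypothetical helper: compute a potential payout inside the
-- database so the hot read path makes one round trip, not two.
CREATE FUNCTION potential_payout(stake DECIMAL, odds DECIMAL)
RETURNS DECIMAL
IMMUTABLE LEAKPROOF
LANGUAGE SQL
AS 'SELECT stake * odds';

-- Use it directly in queries, close to the data:
SELECT bet_id, potential_payout(amount, odds) AS payout
FROM bets
LIMIT 10;
```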
Minimizing latency typically also requires localizing your data, which is a topic worthy of particular focus in the context of sports betting…
Locality
At a certain scale, almost any application will benefit from implementing a multi-region architecture, as it can put the user’s data closer to them, minimizing the impact of latency on their overall experience. But in the context of sports betting in the US, multi-region architecture is even more important because of The Wire Act, which prohibits businesses from using “a wire communication facility for the transmission in interstate or foreign commerce of bets or wagers or information assisting in the placing of bets or wagers on any sporting event or contest.”
Precisely how this 1961 law should be applied to online applications in 2022 is a question of interpretation for a company’s legal department. But for architects, it virtually always means building a multi-region application where specific services and data can be placed within (and restricted to) specific state boundaries.
Hard Rock Digital, for example, operates separate regions for different states so that it can stay in compliance with local rules.
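In CockroachDB terms, this maps to database regions. A minimal sketch; the region names below are hypothetical, and each must correspond to a node locality (e.g. --locality=region=us-texas) backed by data centers physically inside the relevant state:

```sql
-- Declare the regions the database may place data in. Region names
-- come from node localities and are illustrative here.
ALTER DATABASE betting SET PRIMARY REGION "us-east-1";
ALTER DATABASE betting ADD REGION "us-texas";
ALTER DATABASE betting ADD REGION "us-colorado";
```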
Another challenge: geolocation and the database layer
Achieving these requirements at any kind of scale generally means embracing a cloud-based microservices architecture that allows you to spin up local instances of the relevant application services, both to improve latency and to comply with regulatory requirements.
When it comes to the persistence layer, though, this kind of geolocation is not always straightforward.
Consider, for example, the challenge of using Postgres as your system-of-record database for a US-based betting application. Because data cannot be geolocated within a single table in Postgres, you might create separate tables for bets in each state, which would result in the duplication of a lot of data. To keep those tables available, you’d have to set up some kind of replication, likely an active-passive system. You would also have to write application logic to ensure that bets and other relevant data were routed to the correct table based on the user’s location.
This approach already contains a lot of complexity, as well as the potential for data loss if you’re using active-passive replication. And ultimately, particularly in larger states such as California, scaling it up would require sharding, which demands additional customized application logic to route data correctly and further complicates replication. As you add states and shards, the complexity of this approach can quickly make it untenable.
Above, we can see a relatively simple three-state implementation of this approach. Aside from the potential for data loss with active-passive replication, there’s already quite a bit of custom logic required to correctly route transactions from the application, and this code will have to be maintained and updated anytime the application scales up or down, and anytime the company begins operations in a new state or country.
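To make the duplication concrete, here’s a minimal sketch of the per-state-table approach in Postgres. The schema is illustrative:

```sql
-- One near-identical table per state, all of which must be kept
-- structurally in sync as the schema evolves.
CREATE TABLE bets_texas (
  bet_id    UUID PRIMARY KEY,
  user_id   UUID NOT NULL,
  event_id  UUID NOT NULL,
  amount    DECIMAL NOT NULL,
  placed_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE TABLE bets_colorado (LIKE bets_texas INCLUDING ALL);
CREATE TABLE bets_arizona  (LIKE bets_texas INCLUDING ALL);

-- The application must now pick the right table for every read and
-- write based on the user's geolocated state, and that routing code
-- grows with each new state and each new shard.
```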
There are other potential approaches, of course. You might also choose to create separate Postgres databases for each state. However, this would require duplicating all the global data (such as tables with the events available to bet on, odds, etc.) in each of these separate databases, which in turn introduces a plethora of potential consistency issues. This kind of architecture would also make user management and customer service more difficult. Users’ info would have to be stored in their state’s database, which could cause problems if they travel or move and attempt to use the application.
Perhaps needless to say, there is a better approach. Hard Rock Digital chose CockroachDB because it allows them to easily geolocate data even down to the row level without having to worry about sharding or writing any kind of routing logic – all of that complexity is instead handled automatically by the database software. Let’s take a look.
Reference architecture
A lot of the architectural complexity required for the Postgres approach detailed above can be simplified with better tooling. In this case, we’ll look at a simplified version of the real-world architecture used by Hard Rock Digital.
(Note that in reality, Hard Rock operates in different states than those pictured below, with more than two regions and more nodes than are depicted. The diagram has been simplified to make it easier to see what’s going on.)
In the setup above, there are separate application instances and database nodes for two different states, Texas and Colorado. Data from both states is also backed up to database nodes in AWS’s us-east-1 region.
The advantages of this setup over the previous one are numerous. Among them:
CockroachDB allows you to keep a single copy of global tables (market data like events, odds, etc.), rather than having to duplicate this data repeatedly across separate databases for each state.
CockroachDB is a multi-active system, allowing for very high availability with no risk of data loss or consistency issues.
CockroachDB allows for regional-by-row geolocation, and geolocation rules for data can be defined using simple DDL statements. This means that rather than needing complex application logic to route data by region, you can have a single bets table with a location column, and the database will locate each row’s data according to the value in that column, automatically.
CockroachDB was built from the ground up for distributed cloud applications, and it handles all of the routing associated with sharding internally, automatically. This means no application logic to add or maintain when you need to scale up or down. You can simply add or remove database nodes as needed, and your application can treat the database as though it were a single instance of Postgres. The database will distribute your data efficiently between the nodes automatically, in accordance with the geolocation rules you’ve given it. (Both of these points are sketched in the DDL below.)
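As a concrete sketch of those last two points, here is roughly what the DDL looks like. Table and column names are illustrative, not Hard Rock’s actual schema:

```sql
-- One logical copy of shared market data (events, odds, etc.),
-- readable with low latency from every region.
ALTER TABLE events SET LOCALITY GLOBAL;

-- Each bet is stored in the region named in its location column;
-- the database handles all routing and rebalancing itself.
CREATE TABLE bets (
  bet_id    UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id   UUID NOT NULL,
  amount    DECIMAL NOT NULL,
  location  crdb_internal_region NOT NULL
) LOCALITY REGIONAL BY ROW AS location;
```

Note that crdb_internal_region is the type CockroachDB provides for region values once a database has regions defined; inserting a row with location = 'us-texas' is all it takes to pin that row to the Texas region.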
Below, we can see an example of how this approach looks in terms of application architecture. Note that no routing logic is needed: the application treats CockroachDB as a single logical database, and all data is geolocated and replicated according to rules you’ve outlined using simple DDL statements.
Data stored in CockroachDB can be easily synchronized to other services using the built-in CDC (change data capture) feature, and even transformed if needed. Hard Rock, for example, uses CDC to push data into Confluent Kafka, and from there into Snowflake.
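In CockroachDB, that pipeline starts with a changefeed. A minimal sketch, where the broker address and table are placeholders:

```sql
-- Changefeeds require rangefeeds to be enabled once per cluster.
SET CLUSTER SETTING kv.rangefeed.enabled = true;

-- Emit every change to the bets table to a Kafka topic; downstream
-- consumers (Confluent, Snowflake, etc.) ingest from there. Kafka
-- sinks may require an enterprise license.
CREATE CHANGEFEED FOR TABLE bets
INTO 'kafka://kafka-broker:9092'
WITH updated, resolved;
```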
Learn more about how to build highly resilient and performant real-money gaming applications with our free guide to Data Architecture for Gambling Applications.