The Crystal Shard: How Salesforce Scales

Today on the Architecture Files, we’re going to start delving into some of the things we do to scale Salesforce’s technology stack. Specifically, we’re going to talk about sharding, and how we do it.

Why do you care about that? I don’t know … maybe you don’t! It’s a nice day outside, and I’m not here to tell you how to live your life. BUT, maybe you’re a tiny bit curious about how we keep up with growing so fast, or maybe you’re an engineer at a company that’s facing similar problems. Or maybe you’re my mom, and you read everything I write. (Hi, Mom!)

I should preface this by saying that Salesforce is many things. It’s a big company now, and over the years it has expanded to lots of different lines of business, technology platforms, etc. When I delve into technical details in this series, I’ll try to be specific about which part of Salesforce I’m talking about. In this post, I’m talking exclusively about what we call “Core”: the part of the product that’s been around since the beginning and powers the Sales, Service, Platform, and Communities products. (No worries if you don’t know what those are, that’s not important for this post.)

Growth is something we think about a LOT in the software world, and it’s hard to get right. A common experience for young companies is that they’ll write some nice code that does something cool, only to discover that when they start to push it too hard (too many users, too much data, etc.), it breaks. Horribly. Many (most?) companies experience this pain in their early days.

To stick around for the long haul, your customers have to love you (in a Platonic way). And that requires that they trust your service to be stable and responsive, even under heavy load. If it is, they’ll tell their friends, who will tell their friends, and then you have more customers … which means you have more load, and have to scale even more, etc. It’s a virtuous cycle, powered by customer success.

But, scaling well is tricky! At least, it can be, depending on what you do. If your company’s product is just a bunch of pictures of dogs in hats, scale might not be a big deal. But if your service is, say, hosting complex database transactions for mission-critical business apps, it can be a little more daunting. You want your transaction rate and data volume to be able to increase rapidly, without giving up performance or availability. This is a challenging set of constraints to satisfy all together. It’s like wanting to be smart, funny, and good looking: only Tina Fey can really do it.

You might think that growth is just a matter of buying bigger, faster computers (also known as “vertical” scale). That might work for DogsInSillyHats4U.com … but that doesn’t fly at real scale, for a couple of reasons:

There’s not a big enough computer. Even if you get the world’s biggest computer (which is, clearly, the self-aware computer from the end of Superman 3), sooner or later you will outgrow it.
The bigger they come, the harder they fall. When that one big computer does crash, it’ll probably take a long time to recover. And if it fails permanently, you’re in even bigger trouble.

So instead, just about every modern architecture distributes workloads across more computers (rather than just getting bigger ones). This is what we call “horizontal” scale: scale out, not up.

Horizontal scale immediately opens up all kinds of gnarly problems, like how to keep a large pool of computers in a correct state, despite the laws of physics and the probability of individual system failure. Distributed systems research has taught us much over the years, but the main takeaway is that it is hard. Like, EGOT hard.

But, there’s actually one method of horizontal scaling that, if you can do it, skirts most of the really hard issues: sharding. We do that, a lot.

What’s a shard? (Other than the all-important missing piece of the Dark Crystal, obvs.) In software terms, a shard is simply an independent subset of something: servers, customers, data, whatever. Using it as a verb, to “shard” your infrastructure means to divide it up into chunks that can operate independently of each other, without constant coordination.

Sharding was (and still is) a pretty common thing to do with large databases. Instead of one honkin’ big database, you break it up into N databases, each of which has 1/N honkin’ness. Depending on how transparent this is to the program code, it can be a good solution, or it can be a big hassle. You generally do it by identifying discrete subsets of data that don’t need to interact with each other continuously. That could be different customers, different products, different lines of business, etc. The key thing is just that these subsets are independent of each other, and each can behave as if it were the only one in existence.
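To make the "N databases" idea concrete, here is a minimal sketch of one generic way to route keys to shards. It is an illustration only, not a description of how Salesforce does it. The names shard_for, put, and get are made up for this example, and each "database" is just an in-memory dict. It uses hashlib rather than Python's built-in hash(), because the built-in is randomized per process for strings and would send the same key to different shards after a restart.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a hash that is stable across runs."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Four dicts stand in for four independent databases (an assumption).
shards = [dict() for _ in range(4)]

def put(key: str, value) -> None:
    # A write touches exactly one shard; the others never hear about it.
    shards[shard_for(key, len(shards))][key] = value

def get(key: str):
    # A read goes to the same shard the write went to.
    return shards[shard_for(key, len(shards))].get(key)

put("customer-42", {"plan": "pro"})
print(get("customer-42"))
```

The catch with this particular scheme is that changing the number of shards moves most keys, which is one reason real systems often pick their shard key and placement more deliberately.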

For some types of systems, sharding isn’t a great option because there is no clean way to slice up the data into independent chunks. Picture a social network: there’s no easy way to divide up the users, because in real life, anyone can be friends with anyone else (well, except cat people and dog people, which is intractable as far as I can tell). There’s no clean division, like “Put all the people with last names starting with A–M on this shard, with N–T on that shard, etc”. Conceptually, it’s one big social graph, so naive sharding isn’t typically a good solution.
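Here is a small sketch of why the last-name split goes wrong. The friendships are invented sample data, and the two-way split (A–M and N–Z) is a simplification of the example above. The point is just to count how many relationships end up straddling two shards.

```python
def shard_by_last_name(last_name: str) -> str:
    """Naive sharding rule: split users alphabetically by last name."""
    return "A-M" if last_name[0].upper() <= "M" else "N-Z"

# Invented sample data: each pair is a friendship between two users.
friendships = [
    ("Adams", "Nguyen"),
    ("Baker", "Clark"),
    ("Lopez", "Young"),
    ("Ortiz", "Smith"),
]

def cross_shard_edges(edges):
    """Return the friendships whose two users live on different shards."""
    return [
        (a, b) for a, b in edges
        if shard_by_last_name(a) != shard_by_last_name(b)
    ]

crossing = cross_shard_edges(friendships)
print(f"{len(crossing)} of {len(friendships)} friendships cross shards")
```

Every one of those crossing edges means a query like "show me my friends' updates" has to talk to more than one shard, which is exactly the constant coordination that sharding is supposed to avoid.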

Salesforce, on the other hand, is a multitenant system. That means we actually have a really simple way to do it: we shard by customer organization. The organization (or “org” for the cool kids) is a totally self-contained unit. No data ever traverses the boundary between orgs, except by coming out the front door of one org and in the front door of another, using proper authentication. So, the placement of an org is very flexible, and independent of all other orgs. We can divide them up and bin-pack them into the physical instances in the most efficient way for us. It’s great.
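Because orgs are independent, placing them looks a lot like a classic bin-packing problem. The sketch below uses the textbook first-fit-decreasing heuristic. That is my choice for illustration, not a claim about the placement logic we actually run. The org names, load numbers, and capacity are all made up.

```python
from typing import Dict, List

def pack_orgs(org_loads: Dict[str, int], capacity: int) -> List[Dict[str, int]]:
    """Place orgs onto instances with first-fit decreasing.

    Each instance is modeled as a dict of org name -> load. Orgs are placed
    largest first, each into the first instance that still has room.
    """
    instances: List[Dict[str, int]] = []
    by_load = sorted(org_loads.items(), key=lambda kv: kv[1], reverse=True)
    for org, load in by_load:
        if load > capacity:
            raise ValueError(f"{org} does not fit on any single instance")
        for inst in instances:
            if sum(inst.values()) + load <= capacity:
                inst[org] = load
                break
        else:
            # No existing instance has room, so stand up a new one.
            instances.append({org: load})
    return instances

# Hypothetical orgs and loads, in arbitrary units.
loads = {"acme": 60, "globex": 50, "initech": 40, "umbrella": 30, "hooli": 20}
placement = pack_orgs(loads, capacity=100)
for i, inst in enumerate(placement):
    print(i, inst, sum(inst.values()))
```

Note what the code never has to ask: whether two orgs need to be on the same instance. That question only comes up when the things you are placing depend on each other.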

However, for the Salesforce Core products, we take an even more radical approach than just sharding the database: we actually shard nearly everything. We take our whole set of infrastructure components (application servers, databases, networks, load balancers, proxies, caches, etc) and we make N copies of that whole stack. These self-sufficient copies, which we call “Instances”, all run the same software, but they host different customer orgs, and therefore different data and workloads. The instances don’t really have much to say to each other; when you interact with Salesforce, by and large, you interact with just one instance. This makes them horizontally scalable: “crunch all you want, we’ll make more”.
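As a toy model of that idea, the sketch below treats each instance as a self-contained object with its own storage, and routes every request for an org to exactly one of them. The Instance and Router classes are hypothetical names for this example, and the real stack obviously has a lot more in it than a dict.

```python
class Instance:
    """One full copy of the stack, reduced here to a name and its own storage."""

    def __init__(self, name: str):
        self.name = name
        self.db = {}  # org_id -> list of records; shared with no other instance

    def handle(self, org_id: str, record: str) -> str:
        self.db.setdefault(org_id, []).append(record)
        return self.name


class Router:
    """Sends each request to the single instance that hosts the org."""

    def __init__(self, instances):
        self.instances = {inst.name: inst for inst in instances}
        self.org_to_instance = {}

    def assign(self, org_id: str, instance_name: str) -> None:
        self.org_to_instance[org_id] = instance_name

    def handle(self, org_id: str, record: str) -> str:
        home = self.instances[self.org_to_instance[org_id]]
        return home.handle(org_id, record)


na1, eu2 = Instance("na1"), Instance("eu2")
router = Router([na1, eu2])
router.assign("org-acme", "na1")
router.assign("org-globex", "eu2")

served_by = router.handle("org-acme", "new lead")
print(served_by, na1.db, eu2.db)
```

The request for org-acme never touches eu2 at all. Adding a third instance means adding another Instance object and some assignments, with no changes to the ones already running.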

How many instances do we have? The number is always changing, but as of this writing it’s upwards of 100 (as you can see on our trust.salesforce.com site, where we show live info about the availability of all of them). In addition, we run standby instances for all the main instances, in case something unexpected forces us to switch over rapidly; more on that in a future post.

Sharding into instances is something we’ve been doing for many years, and it’s served us well. The main benefit is that it allows us to maintain the coherence of a single master location for any given org. (If you’re familiar with relational databases, you know that it’s particularly tricky to run “multi-master” systems, where mutations to data can be handled in more than one independent physical location. Specifically, that’s tricky because even electromagnetic waves take a little time to get from one side of the globe to the other, and that opens up a window for inconsistency during simultaneous updates.) We sidestep that whole set of problems by maintaining a single master location for any org at any time, and doing so has kept the basic complexity of our database architecture comparatively low.

There are downsides, too, naturally; nothing in software comes without trade-offs (except hashing … that’s just pure magic). One misstep we made early on was actually exposing the instance name to customers in the URL (na1.salesforce.com, eu2.salesforce.com, etc), rather than making that an internal implementation detail. That’s prevented us from being able to move orgs as fluidly as we’d like, because it has an impact on them: they have to update their bookmarks, IP whitelists, etc. (If you’re a customer and you’re reading this, for the love of Gelflings, please use MyDomain!) There are also still a few processes (like login and package distribution) that are inherently not shard-able, and those do build up complexity for us; we end up having to sync some data globally across all instances, and that’s tricky to get right. And, in hindsight, we sometimes wish we’d used a multi-tiered sharding strategy, rather than a single dimension that shards all the layers of the stack.
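The URL problem is really a missing layer of indirection, and a few lines show the difference. In this sketch the hostname acme.example.com and the functions resolve and move_org are hypothetical. It is not how MyDomain is implemented, just the general shape of giving customers a stable name that maps to an instance behind the scenes.

```python
# Stable customer-facing name -> current instance. Only this table knows
# where an org lives; the customer only ever sees the stable name.
alias_to_instance = {"acme.example.com": "na1"}

def resolve(hostname: str) -> str:
    """Look up which instance currently serves a stable hostname."""
    return alias_to_instance[hostname]

def move_org(hostname: str, new_instance: str) -> None:
    """Relocate an org by updating the mapping, not the customer's URL."""
    alias_to_instance[hostname] = new_instance

before = resolve("acme.example.com")
move_org("acme.example.com", "eu2")
after = resolve("acme.example.com")
print(before, "->", after)
```

When the instance name is baked into the URL, there is no table to update: the customer's bookmark is the mapping, and moving the org means asking every user to change it.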

Those are all tractable problems, but if we could start over now, those are probably things we’d change.

Sharding is alive and well in other parts of the Salesforce universe, too. As an example, Pardot (our app for Marketing Automation wizardry) also uses a sharded database architecture. Unlike Salesforce Core, Pardot shards only the database, not the rest of the stack. This resulted in a bit more complexity within the application (knowing which DB to talk to for each tenant), but a net win on overall efficiency. (We’ll say a lot more about Pardot’s sharding strategy in a future post.)

Sharding isn’t the only thing we do, of course. In fact, as we evolve our infrastructure to use more sophisticated distributed systems, explicit sharding is becoming less important over time. We’re seeing systems that are inherently horizontally scalable (like Apache HBase) depart from this model because they handle the hard problems (replication, consistency, fault tolerance, etc.) internally, without requiring any explicit sharding from the application developer. (They also pose new problems of their own, of course; more on that soon.)

But, on the whole, sharding has been a good friend to our architecture, and it’s a big part of how we’ve scaled to date.