
Deploying in the Six Figures and Beyond

Anthony Torrero Collins
Jan 19 - 5 min read

During Dreamforce 2017, Krishna Jagannathan and Hugo Haas explained how our technology teams complete approximately 100,000 deployments per year of the Lightning Platform with minimal customer disruption. See the original talk, and refer to the slides.

By Luisfi (Own work) [CC BY-SA 4.0], via Wikimedia Commons

Our Recipe for you

Delivering the highest standard in system availability, performance, and security is our top priority.

When it comes to providing our customers the best platform, our top priority is to deliver the highest standard in system availability, performance, and security. To maintain this standard, our deployment philosophy has five main points:

  1. We update frequently (daily) — and infrequently (seasonally)
  2. We use it before you do
  3. You have a chance to test it
  4. Production releases are staged and staggered
  5. Releases have a defined sequence

1. We Update Frequently — and Infrequently

Daily Updates

We call simple issue fixes and low impact UI updates daily updates because of their frequency. They don’t actually happen every day; it’s more like 300 or so times a year. Daily updates keep the Lightning Platform at the highest standard all year round.

Major Releases

The major releases happen three times a year: Spring, Summer, and Winter. Exact rollout days depend on the customer. Each of these releases takes more than 60 days to initiate and complete across the entire infrastructure. A lot of testing and planning precedes a major release, because we use these releases to roll out major platform features and new applications. Unlike a daily release, a major release can include database schema changes.

2. We Use it Before You Do

GUS, our Grand Unified System

Our employees are unabashedly vocal when it comes to things not working. So we deploy all updates to our internal instance of the Lightning Platform first. GUS, the Grand Unified System, is used by all development teams in all the ways you expect: Chatter, agile work, case management, the whole enchilada. We bake it in GUS for at least a month (yummm, baked enchilada…). If there’s something wrong, we usually find it pretty quickly. The affected portion of the update is sent back for more refinement. Back to the kitchen. GUS is always ahead of the curve.

3. You Have a Chance to Test It

Sandboxes

For major releases, after we validate that our GUS experience is correct, we move to the next stage: deploying the update to sandbox instances. Sandbox deployments happen gradually, starting with our internal sandbox instances and then moving to a select group of “preview” sandbox instances that customers can opt into. Here’s where you find out about any undesirable effects of the update on custom implementations. As with the GUS stage, issues are removed from the update and routed back to the owning team. When those initial sandbox deployments are successful, we deploy to the rest of the sandboxes.

Note: Daily releases occur frequently enough that intermediate sandbox deployment is not needed.

Production

By the time a feature is staged for Production, it has been vetted by internal employees on GUS, and by customers on the preview sandbox instances. The jump to prod is highly likely to be seamless. In the event of something squeaking by, we follow it up right away with corrective daily releases.
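To make the path concrete, here is a minimal sketch of the promotion flow in Python. The stage names, the promote function, and the passes_stage callback are all hypothetical, chosen for illustration; they are not our actual tooling. The sketch models a single change that either clears every stage or is routed back to its owning team at the first stage where an issue appears.

```python
# Hypothetical stage names, in the order described in this article.
MAJOR_STAGES = [
    "gus",
    "internal_sandboxes",
    "preview_sandboxes",
    "remaining_sandboxes",
    "production",
]
# Daily releases skip the intermediate sandbox stages.
DAILY_STAGES = ["gus", "production"]


def promote(change, stages, passes_stage):
    """Walk one change through the stages in order.

    Returns (completed_stages, failed_stage). failed_stage is None when the
    change reaches the end; otherwise it names the stage where an issue was
    found and the change goes back to the owning team.
    """
    completed = []
    for stage in stages:
        if not passes_stage(change, stage):
            return completed, stage
        completed.append(stage)
    return completed, None


# Example: an issue shows up on the preview sandboxes.
done, failed_at = promote(
    "new-feature", MAJOR_STAGES, lambda change, stage: stage != "preview_sandboxes"
)
print(done, failed_at)  # ['gus', 'internal_sandboxes'] preview_sandboxes
```

The point of the ordering is that each stage exposes the change to a wider audience only after the previous, smaller audience has accepted it.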

4. Releases are Staged, and Staggered Across Regions

In preparation for deployment, we create a package (an archive containing all files needed for the update). The package is delivered to each data center, unpacked, and staged for installation. Everything is verified in place for the upgrade to begin.

After being staged at the data center, the package is scheduled to deploy during a regionally-specific off-peak time. This low-use period is called a Green Window. As no two regions have the same Green Window, this means that the update is staggered among geographical regions.
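As a rough illustration of why regional off-peak windows stagger a rollout, the sketch below converts the same local start hour into UTC for a few regions. The region names, UTC offsets, and the 02:00 local start time are assumptions made up for this example; real Green Windows are chosen per region and are not described here.

```python
from datetime import date, datetime, timedelta, timezone

# Hypothetical regions and UTC offsets in hours (illustration only).
REGION_UTC_OFFSETS = {"apac": 9, "emea": 1, "amer": -5}
# Assumed local hour at which the off-peak window opens.
LOCAL_WINDOW_START_HOUR = 2


def green_window_starts(day, offsets=REGION_UTC_OFFSETS):
    """Map each region to the UTC time its window opens on the given local date."""
    starts = {}
    for region, offset in offsets.items():
        local_zone = timezone(timedelta(hours=offset))
        local_start = datetime(
            day.year, day.month, day.day, LOCAL_WINDOW_START_HOUR, tzinfo=local_zone
        )
        starts[region] = local_start.astimezone(timezone.utc)
    return starts


def deployment_order(starts):
    """Regions sorted by when their window opens in UTC."""
    return sorted(starts, key=starts.get)


starts = green_window_starts(date(2018, 1, 19))
print(deployment_order(starts))  # ['apac', 'emea', 'amer']
```

Because the same local hour lands at a different UTC time in each region, the package that was staged everywhere in advance is installed region by region rather than all at once.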

5. Each Release has a Safe and Predictable Sequence

Daily Sequence

The sequence of events for daily releases takes about 30 minutes:

  1. For each customer, traffic is directed to half the customer’s application hosts. Through experience, we know that this reduced set of hosts can easily handle the load for the short time that the reduction is in place.
  2. The now dormant hosts are taken offline.
  3. The dormant hosts are upgraded.
  4. The newly upgraded hosts are brought online to accept traffic. At this point, the mix of hosts is on one of two versions — perfectly fine.
  5. Customer traffic is directed to the upgraded hosts.
  6. The now dormant hosts are taken offline.
  7. The dormant hosts are upgraded.
  8. The newly upgraded hosts are brought online, and are enabled to share the customer traffic.
  9. Life is good.
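The daily sequence can be sketched as a two-pass rolling upgrade. This is a toy model, not our deployment code: the Host class, the daily_release function, and the version strings are hypothetical, and real traffic routing is reduced to a boolean flag on each host.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    version: str
    serving: bool = True  # True while the host accepts customer traffic


def daily_release(hosts, new_version):
    """Upgrade the hosts in two halves so that one half is always serving.

    Returns the lowest number of hosts that were serving at any point.
    """
    middle = len(hosts) // 2
    halves = (hosts[:middle], hosts[middle:])
    lowest_serving = len(hosts)
    for group in halves:
        # Traffic moves to the other half; this group goes dormant and offline.
        for host in group:
            host.serving = False
        lowest_serving = min(lowest_serving, sum(h.serving for h in hosts))
        # Upgrade while the group receives no traffic.
        for host in group:
            host.version = new_version
        # Bring the upgraded group back online.
        for host in group:
            host.serving = True
    return lowest_serving


hosts = [Host(f"app{i}", "v1") for i in range(4)]
print(daily_release(hosts, "v2"))  # 2
```

With four hosts, capacity never drops below two serving hosts, and between the two passes the pool briefly runs a mix of old and new versions, as in step 4 of the sequence.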

Seasonal (Major) Sequence

The sequence of events for a Major release is similar to that of a daily release. The scope of the updates (which can include the database schema) requires more planning.

  1. For each customer, traffic is directed to half the customer’s application hosts. As with a daily release, we know from experience that this reduced set of hosts can easily handle the load for the short time that the reduction is in place.
  2. The now dormant hosts are taken offline and upgraded.
  3. Customer traffic is completely suspended on both the active and recently upgraded hosts. (Downtime is normally 5 minutes or less.)
  4. The database schema is upgraded.
  5. The newly upgraded hosts are brought online, ready to accept customer traffic.
  6. Customer traffic is restored to the upgraded hosts. All active application hosts are running the same (new) version of the platform.
  7. The remaining hosts are taken offline and upgraded.
  8. The newly upgraded hosts are brought online and enabled to share customer traffic.
  9. Life is really good.
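The key difference from the daily sequence is the short window in which no host serves traffic while the schema changes. The sketch below walks through the same steps and records how many hosts are serving after each one. The major_release function, the dictionaries, and the step labels are hypothetical and exist only for this illustration.

```python
def major_release(host_versions, database, new_version):
    """Simulate a major release for one instance.

    host_versions maps a host name to its platform version; database is a
    dict with a "schema" key. Both are updated in place. Returns a list of
    (step, hosts_serving) tuples in the order the steps happen.
    """
    names = sorted(host_versions)
    middle = len(names) // 2
    first_half, second_half = names[:middle], names[middle:]
    log = []

    # Steps 1-2: traffic moves to the second half; the first half is upgraded.
    serving = set(second_half)
    for name in first_half:
        host_versions[name] = new_version
    log.append(("first half upgraded", len(serving)))

    # Step 3: all customer traffic is suspended (the brief downtime).
    serving = set()
    log.append(("traffic suspended", len(serving)))

    # Step 4: the schema changes while no host is serving.
    database["schema"] = new_version
    log.append(("schema upgraded", len(serving)))

    # Steps 5-6: traffic resumes on the upgraded hosts only.
    serving = set(first_half)
    log.append(("traffic restored", len(serving)))

    # Steps 7-8: the remaining hosts are upgraded and rejoin the pool.
    for name in second_half:
        host_versions[name] = new_version
    serving = set(names)
    log.append(("all hosts upgraded", len(serving)))
    return log


hosts = {"app1": "v1", "app2": "v1", "app3": "v1", "app4": "v1"}
db = {"schema": "v1"}
for step, count in major_release(hosts, db, "v2"):
    print(step, count)
```

In this model, old-version hosts never serve traffic against the new schema: traffic is restored only to hosts that already run the new version.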

So Where Does Six Figures Come in?

Currently, there are 160 active instances of the Lightning Platform worldwide (where “instance” refers to a named, multitenant cluster of machines running the Salesforce application; check out the full instance list). Each of these has one dedicated ready instance available for immediate switchover, which makes 160 + 160 = 320 instances in all. Between daily and major releases, we make over 300 deployments a year to each one. So (160 + 160) * 300 = 96,000. Okay, it’s shy of 100K. But who’s counting? (Well, we are, actually. With our customers’ growth, we’ll be there real soon!)
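For anyone who wants to check the math, here is the count as a few lines of Python. It assumes, as the 96,000 total implies, that every active instance and its paired ready instance each receive all of the roughly 300 yearly deployments; the variable names are ours.

```python
active_instances = 160
# One dedicated ready instance per active instance.
ready_instances = active_instances
deployments_per_instance_per_year = 300

total_deployments = (active_instances + ready_instances) * deployments_per_instance_per_year
print(total_deployments)  # 96000
```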

Continuous Innovation sets Our Future

Releasing high quality, beneficial change to production quickly, frequently, and seamlessly.

We have a straightforward and proven process to keep everything at the highest quality. But even though we have a great system, we aren’t sitting on our hands. Salesforce as a company is open about its plans for future growth, and we recognize that for quality, security, and performance to keep pace with our goals, our deployment infrastructure needs to evolve. In the works now are programs for fully autonomous system updates that will enable faster rollouts, database schema updates, and much more efficient host switching.

Watch the original talk. You can also review the presentation that was used in the talk.
