This blog post summarizes a Dreamforce 2017 session that was delivered on Wednesday, November 8. To watch that session, check out the recording!
Beyond the shiny concourse of product features and furry critters at Dreamforce 2017, there were smaller, more focused information sessions about the Salesforce infrastructure for technically minded customers. These sessions are unique opportunities to hear directly from the leaders who literally keep the lights on at Dreamforce and consistently maintain the availability of our services above 99.98%.
With millions of daily users and trillions of yearly transactions, maintaining service availability requires a finely tuned supporting infrastructure, which Salesforce certainly has. But what about maintaining multiple infrastructures? As Salesforce has strategically acquired companies to expand and enhance our services, the leaders of the Infrastructure Engineering team have risen to the challenge of integrating these acquisitions as seamlessly as possible. Service availability is the main goal, but a close second is to integrate smartly so that the resulting service experience is even better. We refer to this innovative effort as Unified Service Delivery.
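To put that availability figure in perspective, the short calculation below converts an availability percentage into a downtime budget. It is only an illustrative sketch: the function name and the 365-day year are assumptions, and it says nothing about how Salesforce actually measures availability.

```python
MINUTES_PER_DAY = 24 * 60


def downtime_budget_minutes(availability_pct: float, period_days: int = 365) -> float:
    """Return the maximum downtime, in minutes, that still meets the target."""
    unavailable_fraction = 1 - availability_pct / 100
    return unavailable_fraction * period_days * MINUTES_PER_DAY


if __name__ == "__main__":
    budget = downtime_budget_minutes(99.98)
    print(f"99.98% over a year allows about {budget:.0f} minutes of downtime")
```

At 99.98%, that works out to roughly 105 minutes, or under two hours, across an entire year.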
Talking Shop
On Wednesday at Dreamforce, Dana Quinn, VP of Infrastructure Engineering, and Philip Jefferson, VP of Infrastructure Operations (Commerce Cloud), spoke about this effort in their Continuous Improvements in Service Availability session. Dana has been with Salesforce for 11 months and is a veteran of other prominent tech companies. Philip has been with Salesforce for 1 year, after 11 years at Demandware. Both are recent additions to our Salesforce Ohana, and their delivering the session together was a great example of the rapport that’s been established between our various teams.
To emphasize how Salesforce views the present as The Age of the Customer, Philip explained how Salesforce went from being a mostly B2B company to also being a B2C one with the additions of Marketing Cloud and Commerce Cloud. With that shift, we expanded our support so customers would continue to experience world-class service availability, all the time.
Dana shared our strategy to maintain our core CRM infrastructure while also adding to and improving it. For any region we expand into, we consider the best option, whether that’s building first-party data centers or using a public platform. We continue to refresh instances to support log-based replication, which allows them to be in a ready state (in Read-Only mode) rather than on standby. This supports continuous data integrity checks and enables faster site switching. Upgrades to our network and database servers are also part of our strategy.
Racks and Rows Within the Colos
As a major step towards Unified Service Delivery, some of our acquired infrastructures will become roomies with our core CRM infrastructure in colocation data centers, or colos. These colos will allow our grouped infrastructures to take advantage of shared services, like site switching and disaster recovery, while we continue to integrate them into one trusted infrastructure. Philip described what this type of colo will look like with our core CRM infrastructure grouped with Commerce Cloud. Paired with instances in another data center that’s linked by encrypted asynchronous replication, the colos will then be enabled for a unified failover strategy.
Philip provided an excellent example of how we’ll help meet future service availability goals by building on the infrastructure model that he and others established for Commerce Cloud. That model was developed to ensure that all customer transactions are the same whether they occur online, in a retail store, or through a mobile device. The model is also responsive and resilient during peak retail seasons. Security is supported by multiple layers of DDoS defense and by isolating customer data from the network. Infused with Einstein, our resulting infrastructure will be supercharged to support The Age of the Customer.
Detect, Resolve, and Remediate
Having been a Salesforce customer in a previous role, Dana has a unique perspective. He knows how crucial the processes and tools that help maintain our infrastructure are. He’s one of the leaders of a set of teams whose shared mission is to provide a seamless experience for our customers.
- Global Site Reliability continuously monitors for customer-impacting incidents and quickly resolves them when they occur.
- Site Switching enables our failover and disaster recovery strategies by ensuring continuous service availability.
- Service Management governs changes to our infrastructure and helps manage and eliminate problems when they occur.
- Service Hardening is dedicated to making our infrastructure more resilient by providing fixes and automated solutions.
- Customer Experience Tools creates monitoring, alerting, and incident management tools that help us detect, diagnose, and resolve incidents.
Continuous Improvement
Dana highlighted Continuous Site Switching for our core CRM infrastructure as one of our most recent availability improvements. Continuous Site Switching provides a continuous opportunity to validate the redirection of data traffic between paired instances in the same geographic region. Increased validation means more resiliency and helps our customers meet shifting compliance demands. Supported by log-based replication, these instances are now scheduled to switch in one direction approximately every 6 months during planned maintenance windows. Continuous Site Switching requires no preparation beyond what’s already done for these windows, including following the Infrastructure Best Practices.
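As a rough mental model of the switching described above, the sketch below treats a pair of instances as one active site and one read-only site that trade roles at each planned switch. Everything in it is hypothetical: the class and site names, the fixed 182-day interval standing in for "approximately every 6 months," and the start date are assumptions for illustration, not Salesforce tooling or a real schedule.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Assumption: "approximately every 6 months" modeled as a fixed 182-day interval.
SWITCH_INTERVAL = timedelta(days=182)


@dataclass
class InstancePair:
    """Two paired sites in the same region (site names are hypothetical)."""
    active: str
    read_only: str

    def switch(self) -> None:
        """Redirect traffic: the read-only site becomes active and vice versa."""
        self.active, self.read_only = self.read_only, self.active


def planned_switches(pair: InstancePair, first_switch: date, count: int) -> list[tuple[date, str]]:
    """Apply `count` switches to `pair` and return (date, active site afterwards)."""
    schedule = []
    when = first_switch
    for _ in range(count):
        pair.switch()  # each switch goes in one direction only
        schedule.append((when, pair.active))
        when += SWITCH_INTERVAL
    return schedule


if __name__ == "__main__":
    pair = InstancePair(active="site-a", read_only="site-b")
    for when, active in planned_switches(pair, date(2018, 1, 6), 3):
        print(when.isoformat(), "active:", active)
```

In this model, each switch moves traffic in one direction, so the pair alternates which site is active from one maintenance window to the next.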
Trust Us
The Salesforce Trust status site provides real-time status and forward-looking maintenance info about our core CRM instances. As the customer-facing side of our service availability efforts, the Trust status site is being improved, too. Dana and Philip highlighted our work towards unified status reporting, with Commerce Cloud and Marketing Cloud targeted to be unified with our core CRM by spring 2018. For subscribers of Trust Notifications, SMS notifications will become available around the same time.
Gratitude
Dana and Philip wrapped up the session by thanking our customers. Supporting customer success drives our commitment to service availability and the improved strategies that enable it.