As a Salesforce employee, I sometimes forget that Salesforce is its own customer, whose customers in turn have their own customers. (Whoa…) Salesforce (the company) is built on and runs on Salesforce (the platform, product, and even culture), and we use a lot of the same Salesforce tools that our external customers do. We track and prioritize our work in Salesforce, and even use it to Chatter about knickknacks that we’re giving away. (Anyone want an action figure?)
In their Dreamforce session on November 7, 2017, three Salesforce leaders (Reena Mathew, Paul van der Staay, and Mike Christian) highlighted how the Technology & Products (T&P) organization uses Salesforce to continuously improve the Salesforce infrastructure and the many exciting products running on it. Read on to learn about the expansive ground that they covered!
Continuous Improvement within Salesforce Engineering
Improving customer experience “requires change continuously, across our different systems, to make our service better,” Vice President in Infrastructure Engineering Reena Mathew said. “Change also comes with risk, so how do we make sure we minimize impact when we roll out these changes to our customers across our systems?” The answer: through the continuous improvement model.
The continuous improvement model, which is informed by the work of W. Edwards Deming, cycles continuously through four phases: plan, do, check, act.
- Plan — Planning the change (for example, maintenance) we want to roll out
- Do — Staging the change, getting it reviewed and approved, and then executing it
- Check — Monitoring for any impact that the change might have on customers
- Act — Determining in a retrospective what we learned, which informs the next planning phase
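The loop can be sketched in a few lines of Python. This is only an illustration of how the act phase feeds the next plan phase, not how T&P implements it; the function name, the change names, and the wording of the lesson are all made up for the example.

```python
def run_cycle(change: str, prior_lessons: list[str], customer_impact: bool) -> list[str]:
    """Run one plan-do-check-act pass and return the lessons for the next plan."""
    # Plan: scope the change, taking earlier lessons into account.
    record = {"change": change, "inputs": list(prior_lessons)}
    # Do: staging, review, approval, and execution, collapsed into one flag here.
    record["executed"] = True
    # Check: note whether monitoring saw any customer impact.
    record["customer_impact"] = customer_impact
    # Act: the retrospective produces lessons that inform the next plan phase.
    lessons = list(prior_lessons)
    if record["customer_impact"]:
        lessons.append(f"Reduce blast radius for: {change}")
    return lessons


# Hypothetical changes; each cycle starts from what the previous one learned.
lessons: list[str] = []
for change, impact in [("database patch", True), ("router upgrade", False)]:
    lessons = run_cycle(change, lessons, impact)
```

The second cycle begins with the lesson from the first, which is the whole point of the model: nothing learned in a retrospective is thrown away.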
The Plan Phase
What Guides T&P Work: The V2MOM
At Salesforce, planning starts with the V2MOM, which Senior Release Manager Paul van der Staay called “a business plan for the company and every person at the company.” CEO Marc Benioff makes his V2MOM, and its contents trickle down through the rest of the company so that every employee knows and can clearly articulate how their work supports the company’s overall goals.
- Vision — What impact will you have this year?
- Values — What values are most important to you?
- Methods — What actions are required to meet your vision?
- Obstacles — What obstacles might stand in the way of your meeting that vision?
- Measures — How will you determine whether you’re getting your desired results?
Where T&P Manages Its Work: GUS
In T&P, we implement the continuous improvement model in an internal tool called GUS (Grand Unified System), which, you guessed it, was built on the Salesforce platform. It’s integrated with other Salesforce-internal tools, and it informed an AppExchange package called the Agile Accelerator. So if you want to run your engineering organization like T&P, download and install Agile Accelerator today. It’s free!
How T&P Engineers Get Work: Through Investigations and Your Ideas
To get and manage its work, T&P uses several custom objects and their records, all of which are in GUS. Some work comes through investigations, and other work comes from your ideas. (Still other work comes from internal initiatives and programs, but we won’t cover those here.)
Investigations
Say that an external customer sees unexpected behavior in their Salesforce service and logs a case through Salesforce Customer Support. That case comes into T&P in the form of an investigation record, which is assigned to a specific team. The team looks into the issue, and if they uncover a bug, they assign a bug record to whichever team must fix the code in question.
Your Ideas!
Who knows what you want better than you do? In T&P, we listen to your ideas for improving our products and service, and we get a lot of those ideas from our IdeaExchange on the Trailblazer Community. Every major release cycle, Salesforce product owners try to retire as many IdeaExchange points as possible, and when they find an idea they want to pursue, they create one or more user story records to note the details and requirements for making the idea a reality.
The Do Phase
Reporting on Work
At Salesforce, our teams are agile and use scrum, Kanban, or a combination of the two to deliver work incrementally and iteratively. During stand-up meetings, we share the progress that we’ve made on our bugs and user stories and update our scrum or Kanban walls in GUS appropriately.
Releasing Work
Because GUS is integrated with our core source code repository, we can stamp release records when engineers check in code. Each release record shows all the changes that have been made to a specific code branch, so when we want to revisit what was changed in that branch for a specific release, we can easily do that.
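To make the idea concrete, here is a minimal Python sketch of stamping a release record at check-in time and looking up a branch's changes later. GUS itself is built on Salesforce custom objects, so this is not its data model; `CheckIn`, `ReleaseLog`, the branch names, and the work item identifiers are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckIn:
    branch: str
    commit_id: str
    work_item: str  # hypothetical identifier of the bug or user story


class ReleaseLog:
    """Keeps every check-in grouped by the code branch it landed on."""

    def __init__(self) -> None:
        self._by_branch: dict[str, list[CheckIn]] = defaultdict(list)

    def stamp(self, check_in: CheckIn) -> None:
        # Called when an engineer checks in code.
        self._by_branch[check_in.branch].append(check_in)

    def changes_for(self, branch: str) -> list[CheckIn]:
        # Returns a copy so callers cannot alter the stored history.
        return list(self._by_branch.get(branch, []))


log = ReleaseLog()
log.stamp(CheckIn("release-a", "c1", "BUG-1"))
log.stamp(CheckIn("release-a", "c2", "STORY-7"))
log.stamp(CheckIn("release-b", "c3", "BUG-2"))
changed = [c.work_item for c in log.changes_for("release-a")]
```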
The Change Management Process
Want to guess how many changes Salesforce managed last year? Over 48,000. To effectively make and track that many changes, we follow a clearly defined change management process, which is also implemented in GUS. The vast majority of changes are seamless and have no impact on your service. But for those that do, T&P leverages its release records. On those records, we log any release events, which are then automatically pushed to the external-facing Trust website.
The Check Phase
Salesforce monitors its service with symptom-based monitoring and root-cause-based monitoring.
Symptom-Based Monitoring
To see whether a release is affecting any customers, Salesforce uses an internally developed tool called Refocus. Refocus runs on Heroku and uses synthetics to test the health of every service on every instance of Salesforce. (A Salesforce instance is a logically contained unit of the Salesforce infrastructure that hosts Salesforce orgs.) Refocus is also available as an open source project in a public Git repository, and some third-party customers are using it to monitor their Salesforce integrations.
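If synthetics are new to you, the idea is to run scripted probes that behave like a user and record whether each one succeeded. The Python sketch below shows that pattern only; it does not use or imitate the Refocus API, and the instance names, service names, and status labels are assumptions chosen for illustration.

```python
from typing import Callable

Probe = Callable[[], bool]


def run_synthetics(probes: dict[tuple[str, str], Probe]) -> dict[tuple[str, str], str]:
    """Run one probe per (instance, service) pair and map each to a status."""
    results: dict[tuple[str, str], str] = {}
    for key, probe in probes.items():
        try:
            healthy = probe()
        except Exception:
            # A probe that raises (a timeout, for example) counts as a failure.
            healthy = False
        results[key] = "OK" if healthy else "CRITICAL"
    return results


def timed_out() -> bool:
    raise TimeoutError("probe did not finish")


# Hypothetical probes; real ones would log in, call an API, and so on.
results = run_synthetics({
    ("NA1", "login"): lambda: True,
    ("NA1", "api"): lambda: False,
    ("EU2", "login"): timed_out,
})
```

Because each probe looks at a symptom (can a user log in?) rather than a cause (is a disk full?), a dashboard built this way tells you who is affected before you know why.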
Root-Cause-Based Monitoring
For this type of monitoring, Salesforce uses two consoles.
Global Operations Console (GOC++)
GOC++ is an alerting console that integrates several different Salesforce clouds.
- Every GOC++ alert is linked to a Salesforce Knowledge Base article, which explains what the alert is and how to act on it.
- From each GOC++ alert, Site Reliability engineers can also jump to a Chatter feed to communicate about the alert and keep any discussions on record.
- “Keeping alerts noise free is a never-ending, very difficult job, made much easier with Analytics,” VP of Infrastructure Customer Experience Mike Christian said. We use those analytics to determine what’s causing the majority of alerts and how to best tune any underlying issues.
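Finding out what causes the majority of alerts is a simple counting exercise at heart. Here is a rough Python sketch, assuming each alert carries a source label; the function name and the sample labels are hypothetical, and the analytics Salesforce actually uses are not described in the session.

```python
from collections import Counter


def top_alert_sources(alerts: list[str], share: float = 0.5) -> list[tuple[str, int]]:
    """Return the noisiest sources that together cover at least `share` of alerts."""
    counts = Counter(alerts)
    total = sum(counts.values())
    selected: list[tuple[str, int]] = []
    covered = 0
    # most_common() yields sources from noisiest to quietest.
    for source, count in counts.most_common():
        if covered >= share * total:
            break
        selected.append((source, count))
        covered += count
    return selected


# Hypothetical alert feed: ten alerts from three sources.
alerts = ["disk"] * 6 + ["cpu"] * 3 + ["network"]
noisiest = top_alert_sources(alerts)
broader = top_alert_sources(alerts, share=0.8)
```

Tuning the one or two sources at the top of this list usually removes more noise than chasing the long tail.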
Incident Management Console
From a GOC++ alert about a customer-impacting event, Site Reliability engineers can create an incident record in the Incident Management Console. “This is an attempt to bring all of the different pieces of communication required to manage an incident into one place,” Mike said. It unifies incident records, user comments, timers about posting to Trust, and more in a single interface.
Trust Website and Trust Messaging
When we do have an incident, we try to post what we know to customers on the Trust website within 5 minutes of detection. On Trust, which is built on the Salesforce platform and hosted on Heroku, you can see upcoming planned maintenance and track any events. You can also click those events to see details about them and sign up for targeted, proactive email notifications.
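The 5-minute target is easy to express as a countdown timer. The Python sketch below shows only the arithmetic; the names are hypothetical, and it says nothing about how the Incident Management Console implements its timers.

```python
from datetime import datetime, timedelta

# Target from the session: post to Trust within 5 minutes of detection.
TRUST_POST_TARGET = timedelta(minutes=5)


def trust_post_deadline(detected_at: datetime) -> datetime:
    """Latest time at which the first Trust post is still on target."""
    return detected_at + TRUST_POST_TARGET


def time_remaining(detected_at: datetime, now: datetime) -> timedelta:
    """Time left before the deadline, never less than zero."""
    return max(trust_post_deadline(detected_at) - now, timedelta(0))


# Hypothetical incident detected at 09:00; check the timer at 09:03 and 09:10.
detected = datetime(2017, 11, 7, 9, 0)
left_at_three = time_remaining(detected, datetime(2017, 11, 7, 9, 3))
left_at_ten = time_remaining(detected, datetime(2017, 11, 7, 9, 10))
```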
The Act Phase
When we’ve resolved an incident, we perform a root cause analysis (RCA) and log what we learned on a problem record in GUS. We then turn what we learned into planned work, which brings us back to the plan phase.
The Plan Phase
Just kidding! We aren’t going to recap the plan phase again. This graphic nicely summarizes the continuous improvement life cycle, though. Consider using it in your own business.
Now It’s Your Turn to Plan, Do, Check, and Act
- Install the Agile Accelerator to track your organization’s engineering work.
- Use the open source Refocus project to set up your own Refocus dashboard.
- Sign up for Trust notifications to be automatically notified about Salesforce service events.
Check out the recording of the presentation, “How the Salesforce Technology & Products Organization Runs on Salesforce”!