Let’s face it: most companies have an incident response plan (IRP) sitting somewhere that was adapted from an Internet template or supplied by a consulting service purely as a compliance or audit exercise. Such a plan was most likely written to “check the box”; it does not reflect reality and would probably not be effective when (not if!) you have an incident.
So, how do you create an effective incident response plan that can actually be used to guide incident response?
The Salesforce Computer Security Incident Response Team (CSIRT) uses and regularly tests our incident response plan. Other companies also leverage our IRP as a model for their own plans. The plan is a living document that is constantly refined. Salesforce has identified 10 steps that companies should take to create their own effective IRP.
1. Define what an “incident” is according to your organization
Before you can start planning for an incident, you have to determine your organization’s criteria for a security incident. First, do some document collection and determine how an incident is defined in current customer contract language, what compliance requirements your organization is under, and any regulations that may dictate what a security incident is for your organization. Next, sit down and run through some scenarios with key stakeholders (security leadership, business unit leadership, legal, compliance, etc.) and determine whether stakeholders in your organization would consider each scenario to be an incident. Draft your definition and get official signoff from your stakeholders. This definition is key to understanding when you need to invoke your incident response plan.
2. Determine the scope of your incident response plan
You need to consider whether the incident response plan is for your entire company or just a specific environment. Your plan can apply just to a single system, a single business unit, or your entire organization. Whatever your plan covers, you should consider having a centralized incident response plan that all other plans reference. At Salesforce, our overall process is the same no matter what part of the company is impacted; however, the stakeholders involved change based on the environment. Also, does the plan cover unintentional incidents that are not malicious? It should, because of the potential impact those incidents can have on your organization. You should determine this before you continue development of the plan.
3. Identify and train your stakeholders
Identifying every single participant in the incident response process can be a very time-consuming task; however, it is one of the most critical steps in developing and maintaining your incident response plan. Most of the best practices out there say to make sure you have your contacts in legal, public relations, and human resources, but there are many more. For instance, if your customers are impacted, should you engage your customer success group? What about your compliance team, internal communications team, help desk, physical security, partners, vendors, and business process outsourcing providers (BPOs)? In addition to these relevant groups, should you engage your Red Team? Yes, the Red Team! They probably know your environment and its vulnerabilities better than anyone else in the organization and can be a resource when you are in the middle of a response.
Best practices dictate that incident response contacts, technical contacts, business leadership, and customer-impacting groups (legal, PR, customer success) be identified for each environment, documented in a case management system, and engaged when necessary. This way, when there is an incident impacting that specific environment, you are able to pull these groups into the response efforts quickly. Here is a brief description of each group:
- Incident Response Contacts: execute technical incident response steps in this environment and have been designated and trained as Incident Responders.
- Technical Contacts: assist with technical tasks in this environment.
- Business Leadership Contacts: participate in the incident response decision process.
- Customer Impacting Group: participate in the incident response decision process for customer-impacting incidents.
To determine your stakeholders, run through some scenarios and determine who you would need to involve to fully detect, respond to, and contain an incident. Think about how you would create and distribute internal notifications and external notifications. Who would you need to involve to take extreme containment measures?
Once you have determined all of your stakeholders, make a list and document what specific role they would have in responding to an incident. Gather their contact information in a central location — remember, don’t just get their office phone since there is a good chance that they will be contacted outside of normal business hours in the event of an incident. Reach out to your stakeholders and set up a time to train them on the incident response process so they know what to expect and what is expected of them. You should consider setting up a schedule for routine training refresh of the incident response process (quarterly or annually) as well as a method for updating stakeholders on any changes to the incident response process.
Make sure to review your stakeholder list on a continual basis and keep it updated. You don’t want to slow down your response because the person you had as a stakeholder is no longer with the company.
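One way to keep the stakeholder list honest is to store it as structured records and check it automatically. The following is a minimal sketch, not a description of how Salesforce stores its contacts: the `Stakeholder` fields, the `find_contact_problems` helper, the sample names and phone numbers, and the 90-day re-verification window are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Stakeholder:
    name: str
    group: str                         # e.g. "Incident Response", "Technical"
    environment: str                   # environment this contact covers
    role: str                          # what they do during an incident
    office_phone: str
    after_hours_phone: Optional[str]   # None if not collected yet
    last_verified: date                # when the entry was last confirmed


def find_contact_problems(contacts: List[Stakeholder], today: date,
                          max_age_days: int = 90) -> List[str]:
    """List entries with no after-hours number or with stale verification.

    The 90-day default is an assumed review interval, not a recommendation
    from the article.
    """
    problems = []
    for c in contacts:
        if not c.after_hours_phone:
            problems.append(f"{c.name}: no after-hours contact number")
        if today - c.last_verified > timedelta(days=max_age_days):
            problems.append(
                f"{c.name}: not verified in the last {max_age_days} days")
    return problems


# Hypothetical sample data.
contacts = [
    Stakeholder("A. Example", "Incident Response", "production",
                "Incident responder", "555-0100", "555-0101",
                date(2024, 1, 10)),
    Stakeholder("B. Example", "Customer Impacting", "production",
                "Legal review", "555-0102", None, date(2023, 6, 1)),
]
problems = find_contact_problems(contacts, today=date(2024, 2, 1))
```

Running a check like this on a schedule turns "review the list on a continual basis" into a concrete task with a concrete output.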
4. Determine your incident response process
There are a few different models out there that guide how to respond to an incident. In general, they all follow a similar method of Prepare, Detect, Respond, Investigate, Contain, Eradicate, Remediate, and Lessons Learned. Determine what your process will be at a high level and then take it one or two levels down, detailing what the process really entails under each of those phases.
For example, in your “Detect” phase, you may have the following steps in the process:
- Triage the security report (alert or email report) and determine if it is an incident
- Analyze, Categorize, and Assign: Classify incidents by category, severity and sensitivity
- Create an incident in the case management system
- Assign impacted environment
- Detail what is currently occurring
- Assign incident type
- Assign incident severity
- Determine if it is customer impacting
- Assign an Incident Commander
- Escalate to the Incident Commander
- Bring the Incident Commander up to speed on the incident
Your process may be different — it should be what works for your organization, but whatever it is, it should be documented and understood by your stakeholders.
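The example “Detect” steps above can be sketched as a small case record plus two helper functions. This is only an illustration of the flow: the `Incident` class, its field names, `open_incident`, `escalate`, and the convention that severity 1 is the most severe are assumptions, and a real case management system would supply its own schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Incident:
    summary: str                  # what is currently occurring
    environment: str              # impacted environment
    incident_type: str            # e.g. "malware", "unauthorized access"
    severity: int                 # assumed convention: 1 = most severe
    customer_impacting: bool
    commander: Optional[str] = None
    timeline: List[str] = field(default_factory=list)

    def log(self, entry: str) -> None:
        """Append a UTC-timestamped entry to the incident timeline."""
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        self.timeline.append(f"{stamp} {entry}")


def open_incident(report: str, is_incident: bool,
                  **details) -> Optional[Incident]:
    """Triage a report; create a case only if it qualifies as an incident."""
    if not is_incident:
        return None
    incident = Incident(summary=report, **details)
    incident.log("Incident created")
    return incident


def escalate(incident: Incident, commander: str) -> None:
    """Assign an Incident Commander and record the escalation."""
    incident.commander = commander
    incident.log(f"Escalated to Incident Commander {commander}")


# Hypothetical walk-through of the Detect phase.
incident = open_incident(
    "Suspicious logins on build server", is_incident=True,
    environment="build", incident_type="unauthorized access",
    severity=2, customer_impacting=False,
)
escalate(incident, "J. Doe")
```

The timeline gives the Incident Commander a single place to get up to speed when the case is handed over.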
5. Develop your severity level definitions
Severity levels drive your response and reflect the impact on the organization. You don’t want to have so many severity levels that it delays determining whether an incident is one level or another. If there are other operational teams in your organization that use severity levels (e.g., NOC, SOC, Site Reliability), you may want to consider aligning with their severity levels so that when you state that an incident is a “Severity 1,” everyone is aware of what the impact is to the organization, whether it is an IT outage or a security incident.
Consider the following when developing your severity levels for security incidents:
- Impact to your brand or your customers’ brands
- Impact to your customers’ and employees’ trust in your ability to provide the confidentiality, integrity, and availability of environments and services
- Level of effort to respond (can your incident response team respond without any other team’s assistance or will it take a lot of resources from the company?)
- Number of employees impacted
- Impact to customers
- Targeted versus untargeted
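To show how these considerations might combine into a small number of levels, here is a hedged sketch that maps an assessment to three severities. Everything in it is illustrative: the `ImpactAssessment` fields, the three-level scale with 1 as the highest, and the thresholds (such as 100 employees) are assumptions that your stakeholders would need to replace with agreed definitions.

```python
from dataclasses import dataclass


@dataclass
class ImpactAssessment:
    brand_impact: bool          # could harm your brand or a customer's brand
    trust_impact: bool          # could erode customer or employee trust
    needs_outside_teams: bool   # response needs resources beyond the IR team
    employees_impacted: int
    customers_impacted: int
    targeted: bool              # aimed specifically at your organization


def classify_severity(a: ImpactAssessment) -> int:
    """Return 1 (highest) to 3 (lowest). Thresholds are illustrative only."""
    # Customer impact combined with brand or trust damage is treated as worst.
    if a.customers_impacted > 0 and (a.brand_impact or a.trust_impact):
        return 1
    # Any single serious factor raises the incident above the baseline.
    if (a.customers_impacted > 0 or a.targeted or a.needs_outside_teams
            or a.employees_impacted >= 100):
        return 2
    return 3


# Hypothetical assessments.
commodity_phish = ImpactAssessment(False, False, False, 3, 0, False)
targeted_phish = ImpactAssessment(False, False, False, 1, 0, True)
customer_breach = ImpactAssessment(True, True, True, 0, 250, True)
levels = [classify_severity(x)
          for x in (commodity_phish, targeted_phish, customer_breach)]
```

Keeping the rules this short is deliberate: with only a few levels and a few conditions, responders spend little time debating which level applies.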
6. Develop your communications plan and escalation matrix
Knowing how to communicate securely, whom to communicate with, and when to communicate is essential to incident response. Work out these logistics before an incident, because things can get very messy during one if there are multiple communication channels. Some things to keep in mind when communicating during a security incident:
- Follow the “need to know,” or principle of least privilege, concept when communicating security incident details.
- Establish a source of truth.
- Streamline communications.
- Keep leadership informed and set expectations for notifications and updates.
- Consider out-of-band communication methods.
Utilize an Escalation Matrix that details who gets contacted, how they are contacted, and when they are contacted. Have leadership sign off on the Escalation Matrix so expectations are set as to when they can expect initial notification and subsequent updates.
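An Escalation Matrix can be kept as plain data that answers the three questions above: who, how, and when. The sketch below is one possible shape, not a prescribed format; the `EscalationRule` fields, the audiences, the channels, and every time value are placeholders that leadership would set when signing off.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class EscalationRule:
    audience: str                # who gets contacted
    channel: str                 # how they are contacted
    notify_within_minutes: int   # when the initial notification is due
    update_every_minutes: int    # cadence for subsequent updates


# Placeholder values; keyed by severity, with 1 assumed to be the highest.
ESCALATION_MATRIX: Dict[int, List[EscalationRule]] = {
    1: [
        EscalationRule("Executive leadership", "phone bridge", 30, 60),
        EscalationRule("Legal and PR", "phone bridge", 30, 60),
        EscalationRule("Incident response team", "page", 5, 30),
    ],
    2: [
        EscalationRule("Security leadership", "email", 60, 240),
        EscalationRule("Incident response team", "page", 15, 60),
    ],
    3: [
        EscalationRule("Incident response team", "ticket", 240, 1440),
    ],
}


def rules_for(severity: int) -> List[EscalationRule]:
    """Look up who to contact for a severity; fail loudly on unknown levels."""
    if severity not in ESCALATION_MATRIX:
        raise ValueError(f"No escalation rules defined for severity {severity}")
    return ESCALATION_MATRIX[severity]


sev1_audiences = [rule.audience for rule in rules_for(1)]
```

Raising an error for an undefined severity, rather than returning an empty list, avoids silently notifying no one.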
For higher-severity incidents, consider using an automated notification system to contact stakeholders and have them join a bridge where they receive the details of the incident. Once the bridge is concluded, send written communication with the description of the incident, impact, current conditions, response tasks (actions), and any needs the response team has in order to respond effectively. Send updates on a periodic basis until the incident is resolved. Upon resolution, send another email notification stating that the incident is contained and all response tasks are complete.
By identifying and maintaining a stakeholder contact list, you can push out a notification to your stakeholders in seconds, ensuring you are not wasting any precious time with administrative hurdles.
7. Identify and define your incident types to determine your playbooks
Your IRP drives your high-level process, but for detailed processes describing how to respond to a specific type of incident, you need to have playbooks. Think about your environment. What types of incidents may impact your organization? There are standard lists out there, but no one list fits all organizations. Things to consider when creating your playbooks: Do you store customer data? What types of incidents may impact that environment? Most organizations should have a malware incident type (or two!), a denial of service type, unauthorized access, and others.
Identifying the types of incidents will allow you to determine what playbooks you need to create. Playbooks for a specific incident type should prescribe the steps to respond to and contain 90% of the incidents of that type. There will always be incidents that the playbook will not work for; those incidents are typically higher severity incidents that are more complex in nature.
Below is an example of what your incident playbooks should cover:
- How the incident is typically detected
- What the severity level would typically be depending on specific characteristics of the incident
- Stakeholders and their roles and responsibilities for this specific incident type
- Standard case details
- Resource documents
- Standard Response Tasks with the steps on how to complete each task
- When the incident can be resolved
Playbooks ensure incident handlers, no matter where in the world they are based, are all handling incidents in a consistent manner and that all stakeholders are aware of how we respond to specific types of incidents.
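Because every playbook should cover the same items, a simple completeness check can flag drafts that are not ready to publish. The sketch below assumes playbooks are stored as dictionaries; the section keys, the `missing_sections` helper, and the sample malware playbook are all hypothetical.

```python
from typing import Any, Dict, List

# One key per item in the playbook checklist above (key names are assumed).
REQUIRED_SECTIONS = (
    "detection",             # how the incident is typically detected
    "typical_severity",      # severity depending on characteristics
    "stakeholders",          # roles and responsibilities
    "case_details",          # standard case details
    "resources",             # resource documents
    "response_tasks",        # standard tasks with steps
    "resolution_criteria",   # when the incident can be resolved
)


def missing_sections(playbook: Dict[str, Any]) -> List[str]:
    """Return required sections that are absent or empty in a playbook."""
    return [name for name in REQUIRED_SECTIONS if not playbook.get(name)]


# Hypothetical draft playbook with two gaps.
malware_playbook = {
    "incident_type": "malware",
    "detection": "Endpoint protection alert or employee report",
    "typical_severity": "3 for one workstation; higher if servers are hit",
    "stakeholders": {"Help desk": "reimage the device"},
    "case_details": "Hostname, user, detection name",
    "resources": [],  # present but still empty
    "response_tasks": ["Isolate the host", "Collect evidence", "Reimage"],
}
gaps = missing_sections(malware_playbook)
```

Here `gaps` lists `resources` and `resolution_criteria`, the two sections the draft still needs.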
8. Create your Incident Response Plan
Once you have done all the groundwork, you just need to bring it all together in one place. Once the plan is developed, you should provide read-only access to the stakeholders and make sure the most current version is always available to them.
9. Test your Plan
You must exercise your plan to ensure all stakeholders are trained on the process. Additionally, testing the plan helps you identify gaps in your detection and response capability. Testing your plan does not need to be extensive; it can be a 60-minute exercise. The goals of the exercise are to:
- Ensure all stakeholders understand the process and their role
- Identify any gaps in your ability to detect, respond to, and contain an incident
- Identify any issues with the current process
In addition, make sure to conduct tabletop exercises when:
- Your organization acquires a new environment
- Environments change significantly
- Key players change
You should test your plan at least once per quarter.
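The testing cadence can be expressed as a small rule: an exercise is due when any trigger event has happened or when a quarter has passed since the last one. This is a minimal sketch; the `tabletop_due` function, its parameter names, and the use of 90 days to approximate a quarter are assumptions.

```python
from datetime import date, timedelta


def tabletop_due(last_exercise: date, today: date,
                 new_environment: bool = False,
                 significant_change: bool = False,
                 key_player_change: bool = False,
                 max_interval_days: int = 90) -> bool:
    """Return True if a trigger occurred or the interval has elapsed.

    90 days is used here as an approximation of "once per quarter".
    """
    if new_environment or significant_change or key_player_change:
        return True
    return today - last_exercise >= timedelta(days=max_interval_days)


# Hypothetical dates.
recently_tested = tabletop_due(date(2024, 1, 1), date(2024, 2, 1))
overdue = tabletop_due(date(2024, 1, 1), date(2024, 4, 15))
after_reorg = tabletop_due(date(2024, 1, 1), date(2024, 2, 1),
                           key_player_change=True)
```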
10. Continue to Improve
To make your IRP successful, continue to improve on it. No security incident is handled 100% perfectly. Always conduct reviews of your incidents and determine where changes in the process can be made, where more training could benefit the organization, and/or where additional technological capability could assist in detecting and responding faster.