Effective cybersecurity starts with a skilled and empowered team. In a world with more remote workers and an evolving threat landscape, you need creative problem solvers defending your organization. Unfortunately, many traditional security organizations operate in a way that discourages growth, leading to burnout and high turnover.
Sixty-six percent of IT professionals say they have considered finding a new job with less stress. Fifty-one percent are even willing to take a pay cut. And the average tenure of a cybersecurity analyst is only one to three years. Even if stressed employees don’t quit, they may become cynical or lose focus, putting your organization at risk. Given the huge talent shortage—estimated between one and three million cybersecurity professionals—it’s critical to understand some of the factors that lead to burnout, so you can grow and retain your team. In this blog, I’ll provide insights into what drives burnout and walk through recommendations for using automation, training, and metrics to build a more effective security organization.
Burnout in the security operations center
Burnout starts with a vicious cycle. Because management has a limited budget, they staff many of their positions with entry-level roles. Security organizations are inherently risk-averse, so managers are reticent to give low-skilled roles decision-making authority. Security professionals in such an environment have few opportunities to use creative problem-solving skills, limiting the opportunity for them to grow their skills. If their skills don’t grow, they don’t advance and neither does the organization.
This cycle was documented in 2015, when Usenix studied burnout in a security operations center (SOC). By embedding an anthropologically trained computer science graduate in a SOC for 6 months, researchers identified four key areas that interact with each other to contribute to job satisfaction:
- Skills: To effectively do their job, people need to know how to use security tools where they work. They also need to understand the security landscape and how it is changing.
- Empowerment: Autonomy plays a major role in boosting morale.
- Creativity: People often confront challenges that they haven’t seen before or that don’t map onto the SOC playbook. To uncover novel approaches they need to think outside the box, but creativity suffers when there is a lack of variation in operational tasks.
- Growth: Growth is when a security analyst gains intellectual capacity. There is a strong connection between creativity and growth.
Graphic from A Human Capital Model for Mitigating Security Analyst Burnout, USENIX Association, 2015.
To combat the vicious cycle of burnout, you need to create a positive connection between these four areas and turn it into a virtuous cycle. Strategic investments in growth, automation, and metrics can make a real difference without requiring you to rewrite roles. Many of these recommendations have been implemented in the Microsoft SOC, resulting in a high-performing culture. I also believe you can expand these learnings to your entire security organization, who may also be dealing with stress related to remote work and COVID-19.
Create a continuous learning culture
Managers are understandably wary about giving too much decision-making authority to junior employees with limited skills, but if you give them no opportunities to try new ideas they won’t improve. Look for lower-risk opportunities for Tier One analysts to think outside set procedures. They may periodically make mistakes, but if you foster a culture of continuous learning and a growth mindset they will gain new skills from the experience.
To advance skills on your team, it’s also important to invest in training. The threat landscape changes so rapidly that even your most senior analysts will need to dedicate time to stay up to date. The Microsoft SOC focuses its training on the following competencies:
- Technical tools/capabilities.
- Our organization (mission and assets being protected).
- Attackers (motivations, tools, techniques, habits, etc.).
Not all training should be formal. Most managers hire junior employees with the hope that they will learn on the job, but you need to create an environment that facilitates that. An apprenticeship model provides growth opportunities for both junior and senior members of your team.
Support operational efficiency with automation
At Microsoft, we believe the best use of artificial intelligence and automation is to support humans—not replace them. In the SOC, technology can reduce repetitive tasks so that people can focus on more complex threats and analysis. This allows defenders to use human intelligence to proactively hunt for adversaries that got past the first line of defense. Your organization will be more secure, and analysts can engage in interesting challenges.
Solutions like Microsoft Threat Protection can reduce some of the tedium involved in correlating threats across domains. Microsoft Threat Protection orchestrates across emails, endpoints, identity, and applications to automatically block attacks or prioritize incidents for analysts to pursue.
Azure Sentinel, a cloud-native SIEM, uses machine learning algorithms to reduce alert fatigue. Azure Sentinel can help identify complex, multi-stage attacks by using a probabilistic kill chain to combine low fidelity signals into a few actionable alerts.
It isn’t enough to apply machine learning to today’s monotonous challenges. Engage your team in active reflection and continuous improvement so they can finetune automation, playbooks, and other operations as circumstances change.
Track metrics that encourage growth
Every good SOC needs to track its progress to prove its value to the organization, make necessary improvements, and build the case for budgets. But don’t let your metrics become just another checklist. Measure data that is motivational to analysts and reflects the successes of the SOC. It’s also important to allocate the tracking of metrics to the right team members. For example, managers rather than analysts should be responsible for mapping metrics to budgets.
Time to acknowledgment: For any alert that has a track record of 90 percent true positive, Microsoft tracks how long between when an alert starts “blinking” and when an analyst starts the investigation.
Time to remediate: Microsoft tracks how long it takes to remediate an incident, so we can determine if we are reducing the time that attackers have access to our environment.
Incidents remediated manually and via automation: To evaluate the effectiveness of our automation technology and to ensure we are appropriately staffed, we track how many incidents we remediate via automation versus manual effort.
Escalations between tiers: We also track issues that are remediated through tiers to accurately capture the amount of work that is happening at each tier. For example, if an incident gets escalated from Tier One to Tier Two, we don’t want to fully attribute the work to Tier Two or we may end up understaffing Tier One.
As organizations continue to confront the COVID-19 pandemic and eventually move beyond it, many security teams will be asked to do more with less. A continuous learning culture that uses automation and metrics to encourage growth will help you build a creative, problem-solving culture that is able to master new skills.
Read more about Microsoft Threat Protection.
Find out about Azure Sentinel.