Adam Shostack here.
I said recently that I wanted to talk more about what I do. The core of what I do is help Microsoft’s product teams analyze the security of their designs by threat modeling. So I’m very concerned about how well we threat model, and how to help the folks I work with do it better. I’d like to start by talking about some of the things that make the design analysis process difficult, then what we’ve done to address those things.
As each team starts a new product cycle, they have to decide how much time to spend on the tasks that are involved in security. There’s competition for the time and attention of various people within a product team. Human nature is that if a process is easy or rewarding, people will spend time on it. If it’s not, they’ll do as little of it as they can get away with. So the process evolves, because, unlike Dr. No, we want to be aligned with what our product groups and customers want.
There have been a lot of variants of things called “threat modeling processes” at Microsoft, and a lot more in the wide world. People sometimes want to argue because they think Microsoft uses the term “threat modeling” differently than the rest of the world. This is only a little accurate. There is a community which uses questions like “what’s your threat model” to mean “which attackers are you trying to stop?” Microsoft uses “threat model” to mean “which attacks are you trying to stop?” There are other communities whose use is more like ours. (In this paragraph, I’m attempting to mitigate a denial of service threat, where prescriptivists try to drag us into a long discussion of how we’re using words.) The processes I’m critiquing here are the versions of threat modeling that are presented in the books Writing Secure Code, Threat Modeling, and The Security Development Lifecycle.
In this first post of a series on threat modeling, I’m going to talk a lot about problems we had in the past. In the next posts, I’ll talk about what the process looks like today, and why we’ve made the changes we’ve made. I want to be really clear that I’m not critiquing the people who have been threat modeling, or their work. A lot of people have put a tremendous amount of work in, and gotten some good results. There are all sorts of issues that our customers will never experience because of that work. I am critiquing the processes. I’m saying we can do better, that in places we already are doing better, and that I intend to ensure we continue to do better.
We ask feature teams to participate in threat modeling, rather than having a central team of security experts develop threat models. There’s a large trade-off associated with this choice. The benefit is that everyone thinks about security early. The cost is that we have to be very prescriptive in how we advise people to approach the problem. Some people are great at “think like an attacker,” but others have trouble. Even for the people who are good at it, putting a process in place is great for coverage, assurance and reproducibility. But having only experts use a process doesn’t expose its cracks the way asking everyone to participate does.
The first problem with ‘the threat modeling process’ is that there are a lot of processes. People, eager to threat model, had a number of TM processes to choose from, which led to confusion. If you’re a security expert, you might be able to select the right process. If you’re not, judging and analyzing the processes might be a lot like analyzing cancer treatments. Drugs? Radiation? Surgery? It’s scary, complex, and the wrong choice might lead to a lot of unnecessary pain. You want expert advice, and you want the experts to agree.
Most of the threat modeling processes previously taught at Microsoft were long and complex, having as many as 11 steps. That’s a lot of steps to remember. There are steps which are much easier if you’re an expert who understands the process. For example, ‘asset enumeration.’ Let’s say you’re threat modeling the GDI graphics library. What are the assets that GDI owns? A security expert might be able to answer the question, but anyone else will come to a screeching halt, and be unable to judge if they can skip this step and come back to it. (I’ll come back to the effects of this in a later post.)
I wasn’t around when the processes were created, and I don’t think there’s a lot of value in digging deeply into precisely how they got where they are. I believe the core issue is that people tried to bring proven techniques to a large audience, and didn’t catch some of the problems as the audience changed from experts to novices.
The final problem people ran into as they tried to get started was an overload of jargon and terms imported from security. We tossed around terms like repudiation as if everyone should know what they mean, and sometimes implied that people were stupid if they didn’t. (Repudiation is claiming that you didn’t do something. For example, “I didn’t write that email!” or “I don’t know what got into me last night!” You can repudiate something you really did, and you can repudiate something you didn’t do.) Using jargon sent several unfortunate messages:
- This is a process for experts only
- You’re not an expert
- You can tune out now
- We don’t really expect you to do this well
Of course, that wasn’t the intent, but it often was the effect.
The Disconnected Process
Another set of problems is that threat modeling can feel disconnected from the development process. The extreme programming folks are fond of only doing what they need to do to ship, and Microsoft shipped code without threat models for a long time. The further something is from the process of building code, the less likely it is to be complete and up to date. That problem was made worse because there weren’t a lot of people who would say “let me see the threat model for that.” So there wasn’t a lot of pressure to keep threat models up to date, even if teams had done a good job up front with them. There may be more pressure with other specs which are used by a broader set of people during development.
Once a team had started threat modeling, they had trouble knowing if they were doing a good job. Had they done enough? Was their threat model a good representation of the work they had done, or were planning to do? When we asked people to draw diagrams, we didn’t tell them when they could stop, or what details didn’t matter. When we asked them to brainstorm about threats, we didn’t guide them as to how many they should find. When they found threats, what were they supposed to do about them? This was easier when there was an expert in the room to provide advice on how to mitigate the threat effectively. How should they track them? Threats aren’t quite bugs: you can never remove a threat, only mitigate it. So perhaps it didn’t make sense to track them like bugs, but that left threats in limbo.
“Return on Investment”
The time invested often didn’t seem like it was paying off. Sometimes it really didn’t pay off. (David LeBlanc makes this point forcefully in “Threat Modeling the Bold Button is Boring.”) Sometimes it just felt that way. Larry Osterman made that point, unintentionally, in “Threat Modeling Again, Presenting the PlaySound Threat Model,” where he said “Let’s look at a slightly more interesting case where threat modeling exposes an issue.” Youch! But as I wrote in a comment on that post, “What you’ve been doing here is walking through a lot of possibilities. Some of those turn out to be uninteresting, and we learn something. Others (as we’ve discussed in email) were pretty clearly uninteresting.” It can be important to walk through those possibilities so we know they’re uninteresting. Of course, we’d like to reduce the time it takes to look at each uninteresting issue.
Larry Osterman lays out some other reasons threat modeling is hard in a blog post: http://blogs.msdn.com/larryosterman/archive/2007/08/30/threat-modeling-once-again.aspx
One thing that was realized very early on is that our early efforts at threat modeling were quite ad-hoc. We sat in a room and said “Hmm, what might the bad guys do to attack our product?” It turns out that this isn’t actually a BAD way of going about threat modeling, and if that’s all you do, you’re way better off than you were if you’d done nothing.
Why doesn’t it work? There are a couple of reasons:
- It takes a special mindset to think like a bad guy. Not everyone can switch into that mindset. For instance, I can’t think of the number of times I had to tell developers on my team “It doesn’t matter that you’ve checked the value on the client, you still need to check it on the server because the client that’s talking to your server might not be your code.”
- Developers tend to think in terms of what a customer needs. But many times, the things that make things really cool for a customer provide a superhighway for the bad guy to attack your code.
- It’s ad-hoc. Microsoft asks every single developer and program manager to threat model (because they’re the ones who know what the code is doing). Unfortunately that means that they’re not experts on threat modeling. Providing structure helps avoid mistakes.
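To make the client-versus-server point in Larry’s first reason concrete, here is a minimal Python sketch. It is not from Larry’s post or from any Microsoft code. The names OrderRequest, validate_order, handle_order and MAX_QUANTITY are hypothetical, invented purely for illustration. The idea it shows is that the server repeats the check, because it has no way to know whether the sender ran the client-side check at all.

```python
from dataclasses import dataclass

# Hypothetical business limit, assumed for illustration only.
MAX_QUANTITY = 100


class ValidationError(ValueError):
    """Raised when a request fails the server-side checks."""


@dataclass
class OrderRequest:
    # Fields as they arrive over the wire. Nothing here is trusted yet,
    # so quantity is typed as object rather than int.
    item_id: str
    quantity: object


def validate_order(request: OrderRequest) -> int:
    """Re-check the quantity on the server, whatever the client claims."""
    quantity = request.quantity
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(quantity, bool) or not isinstance(quantity, int):
        raise ValidationError("quantity must be an integer")
    if not 1 <= quantity <= MAX_QUANTITY:
        raise ValidationError(f"quantity must be between 1 and {MAX_QUANTITY}")
    return quantity


def handle_order(request: OrderRequest) -> str:
    """Hypothetical server handler: validate first, then act."""
    try:
        quantity = validate_order(request)
    except ValidationError as err:
        return f"rejected: {err}"
    return f"accepted: {quantity} x {request.item_id}"


if __name__ == "__main__":
    # A well-behaved client sends a value its own UI already checked.
    print(handle_order(OrderRequest("widget", 3)))
    # A client that is not our code can send anything it likes.
    print(handle_order(OrderRequest("widget", -5)))
    print(handle_order(OrderRequest("widget", "3; DROP TABLE orders")))
```

The client-side check still has value for the honest user, since it gives fast feedback. It just can’t be the only check, because an attacker can skip the client entirely.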
With all these problems, we still threat model, because it pays dividends. In the next posts, I’ll talk about what we’ve done to improve things, what the process looks like now, and perhaps a bit about what it might look like either in the future, or adopted by other organizations.