Why Vulnerability Scores Can’t Be Looked At In A Vacuum

Episode 4  |  22:41 min  |  03.10.2021
Takeaway 1 | 00:16 MIN
Sometimes A Number Is Just A Number
Takeaway 2 | 01:00 MIN
Ed-ception: A Tweet Thread-Turned-Blog-Turned-Podcast
Takeaway 3 | 01:21 MIN
Quick Explainer On Kenna Risk Scores
Takeaway 4 | 01:07 MIN
This Escalated Quickly
Takeaway 5 | 01:46 MIN
Why CVE-2021-1647 Has A Risk Score Of 51/100
Takeaway 6 | 00:45 MIN
D.R.E.A.M. - Distribution Rules Everything Around Me
Takeaway 7 | 00:59 MIN
<5% Of Vulns Are Rated Higher Than 51
Takeaway 8 | 00:27 MIN
Sasha Romanosky Is Always On Point
Takeaway 9 | 03:43 MIN
Comparing to CVSS Severity Distributions
Takeaway 10 | 02:06 MIN
Harking Back To Power Law Distributions
Takeaway 11 | 02:29 MIN
How Distribution Of Risk Applies To Remediation
Takeaway 12 | 00:55 MIN
Components Of The Highest Risk Vulns
Takeaway 13 | 03:21 MIN
Top 8 Features Of A CVE

Sometimes a number is just a number. Context, the information and environment around the number, is what really matters. We discuss how this concept holds especially true in vulnerability management and risk scoring.

Dan Mellinger: Today on Security Science, why vulnerability scores can't be looked at in a vacuum. Hello and thank you for joining us. I'm Dan Mellinger, and sometimes a number's just a number. Context, the information and environment around that number, is what really matters. This concept holds especially true in vulnerability management and risk scoring. With me to discuss today is the answer to every vulnerability score question ever and the high priest of risk-based vulnerability management, Ed Bellis. All hail the high priest Bellis. How's it going, Ed?

Ed Bellis: It's going wonderful. Thank you for having me yet again, Dan.

Dan Mellinger: Always a surprise what your title is going to end up being. For those following along at home, this might be a quick podcast, but we're basically basing this off a blog that itself was based off of a tweet thread from Ed. I will link both of those in the show notes on kennaresearch.com. Ultimately, Ed, we get this question primarily from customers, but every time a new CVE hits the news it's, why is this vuln scored a blank out of whatever? Real quick: Kenna risk scores in and of themselves are ranked on a 100-point scale. Just to orient everyone while we're doing this discussion, Ed, can you explain at a high level how risk is scored by us?

Ed Bellis: Yeah, yeah, sure. And probably one of the important points is that it is a risk score, and I'm sure we'll get into that later. Effectively, we have a couple of different risk scores that we look at, but this is very specific to a vulnerability risk score. And the vulnerability risk score is highly oriented around the likelihood of exploitation of that vulnerability. We look at a number of different factors, including things like, are there any sort of weaponized POCs, or even point-click-and-shoot exploits, for a given vulnerability? Are we actually seeing exploitations in the wild for it? What's the volume and velocity of those things? Do we see malware associated with it? How popular of a target is this vulnerability? What software does it affect? What operating systems does it affect? All of these different things ultimately boil into that zero-to-100 score that you talked about, which is to say, we think this is very likely or less likely to be exploited.
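For readers following along, the kind of weighted combination of exploitation signals Ed describes can be sketched in a few lines. The factor names and weights below are invented for illustration; they are not Kenna's actual model.

```python
# Illustrative sketch only: combine exploitation-likelihood signals into a
# 0-100 score. Factor names and weights are made up, not Kenna's real model.

def risk_score(signals):
    """signals maps factor name -> evidence strength in [0.0, 1.0]."""
    weights = {
        "exploited_in_wild": 0.35,        # active exploitation observed
        "weaponized_exploit": 0.25,       # point-click-and-shoot exploit exists
        "malware_associated": 0.15,       # known malware uses this vuln
        "exploit_volume_velocity": 0.15,  # how much activity, how fast
        "target_popularity": 0.10,        # how widely the software is deployed
    }
    raw = sum(w * signals.get(factor, 0.0) for factor, w in weights.items())
    return round(raw * 100)

# A weaponized exploit exists and the target is moderately popular:
print(risk_score({"weaponized_exploit": 1.0, "target_popularity": 0.5}))  # 30
```

A real model would be fit to observed exploitation data rather than hand-set weights, but the shape is the same: many signals, one 0-to-100 likelihood-oriented number.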

Dan Mellinger: Awesome. And that makes a lot of sense, the distinction between severity and risk. Severity is, how bad could this be? And a factor, like you're saying, with risk is, how likely is that even to happen? I like to think about it like I'm flying an airplane. The severity of an airplane going down would be really catastrophic. The overall risk is very, very low. I think less likely than being attacked by a shark or, I don't know, struck by lightning or whatever.

Ed Bellis: Depends on what you're doing.

Dan Mellinger: Depends on the airplane too.

Ed Bellis: Exactly. Yeah. That's a good point. And to be clear, when we're talking about this vulnerability risk score, we are talking about likelihood. Now, we do look at other things for the broader risk score, how it affects assets or groups of assets and things like that. But specifically for this conversation, to your point, that started as a tweet that turned into a blog post that is now a podcast, it's the vulnerability risk score that we're talking about here.

Dan Mellinger: That escalated quickly is what you're saying. Well, here, let's go to the story of this tweet thread that you put out there. We're looking at a vulnerability that came out, I believe, in the February Patch Tuesday. It was the Microsoft Defender bug, CVE-2021-1647.

Ed Bellis: Yeah. Yeah. And to be fair, that was just one of many examples. Your preamble to all of this is we get these questions all the time, which is why is this vuln scored so high? Or why is this vuln scored so low? And usually followed up by, I just looked at it in CVSS. I just looked at it in my scanner score. I just looked at it in somewhere else, which could have been a severity score, could have been another risk score, but they're comparing that score to the vulnerability risk score that we're talking about here and asking questions as to why so high or why so low? In this case, a lot of the questions came around is why so low? Which at the time I think we were scoring that vulnerability a 51 on a zero to 100 scale.

Dan Mellinger: Right. And so when people think about 51 out of 100, I know I jumped straight to this: that's 50%. I would be failing if I had this score in a class. Explain to us, why doesn't that make sense here?

Ed Bellis: Yeah. Or it's a medium, because it's right in the middle.

Dan Mellinger: Right in the middle.

Ed Bellis: Of zero to 100.

Dan Mellinger: We think in bell curves.

Ed Bellis: Yeah. Yeah, exactly. Except this isn't a bell curve and vulnerability risk is certainly not a bell curve. If you were to talk to Michael Roytman, I'm sure he would give us all a big, long lecture about power laws and how things are distributed within security.

Dan Mellinger: Funny, we just did a podcast on that.

Ed Bellis: Oh, look at that.

Dan Mellinger: Weird.

Ed Bellis: I'll be your setup guy. But the point is, what I entitled my rant on Twitter, if nothing else, was that distribution rules everything around me, and distribution here matters quite a bit. Because when I have a risk scoring system, or a severity scoring system, or whatever it is, and I'm using that scoring system to prioritize my remediation efforts, it's not whether it's a 51 or an 81 or a 21 or a 100. It's, how many vulnerabilities should I remediate before this one? And how important is it compared to everything else that I have to do? Because I'm going to do this in kind of a prescribed ranked order, if you will.

Dan Mellinger: A priority.

Ed Bellis: Yeah. And in this specific case, so you know, spoiler alert, if you look at the distribution of the Kenna risk scoring system, the majority of it is shifted below that midpoint. In other words, we think most vulnerabilities don't actually pose that great of a risk, or aren't that likely to be exploited, because the data says so. When you're looking at the overall distribution of the Kenna scoring system, a large majority of vulnerabilities are in that 25-to-35-or-so scoring range.

Dan Mellinger: Range.

Ed Bellis: And roughly less than 5%, I think about four and a half percent, of all vulnerabilities are scored higher than a 51. When I'm looking at that in terms of how I compare priorities and what I fix first, it's actually pretty important. It's more important than 95% of the vulnerabilities that are out there.

Dan Mellinger: That's really interesting. And I do want to give Sasha Romanosky a quick shout-out, because I don't even think you got through your tweet thread before he asked the setup question. And none of this is planned, by the way. Sasha, we have a cool relationship with him. He helped create CVSS. Dude's super smart. And his question was, 51 out of 100, on what scale? Sasha, you're always on it, way too fast.

Ed Bellis: And to his point, the other thing is, you're looking at that versus CVSS, but CVSS is a severity score, not a risk score. And just as we talked about, severity and risk are not the same. Now, all that said, if I was to use CVSS as my rank order, if you will, for remediation, well, where does that same vulnerability sit on the CVSS scale? First of all, CVSS is zero to 10, but they do use decimals, so it's a pretty easy conversion. If it's a 7.2, you can think of it as a 72 on a 100-point scale. And in this particular case, I'm pulling up my notes, I think the CVSS score for that was right around there, 7.2 or 7.3 for a V2. And I think the V3 score was just under eight, if I recall.

Dan Mellinger: 7.9, something like that.

Ed Bellis: Right. But if you look at it and said, "Oh, well, if I'm going to compare a 7.2 to a 51, or a 72 to a 51, 72 is higher than 51, therefore 72 is more important. CVSS thinks that this is a riskier, or at least a more important, vuln to fix than the Kenna risk score does." But if you look at the distributions of both CVSS V2 and CVSS V3, they're both shifted very much to the right, the higher end of the scale. In other words, what they're saying is almost all vulnerabilities, certainly most vulnerabilities, are important. In fact, if you sliced out exactly where this falls on the CVSS V2 or even the V3 score, they say that roughly 30% of all vulnerabilities score higher than this one. Again, if you're just looking at things in a vacuum and trying to compare these apples and oranges together, then you're going to come away with something that's wrong, because you're not looking at the full distribution of the score.
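The percentile point being made here can be illustrated with synthetic data: the same raw number means very different things under a left-shifted distribution than under a right-shifted one. The two distributions below are entirely made up; only their shapes, one shifted low and one shifted high, mirror the discussion.

```python
# Synthetic illustration: a score's meaning depends on its distribution.
import random

random.seed(1)

# Left-shifted "risk-like" scores: most mass around 25-35.
risk_like = [min(100, max(0, random.gauss(30, 12))) for _ in range(10_000)]
# Right-shifted "severity-like" scores: most mass at the high end.
severity_like = [min(100, max(0, random.gauss(70, 15))) for _ in range(10_000)]

def pct_above(scores, x):
    """Percentage of scores strictly greater than x."""
    return 100 * sum(s > x for s in scores) / len(scores)

# A 51 under the left-shifted curve beats far more of its distribution
# than a 72 does under the right-shifted one.
print(f"risk-like scores above 51: {pct_above(risk_like, 51):.1f}%")
print(f"severity-like scores above 72: {pct_above(severity_like, 72):.1f}%")
```

The raw comparison "72 > 51" points one way; the percentile comparison points the other, which is exactly the apples-and-oranges trap being described.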

Dan Mellinger: There's a delta, a difference, between CVSS V2 and V3. One thing that I really appreciate in the CVSS V3 guide of what changed and all that: they call out, note, CVSS V3 doesn't measure risk, we measure severity. And if you look at the distribution for V2, there are some that rank in at that CVSS two out of 10, but almost all of it is four and a half, five plus. And then if you look at V3, it's even more pronounced. Almost nothing is scored less than a three in CVSS V3. Everything is right around four and a half plus in the distribution, which I think represents severity. I think you'd be hard-pressed to find any security practitioner that says, "Hey, this vulnerability has almost no theoretical impact." It can be dangerous, that's why it's listed in MITRE's list. And CVSS reflects that because it's severity, which means kind of the technical impact that it could have, right?

Ed Bellis: Right. Exactly. And just looking at the CVSS V3 distribution, to your point, almost, not quite, 90% it looks like, of all CVEs are CVSS five and above. Effectively, if you're looking at it in that bubble again, or that vacuum, then you would say, "Whoa, well, everything is important then." Because everything is a medium or a high; there is no such thing as a low if you're just looking at it from a pure zero-to-10 number scoring system.

Dan Mellinger: Which makes sense. And then, harking back to the power law distribution with Michael, with Kenna, it's a risk score. We're not only looking at kind of the severity, this is bad. Remote code executions and other vulnerabilities are technically not good things most of the time. But you're also looking at the odds, the chances, and the activity that's going on around that. And so when we look at the distribution, like you said, 25 to 35 out of 100 is the majority of vulnerabilities. Because for the most part, we know from some of the work with Cyentia that, what, less than 5% of vulnerabilities are actively used overall, right?

Ed Bellis: Right. Or at least exploited in the wild. And then, you might certainly expand that a bit if you're just looking at things where exploits are available, either through proof of concepts or weaponized in some sort of way.

Dan Mellinger: Yeah. And that creates this very, very long tail of things that is very, very risky, but ultimately lower percentage of things you need to worry about.

Ed Bellis: Absolutely. To liken it back to all of those attributes that we talked about earlier that go into that probability score of exploitation. Which is what we've been talking about: the actual exploited in the wild, which is kind of a baseline. Okay, yes, the answer is it already is. And then all of the things that lead up to that point. Weaponized exploits are obviously one of the indicators that are highly indicative that ultimately we will see exploits in the wild for this, which we cover in a lot of the Prioritization to Prediction (P2P) reports. Prior to that you might look and say, "Well, what about the prevalence of that vulnerability across enterprises? How many enterprises actually have this vulnerability? That tends to be an indicator. Are there other forms of weaponization, like malware?" All those pieces that we talked about earlier are ultimately indicators that build up to an actual exploitation event, which gives us that kind of probability risk score.

Dan Mellinger: Got you. Distributions are really, really important ultimately.

Ed Bellis: I think they rule everything around me, Dan.

Dan Mellinger: I'm going to see if I can link C.R.E.A.M., because that is an amazing song. But it also reflects kind of the realities of patching and remediating and responding to threats as well. Most organizations have a limited pool of resources that they can throw at these things. That makes the prioritization one of the most critical factors, right?

Ed Bellis: Yeah, no, absolutely. Ultimately, if I've got a limited number of resources and a limited amount of time and attention, I want to make sure that the things that I am focused on are things that have a high likelihood of being exploited. If something is just not going to happen, let's focus on the stuff that not only could happen, but is probable to happen.

Dan Mellinger: Well, yeah. And what do we have, that one on remediation capacity we looked at with Cyentia? Roughly, on average, regardless of the size of the enterprise, they can handle roughly one out of every 10 new high-risk vulnerabilities that are introduced in their environments every single month, right?

Ed Bellis: Yeah. And that's the high risk. The definition there for the P2P reports was any vulnerability that had either a weaponized exploit associated with it or that we saw exploitation in the wild for, and they were still on average fixing one in 10. Now, the top performers we saw were anywhere in the 20 to 25% range, but that still means the majority of them are not able to handle it, at least in that 30-day timeframe, which we've also found is a pretty critical timeframe according to some of the other P2P reports.

Dan Mellinger: Ah, that makes perfect sense. And that would also speak to, in this case, this vulnerability, CVE-2021-1647: it's in the top 5%, 4.62% of overall risk. If you've got the capacity to take out 10%, this should be in that first wave of things you're addressing that month, right?

Ed Bellis: That's a great point. If you combine the data sets together, you start to say, "Okay, so this is fitting in that top 4% or 5%, and on average an organization can fix roughly 10% of their security debt in a 30-day timeframe." What we're saying is this is well within your range and something that you should take care of in those first 30 days.
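The capacity math here is simple enough to sketch: given a list of open vulns and the roughly one-in-ten monthly fix rate from the report, sort by risk and take the top slice. The CVE IDs and scores below are invented for illustration; only the one-in-ten ratio comes from the discussion.

```python
# Hypothetical sketch: prioritize within a fixed monthly remediation capacity.
# The vuln list and scores are invented, not real data.

def monthly_fix_list(vulns, capacity_ratio=0.10):
    """vulns is a list of (cve_id, risk_score); return the top slice by risk."""
    budget = max(1, int(len(vulns) * capacity_ratio))
    return sorted(vulns, key=lambda v: v[1], reverse=True)[:budget]

open_vulns = [
    ("CVE-2021-1647", 51),  # top ~5% of the score distribution
    ("CVE-AAAA-0001", 28), ("CVE-AAAA-0002", 31), ("CVE-AAAA-0003", 27),
    ("CVE-AAAA-0004", 33), ("CVE-AAAA-0005", 25), ("CVE-AAAA-0006", 30),
    ("CVE-AAAA-0007", 29), ("CVE-AAAA-0008", 35), ("CVE-AAAA-0009", 26),
]

# With capacity for 10% of 10 open vulns, the 51 makes the first wave.
print(monthly_fix_list(open_vulns))  # [('CVE-2021-1647', 51)]
```

The point is that a 51 which sits in the top 5% of the distribution lands comfortably inside a 10% monthly budget, even though "51 out of 100" sounds middling.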

Dan Mellinger: Makes perfect sense. And then I just wanted to go through and kind of close things out, because I like the scarcity of the Kenna score. It's always fun to pick out those top ones. At the end of the year, we looked at the vulns scored 100 out of 100, the worst of the worst, and that was a list of 203 vulnerabilities over the last decade.

Ed Bellis: Out of the how many were the total CVEs were there?

Dan Mellinger: Ah, over 100,000 I think through that time period.

Ed Bellis: Yeah, yeah. I think we had, in all of NVD, a little over 150,000, of which the vast majority were certainly in the last 10 years.

Dan Mellinger: Exactly. Right. And I forget that number offhand, but 203 ranked 100 out of 100 in a decade. That's crazy to me. But I actually did a little more math. We did a blog on this one, which I'll link as well, on the top eight high-risk capabilities, I guess you could say, of a CVE that make it pretty risky. Topping out that list at number one is remote code execution. No surprise there; if someone can go take advantage of something externally, that really lowers the barrier of entry. Number two on that list was actually memory corruption. I don't know, I'd love to get your feedback on this. Do you know why that is? Because I am not that technical.

Ed Bellis: Yeah. Obviously, that's kind of the category of the vulnerability. But there are plenty of examples I could give where you're looking at things that ultimately feed the severity as well. When you look at CVSS, it's very much based on these types of categories, amongst other things: is it local access? Is it network access? Does it require authentication? All of that sort of thing. And then, is it an RCE? Is it information disclosure? But then, you could also say, "Oh, well, Heartbleed was information disclosure." And typically information disclosure is looked at as something, ah, that's not that bad. It's not remote code execution. You're not executing code on my machine, you're not getting root, and all of these types of things. No, but in the case of Heartbleed, as an example, that information disclosure happened to be the information that you were trying to protect in the first place, through SSL or TLS. So while I wouldn't read too much into those, they are there for a reason, and they do affect severity, and ultimately severity should be baked into your risk. But it isn't risk, to Sasha's point.

Dan Mellinger: That's a good point. And I would also note, that just reminded me, when we were looking at this, a lot of these were combined. A lot of the CVEs actually had multiple of these features, quote unquote. RCE plus memory corruption allowed you basically to push arbitrary code to a system that wasn't supposed to have it in the first place. Or it was RCE mixed with a denial of service. You just crash it out and then you get control, type thing, right?

Ed Bellis: And to be fair, even when we're measuring risk, we do some NLP in terms of looking at the language around the description of the vulnerabilities, and scraping things like all of the references associated with that vulnerability to see what they're saying about it. And a lot of that ends up in there. You end up gleaning things like, oh, this appears to be a remote code execution vulnerability, and that could push things up or down as well. You end up building this model that, to your point, has a lot of different features, and those features affect it in different ways.

Dan Mellinger: Absolutely. Well, one thing I found not surprising, but interesting: there's not a ton of these, because it's generally frowned upon, but vendors coding backdoors or hard-coded passwords into systems do get identified. They don't happen very often, but when they do, they tend to be really, really bad.

Ed Bellis: SolarWinds comes to mind. I don't remember what happened there.

Dan Mellinger: Strange. It's kind of interesting, anyway. Ed, thanks so much. I think this is a good primer on how to think about distributions and context around stuff, especially when it comes to trying to judge vulnerabilities and their inherent risk. Any advice you'd give before we hop off here?

Ed Bellis: Probably one of my favorite quotes is from one of our favorite data scientists, Mr. Roytman, and I'll just read word for word what he said about this: "One of the things no one talks about in security is that we treat a decent score as being a seven, eight, nine, or 10. Imagine if every time there was a 20% chance of snow, the forecast said 90%?" That's effectively what we're doing here. Security is basically saying, "Everything is always bad, and it's just a degree of how bad: either it's bad or it's terrible, and there is nothing below that." We really need to reset into reality, which is that distribution, that power law, that basically says there's a very small number of these things that are really bad and really important and that you definitely should take care of, but there's a large portion of these that probably don't matter nearly as much as we tend to think, or at least talk about.

Dan Mellinger: Awesome. That helps orient oneself around the overall distribution of risk. Well, thanks so much, Ed, for joining us. I do want to call out that you can get (ISC)2 CPE credits on the podcast now. Pretty cool that you can get some continuing education credits just for listening to this session. To do so, you need to go check out kennasecurity.com/blog. You'll see the link to this podcast there, and there's a form fill. You put in your email address and your (ISC)2 number and it'll get you some extra credit there. Thanks so much, Ed, for joining us, and look forward to our next meeting. Take it easy.

Ed Bellis: Thanks Dan. Appreciate it.
