Dive into a quick history of the CVE List as we kick off a quarterly update that tracks the progress of new CVEs issued.
Dan Mellinger: Today on Security Science, counting CVEs. Hello, and thanks for joining us. I'm Dan Mellinger, and today we're kicking off a quarterly update that tracks the progress of new CVEs issued on this quarterly basis. With me today is the... Wait. What do I... Certificate or CVE numbering arbiter of CVE himself, director of security research here at Kenna Security, Jerry Gamblin. How's it going, Jerry?
Jerry Gamblin: Good. I prefer The Count, one CVE, two CVEs.
Dan Mellinger: The count of CVE. That is awesome. Man, I can't believe I missed that opportunity. Oh, well. We'll keep going. You do get to add CNA to your list of job titles now though. So I do want to kick off with a little bit of history into what exactly CVEs are because we talk about them a lot, I mean, vulnerability management. You're in cybersecurity. You know what CVEs are. I don't know how many people know the full history, so we'll do that in brief detail. But first, Jerry, what got you interested in kind of tracking this number? I understand you have kind of a bet with Ed.
Jerry Gamblin: Yeah. A bet with Ed about how many CVEs there are going to be this year. I think we've covered it in an earlier podcast, the rise of CNAs, which are numbering authorities that are allowed to issue CVEs. And there are a bunch of them this year, and they keep getting added. So we're looking for the number of CVEs to spike, I'm guessing, above 20,000 this year.
Dan Mellinger: That's your over.
Jerry Gamblin: That's my over, is over 20.
Dan Mellinger: So what's Ed think it's going to be?
Jerry Gamblin: He agrees.
Dan Mellinger: Under?
Jerry Gamblin: Yeah. It's not a super big bet, but we're expecting it to blow up.
Dan Mellinger: What I find is normally when you've got some kind of a bet going, it makes for the best tracking. Right?
Jerry Gamblin: Yeah.
Dan Mellinger: People really get into it. So anyway, we'll get to the numbers real quick. First, I was just going to kick off. I wanted to do a brief history on CVEs just so we've kind of documented it on the podcast because we've never really gone into that much detail. So roll back the clocks, it's 1999, January. And the original concept for what would become the CVE list, it was presented by the co-creators of CVE, MITRE Corporation's David E. Mann and Steve M. Christey. So it was a white paper that was entitled Towards a Common Enumeration of Vulnerabilities. So at the 2nd Workshop on Research with Security Vulnerability Databases, and that was at Purdue University. So January 1999.
Jerry Gamblin: Shout out to Indiana.
Dan Mellinger: Yeah, Indiana. And from that original concept, a working group was formed. Ultimately, those people would become the initial 19-member CVE editorial board. And it all started, important number to remember, with 321 CVE entries. So that's where all of this started way back in September 1999. So how many years has it been now?
Jerry Gamblin: 22.
Dan Mellinger: 22, 22 years, 321 CVEs, so vulnerabilities that existed went from zero to 321 in 1999. And Jerry will tell us how many it's gone to since January of this year. Within a year, so by December 2000, there were 29 different organizations, so Microsoft I'm sure, Cisco, things like that, were actually building products that were compatible with the CVE numbering scheme. So they were basically mapping themselves to this list. And then another significant factor to adoption is security advisories started using CVE IDs, basically to be like, "Hey, people. We know that there's this bad thing out there. Here's the CVE ID. Go reference that list for more detail." Right? From there, it kind of makes sense, right? Numerous OS vendors started including CVE IDs into their alerts to ensure that everyone kind of knew what was going on as soon as it was announced, which is still a best practice. And we just actually recently did a responsible, quote, unquote, slash coordinated disclosure podcast with Ed recently. Another big pickup is a lot of these kind of public watch lists, so vulnerability watch lists started using CVE as the basis. So OWASP Top 10, web application security started including CVE IDs, for lack of a better storytelling. Right? This is how they became named, so we started getting publications, news media, [inaudible], Dark Readings of the world are using CVEs and sometimes fun marketing based names for vulnerabilities, but use CVEs as the basis to name and basically provide a reference to any of the vulnerabilities of the day that they're talking about. Woo, from there, we go into the National Institute of Standards and Technology, so NIST, if you work in the government, you know what NIST is. Initially released in 2002 and updated in 2011, they started recommending the use of CVE for any type of software, any of the hardware that had to do with networking or technology.
And then in June 2004, the US Defense Information Systems Agency, DISA, issued a task order for information assurance applications that requires the use of products that use CVE identifiers. So now not only are you encouraged, but you have to use products that use CVE identifiers. And then that'll kind of bring us to where we are today. CVEs are rapidly expanding, and this is primarily due to the rollout of CVE numbering authorities, or CNAs. So while Jerry is the count of CVEs, CNAs are these numbering authorities. And this program is basically, they allow certain software vendors and technology providers to reserve and submit their own vulnerabilities. So every CVE entry added to the list is assigned by a CNA today, so Microsoft, Apple, Amazon, they're all CNAs today. And that list keeps growing and growing, and it's kind of automated, and there's no controls. And almost anything could be submitted, so that kind of brings me to where we are today, Jerry. Jerry, CNAs, how do you feel about that?
Jerry Gamblin: Let's have a rant. CNAs are a great idea. We need more people to be able to report CVEs. The issue is the data quality isn't always the same. Right? I believe there are around 12 fields that need to be filled out for every CVE for the NVD. They're supposed to be filled out and completely filled out before they're submitted, and submitted on a timely basis. Not everybody's doing that. Not everybody has the same definition of what qualifies for a CVE and what doesn't qualify for a CVE. I did the CVE stuffing blog earlier this year, about all those Docker container vulnerabilities that came out. The CVE board actually went back and said, "Yeah, we shouldn't have issued those." And they're rewriting the rules on that. Just recently, one single version of a Starbucks mobile app for iOS got a unique CVE, which is an interesting use case. Right? That's kind of where this how many CVEs are there going to be started. If you just think about every time you update your apps on your phone, I think Starbucks updates its mobile app every other day it seems, from my app store.
Dan Mellinger: I would say so.
Jerry Gamblin: So if they started cutting a CVE for every one of those, every time they fix something that's security related, Starbucks applications could have 1,000 CVEs on their own. So it really goes back to what you were talking about. The original idea for CVEs was to make a universal tracking data set, so that people could know and understand. And to be completely honest, and we can actually do another blog on this and we probably should, about the data quality, until about 2014, 2015, it wasn't... To see complete data sets in the CVE data was super rare. Right? It was always missing something. About 2015, I think that the government stepped in. Michael knows the backstory because I've talked to him about it before. And they said, "You've got to clean up the data." So I think they hired a bunch of interns. And from 2015 to today, you don't see very many incomplete CVEs in the data set. But back from 2015 to 1999, it's not odd to see just big holes, CVEs with maybe just a name, no description. So they've got better, but the opening of the CNA has probably pushed them back towards the Wild West. Just for a little bit, I'm hoping until everybody can get on the same page of what needs to be in there, and what really should be a CVE and what shouldn't be.
Dan Mellinger: It's interesting you bring that up. When I was looking over the kind of top vulnerabilities of the decade, I was seeing this commonly with Adobe was a good example, actually. Almost all of their vulnerabilities were grouped into these kind of very broad classifications. And you can tell they just cut and pasted the same description for everything that was like, "This is a cross-site scripting with remote code execution." I'm like, "That's a big deal. How bad is this?" But they had the same exact description of the vulnerability for every single one of them. It was Adobe in 2001 was a little bit...
Jerry Gamblin: Yeah. It was just something they had to do. They didn't know it was going to go in a note. But it's become the glue that kind of holds the internet and the security world together. We've seen some issues recently with it not updating. Our vuln of the month for Kenna last month didn't have a complete NVD page until three days before we printed it. It was that Windows Defender vulnerability that was patched in February, and we were writing a blog about it. And on March 3rd or 4th, whenever, we're like, "Hey, the NVD data still isn't here." Right? And they finally updated it. But Microsoft had a full CVE information page in the CNA that had all the data. But MITRE had just frozen up and didn't get that data updated in that 30 days. So I mean, it's causing issues in the security community around that quite a bit.
Dan Mellinger: Yeah, yeah. And rumor has it that your CVE stuffing blog... So Jerry wrote a blog. It was on your personal blog. Right?
Jerry Gamblin: Yeah. Yeah.
Dan Mellinger: And tweeted it out, just talking about this, I think your example was a ton of these kind of cross site scripting vulns for different websites and web applications. Right?
Jerry Gamblin: It was a container vulnerability that allowed root, but it wasn't exploitable in most cases. So it was a low level CVE that was part of Alpine OS, and they had patched that.
Dan Mellinger: There you go.
Jerry Gamblin: But somebody went and found every container that had that version of Alpine OS and started cutting CVEs for it.
Dan Mellinger: Which goes back to our discussion on kind of using open source tools and using other people's code. And we know that most code is someone else's code. Right?
Jerry Gamblin: 95% is what we're seeing, yep.
Dan Mellinger: And so if you go back and you use something that has a CVE in it, and now these people are going through and like, "Okay. Let's mine every single web app, piece of software that has ever used this code base," then you can basically, to your point, stuff with things that are less meaningful and get a ton of CVEs under your belt, I guess, if that's your main goal. Tell your boss, "I found this many CVEs this month."
Jerry Gamblin: It used to be a badge of honor, and I guess it still is, as being a researcher, having your name connected with a CVE means, hey, you found something. You've done the right cycle. And they fixed it. It means you probably worked through the right kind of disclosure paths. So it's a badge of honor, but it might just be too easy now if there's 25,000 of them a year or whatever.
Dan Mellinger: Yeah. That makes a ton of sense. Well, and rumor has it that blog created a couple day board meetings I think too.
Jerry Gamblin: It's not a rumor. We can link to the board meeting minutes where they talked about it.
Dan Mellinger: Yeah. So Jerry's blog got the CVE editorial board trying to address this issue of whether or not these vulnerabilities should exist the way they are. And do you know if they came up to a resolution on that?
Jerry Gamblin: They decided that they shouldn't be in there, but they're trying to figure out how to write the rules more clearly so that people giving out CVEs know. Right? Looking at it, I mean, it's the problem. Right? Honestly, it's pretty obvious that those shouldn't have been cut. But how do you write a rule that says, "Don't give out these CVEs," and make it understandable to the CNAs?
Dan Mellinger: Well, also, Jerry's changing the CVE process live here. But I think that's some good background, just so the audience has a good grounded understanding of kind of the history of CVEs, and kind of where we're at now, which is unique because of having a lot of CNAs, companies that can issue these. Is it ultimately a good thing? I think Jerry, to your point, there's just some teething issues on: How do we standardize on this? Because some things are prone to human interpretation, so they may not be as cut and dried as we need them to be. But let's get into it. What does the first quarter of vulnerabilities look like, Jerry?
Jerry Gamblin: So in the first quarter of 2021, we've had 2775 unique CVEs issued by NVD.
Dan Mellinger: Wow.
Jerry Gamblin: That is 31 per day, every day of the year.
Dan Mellinger: Okay. So let's go back. The very first kickoff of CVE started with 321 entries. I think if we track back a little bit, we maybe broke 5000 in the early 2010s, now you're telling me we have over 2500 ish.
Jerry Gamblin: Yep, in the first quarter. Do you want to guess how many were in the first quarter of 2011?
Dan Mellinger: 1500?
Jerry Gamblin: 716.
Dan Mellinger: Wow.
Jerry Gamblin: So I think it's a 290% increase from 2011 to 2021.
Dan Mellinger: Wow. That is, so 700 to now. What'd you say the number was again?
Jerry Gamblin: 27, let me... Yep, 2775.
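The arithmetic behind these figures can be checked with a short sketch. It uses only the two quarterly counts quoted in the conversation (2775 and 716); the variable names are illustrative and are not taken from Jerry's notebook.

```python
from datetime import date

# Quarterly counts as quoted in the episode.
q1_2021 = 2775
q1_2011 = 716

# Number of days in Q1 2021 (January 1 through March 31, inclusive).
days_in_q1 = (date(2021, 4, 1) - date(2021, 1, 1)).days

# Average number of CVEs published per day during the quarter.
per_day = q1_2021 / days_in_q1

# Percentage increase from Q1 2011 to Q1 2021.
pct_increase = (q1_2021 - q1_2011) / q1_2011 * 100

print(f"Days in Q1 2021: {days_in_q1}")
print(f"CVEs per day: {per_day:.1f}")
print(f"Increase from 2011 to 2021: {pct_increase:.0f}%")
```

With these inputs the daily rate comes out just under 31 and the increase at roughly 288%, which lines up with the rounded numbers mentioned on air.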
Dan Mellinger: Man. Okay. 31 per day. So I think you have some other stats as well. What is the most published in one day?
Jerry Gamblin: Yep. That this year is, so far this year, is January 20th, where they released 201 CVEs on that day. That was a big day. The 12th and 13th of January also had 127 and 126 released. So they come in spurts. You see those days. I follow the releases on a Twitter account they have called CVEnew. We'll link to that. And on busy days like that, they just kind of blow up.
Dan Mellinger: Interesting. Well, and it's funny is you see some of the stuff on patch Tuesday, which kind of makes sense. Right? I would think the busiest day of the month will always be a patch Tuesday, but that's not quite the case here. Right?
Jerry Gamblin: Yeah, no. I don't think that happens very often that it's a patch Tuesday. To be honest, with the data that I've kind of dug through, it looks like the 15th of the month seems to be the day that they really publish a bunch of CVEs. And I'm not sure why that is, but that'd be something interesting to dig into and try to pull that out.
Dan Mellinger: Maybe that's when all the CNAs have all the interns inputting the data.
Jerry Gamblin: That is pizza day in the office and all the interns show up.
Dan Mellinger: Oh, that is interesting. So yeah, because I think we're looking at, you have some heat maps. Right? And so we'll have all of this stuff, by the way, on the Kenna Security blog, but so if you want to follow along, they're going to come out live so you can play with Jerry's numbers and all that good stuff as well. But the heat map, that is very interesting. So right dead pretty much in the center of the month, the 15th seem to be this kind of hot zone for CVEs that are issued.
Jerry Gamblin: So in the history of the NVD, January 15th has had 772 CVEs published. February 15th, it has 712. The 12th of March has had 709. And then you get back to the 18th of January, 700. So it's like that's in the, yeah, first quarter. But that's kind of their big publishing days.
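A heat map like the one described boils down to counting publication dates by calendar day across every year. The sketch below shows that counting step with the standard library only. The dates are hypothetical placeholders, not real NVD data, and the actual notebook may well do this differently.

```python
from collections import Counter
from datetime import date

# Hypothetical publication dates spanning several years; real dates
# would come from the NVD data.
published = [
    date(2019, 1, 15),
    date(2020, 1, 15),
    date(2021, 1, 20),
    date(2021, 2, 15),
    date(2021, 3, 12),
]

# Count CVEs per (month, day) pair, ignoring the year. These counts
# are the cells of a month-by-day heat map.
by_month_day = Counter((d.month, d.day) for d in published)

# Count CVEs per day of the month, ignoring both month and year.
by_day_of_month = Counter(d.day for d in published)

busiest_date, busiest_date_count = by_month_day.most_common(1)[0]
busiest_day, busiest_day_count = by_day_of_month.most_common(1)[0]

print(f"Busiest calendar date (month, day): {busiest_date} with {busiest_date_count}")
print(f"Busiest day of the month: {busiest_day} with {busiest_day_count}")
```

Keying on `(month, day)` answers "which calendar date is busiest in the history of the data," while keying on the day alone tests the hunch that the 15th is a hot spot regardless of month.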
Dan Mellinger: So if things continue this way, do you think we're going to break 20 K this year, to go back to your bet?
Jerry Gamblin: I'm guessing. Over the last five years, 15% of all CVEs have been in the first quarter. If the numbers hold, the numbers roll out to about 18,000 CVEs this year, 18,500. It would be a little bit more than last year. It would be about 8%, I think is the total over. But this is just a guesstimate. And I'm the wrong person to do that. I will just be tracking these as we go.
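The projection is a simple scale-up: if the first quarter is a fixed share of the year, divide the first-quarter count by that share. A minimal sketch, assuming the 15% share and the 2775 count quoted in the episode hold exactly (the variable names are illustrative):

```python
# Figures quoted in the episode.
q1_2021 = 2775        # CVEs published in Q1 2021
q1_share = 0.15       # share of a year's CVEs that land in Q1
bet_threshold = 20_000

# Scale the first-quarter count up to a full-year estimate.
projected_total = q1_2021 / q1_share

# Q1 share that would be needed for this quarter's count to imply 20,000.
share_needed_for_bet = q1_2021 / bet_threshold

print(f"Projected 2021 total: {projected_total:,.0f}")
print(f"Clears {bet_threshold:,}? {projected_total > bet_threshold}")
print(f"Q1 share needed to reach {bet_threshold:,}: {share_needed_for_bet:.1%}")
```

The estimate is only as good as the assumed share. A first quarter that turns out to be closer to 14% of the year than 15% would be enough to push the same count toward the 20,000 mark.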
Dan Mellinger: Absolutely. So we do plan on doing this quarterly, so we'll have a little update blog from Jerry that comes out every quarter. We may not do a podcast unless something really big pops up out of it because there might be some quarters that have something interesting we want to talk about.
Jerry Gamblin: I mean, I think the most important part of this is releasing this Jupyter Notebook. And we wanted to talk about that a little bit. Right?
Dan Mellinger: Yes. Yeah. So that's where I was going with this. Talking to you, I think the numbers are one thing. It's a good idea, and it's interesting knowledge, just to see how many CVEs are published over the quarter. Is this bigger or less than last year? Looking at the numbers and the growth of CNAs, it's all cool stuff. I think Jerry is kind of embodying Kenna Security, and was talking to me about how collecting and parsing this data is kind of half the fun. So ultimately, what is a Jupyter Notebook, just to start? Because we're going to embed that on a GitHub and all that stuff, and kind of go over how people can use that.
Jerry Gamblin: So Jupyter is just a visual Python script. Right? And the Python script that we're going to release, it has a bunch of pandas data. Pandas, it uses pandas as part of the library. And we pulled down the data from the NVD. Right? They have it public. And I'm not Michael. Michael is our data scientist, and he always will be. But I am a dork, so I love to kind of play with the data. And to be able to release it and say, "Hey, here's the data sources. Here's what I'm doing." And when you put something like this out on GitHub, someone's going to be able to take this and build off it, or take it and find somewhere I made a mistake, and say, "Oh, here. Change this." Right? I just think that it's interesting to be able to release this kind of security data to a wider audience and kind of let them see how it's done, and maybe spark an interest. This is just an open data set. They can change it to whatever they want. The descriptions are in there, so if somebody wanted to write and just pull out all the Microsoft vulnerabilities, they could expand it that way. But it really goes back to kind of middle school math, where it's a lot of show your work. Right?
Dan Mellinger: Yep.
Jerry Gamblin: You can tell somebody what nine times nine equals, but writing down how you got to it is what the math problem is. And that's kind of why I like building stuff like this. I like putting it out on the internet kind of to get people who are interested, kind of at the fringes of data science, like myself, to give them something to look at and something to build off of.
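As a rough illustration of the kind of notebook cell being described, the sketch below counts CVEs per quarter with pandas and then pulls out the rows whose description mentions a vendor. The column names (`cve_id`, `published`, `description`) and the sample rows are hypothetical stand-ins, not the layout of the NVD data or of Jerry's notebook.

```python
import pandas as pd

# Hypothetical rows standing in for records pulled from the NVD.
df = pd.DataFrame({
    "cve_id": ["CVE-0000-0001", "CVE-0000-0002",
               "CVE-0000-0003", "CVE-0000-0004"],
    "published": pd.to_datetime(
        ["2021-01-12", "2021-02-09", "2021-03-15", "2021-04-01"]
    ),
    "description": [
        "Microsoft Defender remote code execution vulnerability.",
        "Cross-site scripting in an example web application.",
        "Privilege escalation in a Microsoft Windows component.",
        "Buffer overflow in an example media parser.",
    ],
})

# Count CVEs per calendar quarter based on the publication date.
per_quarter = df.groupby(df["published"].dt.to_period("Q")).size()

# Keep only rows whose description mentions the vendor keyword
# (plain substring match, case-insensitive).
microsoft = df[df["description"].str.contains("Microsoft", case=False, regex=False)]

print(per_quarter)
print(microsoft[["cve_id", "description"]])
```

A keyword match on the description is a blunt filter, since it also catches CVEs that merely mention the vendor, but it is the sort of small change a reader could make to an open notebook and then refine.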
Dan Mellinger: Awesome, awesome. Very cool. Well, that makes a ton of sense. And we're going to be linking to it via GitHub. Right?
Jerry Gamblin: Yeah. We'll put it on GitHub. And Google has a hosted Jupyter server called Colab that anybody with a Google account can use for free, so we'll put a little button on there to launch this in there, and they'll be able to pull it up and run it with no problems.
Dan Mellinger: Awesome. So we normally embed the podcast in kind of its own little mini blog. In this case, we're going to embed the podcast player within Jerry's blog, so you'll be able to see all the data if you're listening to it via kennasecurity.com/blog. Go check it out next, well, this should be I think Wednesday's probably when we're going to get this live.
Jerry Gamblin: And if you have any questions, file an issue on GitHub, or send me a tweet, or whatever. I'd love to help make this data set more useful for everyone. That's part of working at Kenna, kind of giving back to the community and being able to help with these kind of open source, kind of fun projects.
Dan Mellinger: Absolutely. Well, and speaking of, I mean, people should be able to also get some (ISC)² CPE credits for listening to this as well. So not only do you get a cool little data set you can play with, but feel free to hit us up if you find better ways to do this, some holes in the work, and/or make something cool. We'd love to hear about it, so feel free to tweet Jerry, feel free to tweet at Kenna Security as well. And then also, on that blog page, you'll be able to go enter your name, email, and (ISC)² member ID to get some credit just for listening to this podcast. So with that, Jerry, any last final thoughts before we hop off?
Jerry Gamblin: Keep submitting those CVEs so I can win this bet with Ed.
Dan Mellinger: Yes, yes. We must stuff CVEs. All right. Well, thank you so much, Jerry. You have a nice day.
Jerry Gamblin: You too.