Dan Mellinger: Today on Security Science, the death of containers? Hello and thank you for joining us. I'm Dan Mellinger and today we're discussing a hot topic in the world of development. Will Docker's download rate limits kill containers as we know them today? Well, I don't have an answer for you, but my guest today sure has some thoughts. So with me today to discuss is Jerry Gamblin. He's Kenna Security's head of security, tinkerer of all things container and proof that there's an xkcd comic for almost any situation. How's it going, Jerry?
Jerry Gamblin: Hey, how are you. That is 100% true. Yep. There's an xkcd for everything at this point.
Dan Mellinger: And just so the audience knows, we will have Jerry's picked xkcd comic for this episode hosted on the podcast page at kennaresearch.com/podcast. So go check it out. Jerry always accompanies his podcast episodes with xkcd comics and we always appreciate it. Jumping into the topic today, so Jerry, I know there's a lot of, I guess, kind of strife in the container registry world today brought on by Docker Hub and them basically trying to force people to pay them, finally. So it looks like they want to make some money.
Jerry Gamblin: Yeah, strife. How dare they stop their free eight terabyte, nine terabyte, free storage for everybody on the internet for the last five years.
Dan Mellinger: Hey, $7 a user per month is really expensive if you're making thousands or millions of dollars off of the applications you developed on their platform.
Jerry Gamblin: Correct.
Dan Mellinger: Okay?
Jerry Gamblin: Yeah.
Dan Mellinger: Okay? Okay. So we'll get into that. But first Jerry, what is a container? How are they used in modern development? Why do people care about this right now?
Jerry Gamblin: A container is a small, repurposable VM, for most people. It holds the files you need to normally do one specific thing. A great way to think about it is a standalone website: you can put all the files you need, including the content and your web server, Nginx, into one container. And that can be your test container that you can use in development and staging and in production. So it allows easy reproduction of that container.
Dan Mellinger: Nice. And you know what? I like this topic because it picks up off of our third-party code dependencies discussion that we had a couple of weeks ago. And essentially, from the way I understand it, containers, Docker Hub, and, well, now AWS, Kubernetes, some of these guys are basically trying to standardize how applications are developed and package up the dependencies into essentially a unit that people can use wherever they want. Right? That's the value add?
Jerry Gamblin: Well, containers are built off of a file called a Containerfile or a Dockerfile. And it's just the rules on how to build it. Like go get this file, build it this way, add these add-ins and then add my content to it. Anybody can build those locally, but what Docker Hub does, and these other registries that we talk about, is allow the creator of the file to bundle those up and push them up to a centralized location so that the individual user doesn't have to build those files all the time. They can just do a docker pull nginx and get a fully functional Nginx container.
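A short name in a pull command stands for a longer image reference. The following is a minimal Python sketch of that expansion, assuming the Docker client's usual defaults (registry `docker.io`, namespace `library` for official images, tag `latest`). The function name `expand_image_ref` is made up for illustration, and digest references (`@sha256:...`) are deliberately not handled.

```python
from typing import NamedTuple

# Defaults the Docker client applies when parts of a reference are omitted.
DEFAULT_REGISTRY = "docker.io"
DEFAULT_NAMESPACE = "library"
DEFAULT_TAG = "latest"


class ImageRef(NamedTuple):
    registry: str
    repository: str
    tag: str


def expand_image_ref(name: str) -> ImageRef:
    """Expand a short image name into registry, repository and tag.

    Hypothetical helper for illustration only; it ignores digests.
    """
    # A colon after the last slash separates the tag; an earlier colon
    # belongs to a registry port such as localhost:5000.
    last_slash = name.rfind("/")
    colon = name.rfind(":")
    if colon > last_slash:
        remainder, tag = name[:colon], name[colon + 1:]
    else:
        remainder, tag = name, DEFAULT_TAG

    first, sep, rest = remainder.partition("/")
    # The first component is treated as a registry host only if it looks
    # like one: it contains a dot or a port, or it is "localhost".
    if sep and ("." in first or ":" in first or first == "localhost"):
        return ImageRef(first, rest, tag)

    repository = remainder
    if "/" not in repository:
        # Single-word names refer to official images.
        repository = f"{DEFAULT_NAMESPACE}/{repository}"
    return ImageRef(DEFAULT_REGISTRY, repository, tag)
```

So `expand_image_ref("nginx")` names the official Nginx image on the default registry, while a reference that starts with another host name goes to that registry instead.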
Dan Mellinger: Got it. So Docker Hub, for example, sets out some standards that you would need to code to and then if you adhere to that stuff and you can upload it and then other people can use it, essentially.
Jerry Gamblin: That's a whole nother issue, right, is the standards. That's where my side project, vulnerablecontainers.org, comes in, right?
Dan Mellinger: We will link to that as well. That's a fun tool.
Jerry Gamblin: There aren't many standards for them. So that's where you're seeing some of this split off from, and actually Docker themselves has recognized this. They partnered with a company called Snyk to do container scanning for them in their Docker tool now. And that's just coming out this year also.
Dan Mellinger: Interesting. Well, hey, take a second to plug vulnerablecontainers.org. What does that do? How'd you build it, real quick?
Jerry Gamblin: So vulnerablecontainers.org is a project I launched last year. It pulls the top 10,000 most downloaded containers off of the Docker Hub registry, and then scans them for vulnerabilities to show how often people are using popular containers with known vulnerabilities. And it's sometimes hundreds or thousands of vulnerabilities inside the container because there aren't rules, or there weren't rules before, on how to stop these containers with vulnerabilities from being on these registries. And there's really still not. And that's a lot of what scares people about the registries splitting and everybody building their own.
Dan Mellinger: Mm. Okay. Well, we'll get into some of the security implications and all that good stuff, but for anyone who wants to go check that out, it's a pretty cool resource. And so we'll host that on the podcast resource page as well. Just getting back to a little fundamental understanding of containers because I have almost zero. What are the benefits of using a container today? Why are they so popular?
Jerry Gamblin: Because they're reproducible. They're the same. So if I build a container and it works on my machine, I can almost 100% guarantee you if you build the exact same container, it's going to run exactly the same. There's an old joke in development. It works on my laptop. And containers kind of break that cycle because it's not dependent on the machine that it's running on. It's dependent on what's inside the container to do the operations. So I know that if I build it on my machine and it works, I can send it to anybody in the world and it will work as designed. Kenna Security actually uses containers internally on an open source project we call, have called, the Toolkit, which allows users to upload data to our platform through our open KTI initiative. And I kind of manage the containers for that. And it's a really handy side project, open source project that Kenna does that people love.
Dan Mellinger: Interesting. So, applications that are developed, they don't necessarily always run 100% on every environment that you have them on. And that's a challenge, right? So what that brought to mind for me was, I know Adobe has done this big push to try to make their creative suite cloud enabled and all that good stuff and make it like iPad pro accessible and all that. And I remember an interview and they're like, yeah, if we were to try to build Photoshop from scratch right now, we couldn't do it. We don't know how it was built. It's been going on for generations, at this point. We've been building on top of this one highly complex application and things wouldn't work the same way. We just couldn't reproduce this kind of software if we wanted to today, in the same way. And that's kind of what containers can solve.
Jerry Gamblin: What you described with Photoshop, that would be what's considered a monolith in development terms. It's just one giant thing. What containers allow you to do and what Kubernetes allows you to do is to break that down into 100 manageable pieces that can be brought in and out at different times to build that one application. And yeah, that's kind of the way it's going. Microservices and containers normally go hand in hand.
Dan Mellinger: Got it. So that's the appeal for modern developers, right, and why containers, in general, have started to become so popular in the development?
Jerry Gamblin: Well, just like Docker wants to make money, containers help companies with this, too. Because since it's the exact same image that comes up every time, building them to scale is super easy. Can't say the bank's name, but there are banks I know that sometimes get down to running three web servers from midnight to 3:00 AM, Monday through Friday, because nobody's checking their bank balance then. But Friday afternoon, when everybody is going to make sure their paycheck was deposited or whatever, containers allow them to instantly scale up to 1000, 1500 web servers. And as that goes down over the night, they can bring that down and save the company money. So instead of having to have 1500 servers available all the time for that peak two or three hours on Friday when everybody's logging in to pay their bills, they can just pay for that usage as they need it and then it can come down overnight.
Dan Mellinger: Interesting. So kind of modular and flexible and it allows them to basically scale to need. And I never thought about that. Yeah, you would need to keep all those server resources for the 15th when everyone gets paid and is transferring money around. So, okay. That's interesting. I know this issue is kind of centering around Docker, right? Could you do a quick overview of the landscape? Like what is Docker? Why is Docker important in this conversation and what are some of the other alternatives, I guess, just so audience has that background?
Jerry Gamblin: Docker started out as an open source project that turned into a VC-funded company. And that was probably seven or eight years ago. And they tried to make a decent run at it and it didn't grow the way I think they expected it to grow and they were acquired by a company and the company decided, "Hey, let's change the model a little bit." So they're starting to do some things with the source code to make it more profitable for them. But the other big part is the Docker registry was just the default registry all the time. So if you had a Docker client installed on your machine and you did docker pull ubuntu, it's programmed by default to go to hub.docker.com and pull down the image from there. And they probably brought in an accountant or whatever, who is looking at their output and said, "Hey, we're spending a ton of money on Docker Hub and we just can't keep giving this away." I looked and they said they had nine petabytes of data and their hosting bill every month was outrageous. So they're doing what business does and said, "Hey, we got to do a chargeback here and try to make some money." So like you said, they're starting to charge between $7 and $5 a developer per month, which is reasonable for the usage. But it breaks a lot of people's workflows because they're saying, "Hey, if you have one IP and you're not paying us, you're not signed in, we're going to limit you to 100 downloads a day." Which, if I'm at my house just doing some basic dev, I'm likely to not hit 100 pulls a day. But if I'm running my workload in the cloud and I'm pulling from there and I'm doing that thing where I size up to 1500 servers on a Friday, I'm going to do a lot of pulls on Friday and I'm going to break it.
So that's what led these companies like GitHub, who built their whole new GitHub Actions around pulling containers from Docker Hub, and Amazon, who has their Elastic Container Service that defaults to Docker Hub, to start saying, "Oh no, this is going to break where we're at. So we've got to figure out what to do." And I don't know what the background business was or whatever. So instead of partnering with Docker Hub, if possible, to get around those limits or pay them, they're doing what everybody on the internet does when something doesn't go their way: they're taking their ball and they're going home and they're going to build their own. So, in the last couple of weeks both Amazon and GitHub have announced that they're going to build their own registries. And that goes along with some of the other popular registries out there. Red Hat has Quay. And there are a few more. So you're going to get to the point where there are four or five super popular container registries on the internet. And that just leads to a lot of problems.
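The arithmetic behind that Friday scale-up can be sketched in a few lines of Python. This is a rough illustration, not Docker's actual accounting: it takes the 100-pulls-a-day figure quoted in the conversation at face value, and it assumes every new server pulls the image exactly once and that all servers appear to the registry as one unauthenticated IP address. Both function names are made up for the example.

```python
# Figure quoted in the conversation; treated here as an assumption.
ANONYMOUS_PULL_LIMIT_PER_DAY = 100


def pulls_for_scale_up(start_servers: int, peak_servers: int,
                       pulls_per_new_server: int = 1) -> int:
    """Pulls needed to grow from start_servers to peak_servers.

    Assumes each newly started server pulls the image once.
    """
    new_servers = max(peak_servers - start_servers, 0)
    return new_servers * pulls_per_new_server


def exceeds_limit(pulls: int,
                  limit: int = ANONYMOUS_PULL_LIMIT_PER_DAY) -> bool:
    """True when the pull count is over the daily limit."""
    return pulls > limit


# Home development: a handful of pulls stays under the limit.
home_pulls = 20
# The bank example: three web servers scaling to 1500.
friday_pulls = pulls_for_scale_up(3, 1500)
```

Under those assumptions the home developer is fine, while the Friday scale-up needs roughly fifteen times the daily allowance.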
Dan Mellinger: Interesting. So these registries are basically just holding and providing a reference point for users to use and download this code?
Jerry Gamblin: For anybody who's been on the internet for a long time, it's the download.com of 2020.
Dan Mellinger: Gotcha. And how did Docker become so popular that now they turn on some paywalls essentially for usage and now everyone's scrambling to create their own? What drove Docker in a position where they were so widely used overall?
Jerry Gamblin: Well, they wrote the software. And then they came up with the registry. So it's just been the default. So since day one, the default Docker registry has been hub.docker.com.
Dan Mellinger: Gotcha. And they've basically been free up until this point?
Jerry Gamblin: Yep. 100% free, no limits. They've had some limits on if you wanted private registries or they tried to do like, hey, if you want to do vulnerability scanning on your images, it costs 10 bucks a month, some add-ons like that, but just base public images and downloads and pulls have always been free on Docker Hub until November 1.
Dan Mellinger: Interesting. Okay. And so they were trying to monetize, it looks like, through some added services, stuff like that. Ultimately, they're like, "Hey, it's costing us a lot of money to store and host all this stuff that you guys are pulling constantly. We're going to start charging." I think, what, it's still free, like you said, if you do under 100 pulls a day. The next tier up is Pro, I believe?
Jerry Gamblin: Which is $7 a month. And then they have a teams option where if you work at a company and you have five developers, it goes to five bucks a month per dev.
Dan Mellinger: Gotcha. So between $5 and $7 for basically anything that receives any sort of volume that would indicate it's being commercially used?
Jerry Gamblin: Yeah. Correct. Yeah. Yep.
Dan Mellinger: Okay. That makes sense. And now you're saying GitHub and AWS are going like, "Okay, we're just going to take this and create our own. So people can just use it. We don't like that people are building applications on AWS and they have to refer to Docker. Now these people might not do that because they don't want to be charged. So we have enough money. We're AWS, right. We're Amazon. We're going to build our own and not have limits on it right now?"
Jerry Gamblin: That's what they're saying. We can't figure this out. We don't want to get around it. You'd hope they had some kind of business discussions, but nobody's privy to those. So it's, "Hey, we're going to build our own container registries."
Dan Mellinger: Oh. Well, that sounds fun. So typically you want to think the internet, more choice, good thing. What are some of the implications of GitHub and Amazon building their own alternatives, especially this late in the game it seems like?
Jerry Gamblin: It's going to be what we like to call copycat or name-squatting issues that are going to come along on the security side. Think of a medium-sized open source project that's fairly popular, like fluentd, which is a logging platform. It really doesn't matter which. But if they're not on top of their game and don't go and get their name on this new Amazon registry, the new GitHub registry and the Docker Hub registry, and somebody else comes in and squats on that, they can easily upload something that looks exactly like the legitimate image, but just slide in some malware and get that distributed. Or, just the other way, you have all of these registries and you have to try to keep them all up to date and you miss something and then people are using vulnerable software. Choice is good until somebody has to actually manage all the choice and figure out a way to keep it all secure.
Dan Mellinger: Makes sense. So that's interesting. So kind of like the early dotcom days where people were just buying every URL, every domain name that they thought could be popular and then...
Jerry Gamblin: Yeah. Or the other way, when a new TLD comes out, right? Like you have .com, but at my last role, .sucks came out, and everybody wanted to make sure that we owned that so that nobody could buy it. So, there's a big scramble to spend the $10,000 for that domain name or whatever. It's that trying to keep up, so if somebody goes somewhere where they're expecting your software to be, and they pull it, they're actually getting your software and not getting a competitor's or somebody's who has something nefarious in mind.
Dan Mellinger: Yeah. And to be clear, I think the main challenge you stated as well was there's a lot of existing open source projects that are on Docker Hub today. And so people know what they are, they reference them, they use them in their code. Maybe they have an AWS backend or they develop on AWS. They hear AWS isn't going to charge them, but someone else can come in and do AWS.popularopensourceproject.com and not be that. Because these are being created now into a market that's relatively defined. People are looking for specific things from Docker. They're going to try to find it in AWS. So it gives an opportunity for someone to go make some copycat, a lookalike, that is more nefarious or nefarious, nefarious?
Jerry Gamblin: Nefarious.
Dan Mellinger: Nefarious.
Jerry Gamblin: Yeah.
Dan Mellinger: On GitHub's or AWS' registries. That's what you're talking about?
Jerry Gamblin: Yep. 100%. That's kind of the long and the short of it, is that's kind of what we're worried about. And just to make sure that it continues to keep security in mind and making sure that these new registries are thinking about security at the forefront as they bring them up. I'm primed for a good registry security tools race, but I don't know if that's going to be what's going to happen. I don't think any of these three companies are particularly security first minded. So, I'm worried that these registries are going to launch with an MVP type of security project, a minimum viable product, not a most valuable.
Dan Mellinger: Yeah.
Jerry Gamblin: Sorry. That was some product talk sliding in there.
Dan Mellinger: Yeah. So MVP means minimal viable product. So the bare minimum that you could launch something and have people not completely pissed off that it's a product that exists in the world basically? So you're saying you could see a situation where GitHub and AWS are launching security services, I guess in this case, right?
Jerry Gamblin: Yeah.
Dan Mellinger: For these container registries that are not really fully baked?
Jerry Gamblin: Yeah. I'm guessing that if you were in roadmap planning for these companies in January, launching their own registry service wasn't on anybody's list. And it's kind of happened over the last year as this has kind of shook out a little bit.
Dan Mellinger: That strikes me as odd overall, as well. You'd think companies like Amazon and GitHub would recognize that kind of dependence and be like," Hey, we might want to come up with some kind of exit strategy should a situation like this happen." Being very reliant on basically one key player is not typically a good strategy overall. So I'm actually surprised that... and it seems like they're all scrambling to create a new service, new registries, new product as a result of this news.
Jerry Gamblin: Yeah. Single points of failure are normally not obvious until they're really obvious. And I think that that was one of the cases here.
Dan Mellinger: Well, yeah. I guess that's a good point, hindsight.
Jerry Gamblin: Yeah.
Dan Mellinger: So what are some best practices for people who might consider migrating from Docker to AWS, to a GitHub, to one, avoid some of these security issues?
Jerry Gamblin: Just make sure you know where your containers are coming from and be super vigilant about that over the next year. Make sure that your devs know about it. Have a good testing program. Don't just buy what these companies say, like, "Oh, we've run a security check and it's safe." Build those checks yourself into your CI/CD platform, before you put something into production, would be my last word on this. Just to make sure you have your own visibility, make sure you understand how you're building your stack, because at the end of the day, you're responsible for it. While it would be terrible if you pulled a nefarious container from somewhere that had malware in it, at the end of the day, it's probably not going to fall on the registry as their fault. It's going to be your fault for pulling it and loading it.
Dan Mellinger: What are some ways to scan and/or determine how vulnerable even popular or maintained containers are? How do you judge the security of said things? Because I know you built Vulnerable Containers as one of the responses to this, as an open toolkit.
Jerry Gamblin: Aqua Security has a tool, an open-source tool called Trivy, that I absolutely love. There are plenty of commercial tools out there. And at Kenna, we've started to support more and more of those. I don't spend a lot of time in the commercial tools, but just anything that will give you visibility into it. But there is Docker Tool Bench, which is open source and free, and Trivy, which are the two that I always point people towards looking at when they ask me, "What kind of tooling should I start with?"
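One way to build "know where your containers are coming from" into a pipeline is an allowlist check that runs before deployment. The Python sketch below is only an illustration under stated assumptions: the trusted prefixes, the `registry.example.com` host and the function name `untrusted_images` are all hypothetical, and image references are assumed to be written out in full, registry included.

```python
from typing import Iterable, List, Tuple

# Hypothetical allowlist: the registry and namespace pairs a team has
# decided to trust. Real values would come from your own policy.
TRUSTED_PREFIXES: Tuple[str, ...] = (
    "docker.io/library/",
    "docker.io/fluent/",
)


def untrusted_images(images: Iterable[str],
                     trusted: Tuple[str, ...] = TRUSTED_PREFIXES) -> List[str]:
    """Return every fully qualified image reference outside the allowlist."""
    return [image for image in images if not image.startswith(tuple(trusted))]


# Example input: the second image has a familiar project name but comes
# from a registry that is not on the allowlist.
images_in_build = [
    "docker.io/library/nginx:1.19",
    "registry.example.com/fluent/fluentd:latest",
]
flagged = untrusted_images(images_in_build)
```

A CI job could fail the build whenever `flagged` is non-empty, so a lookalike image on an unexpected registry gets a human review before it reaches production.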
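As a starting point for that kind of visibility, a scan report can be summarized and used as a gate. The Python sketch below assumes a Trivy JSON report produced with something like `trivy image --format json --output report.json nginx:latest`, and assumes the layout used by recent Trivy releases (a top-level `Results` list whose entries may carry a `Vulnerabilities` list with a `Severity` field). Older releases laid the report out differently, so check the output of your own version. The function names and the sample data are made up for illustration.

```python
import json
from collections import Counter
from typing import Iterable


def load_report(path: str) -> dict:
    """Read a JSON scan report from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def count_by_severity(report: dict) -> Counter:
    """Count findings per severity in a Trivy-style report."""
    counts: Counter = Counter()
    # "Results" or "Vulnerabilities" can be missing or null when a
    # target has no findings, hence the "or []" fallbacks.
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln.get("Severity", "UNKNOWN")] += 1
    return counts


def should_block(counts: Counter,
                 blocking: Iterable[str] = ("CRITICAL", "HIGH")) -> bool:
    """True when any finding has a severity the team refuses to ship."""
    return any(counts[severity] > 0 for severity in blocking)


# Made-up sample data in the assumed layout, not real scan output.
sample_report = {
    "Results": [
        {"Target": "example (debian)", "Vulnerabilities": [
            {"VulnerabilityID": "CVE-0000-0001", "Severity": "HIGH"},
            {"VulnerabilityID": "CVE-0000-0002", "Severity": "LOW"},
        ]},
        {"Target": "app/requirements.txt", "Vulnerabilities": None},
    ]
}
sample_counts = count_by_severity(sample_report)
```

Which severities block a release is a policy choice; the default here simply treats HIGH and CRITICAL as blocking.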
Dan Mellinger: Awesome. That's good to know. And then if you were to identify vulnerabilities in containers, what are some of the ways you can alleviate that? You still want to use the container? You know it has a couple of vulnerabilities you're not comfortable with, what do you do if you still want to use it?
Jerry Gamblin: You can normally patch those in your end container or remove the services. Or a lot of times, sometimes, it's just as simple as going back to the people who maintain the upstream container and just letting them know. Opening a bug or saying like, "Hey, can you patch this?" To be honest, I just did the same thing with GitHub. They have a bug in their latest Super-Linter and it's stopping a deployment for me. So I opened up an issue and their people said, "Hey, we're fixing that." So it's really just knowing where your software comes from, knowing how to get support and knowing how to talk to the people who manage that software and those containers.
Dan Mellinger: Yeah. That's interesting. I never realized how heavy this interpersonal/community element was in application and software development overall. You brought that up in the third-party risk episode, right, some guy in Nebraska maintaining the code, and you hit him up and be like, "Hey, would you mind patching this? Because we'd like to use it."
Jerry Gamblin: Exactly. It really, really works that way in open source quite a bit. And a lot of times, if you can, you can go in and patch it yourself. Like if the container's on GitHub and I know what's wrong, I'll go in and open a pull request and they'll accept it. And then I've fixed the issue, not only for myself, but for everybody else who uses that software.
Dan Mellinger: Awesome. I think this was super helpful. A nice quick overview on containers in general and the security implications of Docker, God forbid, charging money for their services. As we finalize things, where do you see things heading right now? Do you see Amazon and GitHub, for example, being successful in this? Do you see this kind of fragmentation or do you think things will...
Jerry Gamblin: Yeah, that's a whole nother episode, to be honest. So while Docker is still the leading container software on the internet, the Linux Foundation has started a project called Podman, which is an open source version of an open source software. And it's started to ship by default on some Red Hat instances. So Docker might go the way of Kleenex. Everybody says Kleenex, but it's not the Kleenex brand all the way, all the time.
Dan Mellinger: It's tissue paper.
Jerry Gamblin: Yeah. So I think that we're at a tipping point on which way this could go. Docker is still the most popular open source container software, but there's nothing to say that that's going to be the same in six months or a year, if we came back and visited this.
Dan Mellinger: Awesome. Well, it looks like we got another topic we can talk about coming up. Thanks, Jerry. Any final words before we hop off?
Jerry Gamblin: No. Nope. Just be safe and scan your containers.
Dan Mellinger: Scan your containers. Awesome. Well, thanks Jerry. Just for everyone listening, I will link to all the resources Jerry mentioned. So, a nice plug for Aqua Security, Docker Tool Bench. I'll link to Podman and, of course, the xkcd comic. So, thanks everyone. Have a nice day.