Ep 33: Monitoring Docker containers with Checkmk

[0:00:00] Today, we are taking a closer look at our Docker containers.
[0:00:13] Today, containers are everywhere. They are running web services all over the world. They run on different platforms, and they've become quite important to today's IT operations.
[0:00:23] In this video, we'll take a closer look at how Checkmk can monitor your Docker containers running on a typical Docker host. And without further ado, let's dive right in.
[0:00:35] OK, so to quickly summarize what we are going to do: we are going to create a rule that configures the Docker plug-in, then we will bake and update the agents, so our Docker host gets the updated agent with the right configuration.
[0:00:50] We need to do one more manual step on the Docker host, a little installation.
[0:00:54] And as the last optional add-on, we will use the DCD, which is an enterprise feature, and also available in our free edition, to automatically create the Docker containers as hosts in Checkmk. So, let's take a look at how that works.
[0:01:10] For starters, we are looking for the Docker plug-in. That's called Docker node and containers and can be found in the section Agent rules.
[0:01:22] There we create a new rule, and we don't really have to change any of the settings. We want all the information that we can get here, so I'm going to leave all the checkboxes ticked.
[0:01:34] The only thing I'm going to change here for presentation purposes is I'm going to change the hostname that is used for containers to the name of the container.
[0:01:42] This is something you probably don't want to do in production because container names can collide. You can have the same container name twice, and creating a host in Checkmk with the same hostname twice just doesn't work. So, this is something I just want to use to showcase what's happening.
[0:01:58] In a production environment, you might want to use the ID or the long ID.
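To illustrate why container names are risky as host names, here is a minimal Python sketch. The container records, the `host_name_for` and `collisions` helpers, and the mode names are hypothetical illustrations, not Checkmk's implementation. The only thing assumed about Docker is that the short ID is a prefix of the long ID (12 characters here).

```python
from collections import Counter

# Hypothetical containers reported by two different Docker hosts.
# Both hosts happen to run a container called "web".
CONTAINERS = [
    {"docker_host": "node-a", "name": "web", "long_id": "a1" * 32},
    {"docker_host": "node-b", "name": "web", "long_id": "b2" * 32},
    {"docker_host": "node-b", "name": "db", "long_id": "c3" * 32},
]


def host_name_for(container, mode):
    """Derive a monitoring host name from a container record."""
    if mode == "name":
        return container["name"]
    if mode == "short_id":
        return container["long_id"][:12]  # assumed short-ID length
    if mode == "long_id":
        return container["long_id"]
    raise ValueError(f"unknown mode: {mode}")


def collisions(containers, mode):
    """Return the host names that more than one container would claim."""
    counts = Counter(host_name_for(c, mode) for c in containers)
    return sorted(name for name, n in counts.items() if n > 1)


print(collisions(CONTAINERS, "name"))     # ['web']
print(collisions(CONTAINERS, "long_id"))  # []
```

Naming by container name produces a clash for "web", while naming by ID does not, which is the trade-off described above.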
[0:02:07] So, now, as I only have one host here in my demo environment, I'm going to select it manually. In a production environment, you would probably have a host tag that says that the host is a Docker host, and then you would automatically assign the Docker plug-in to all the hosts that carry that tag.
[0:02:24] So, I'm going to save this. I'm going to quickly activate the changes. And now we are going to bake the agents.
[0:02:34] And in this environment, I already prepared automatic agent updates, which means after the agents have been baked and signed, it only takes a few moments for the agent to update itself and to run the Docker plug-in and then we would get the Docker services there.
[0:02:51] But there's one additional step that needs to be taken. So, let's go to the command line and I can show you what has to be done.
[0:03:00] Okay, so now we are looking at our Docker host. And what we need to do now is to install the Docker library for Python, so our agent is capable of talking to the Docker daemon on the system. 
[0:03:11] And that's quite easily done by running pip3 install docker. Once we see "Successfully installed", that's really all we need to do initially on the Docker host.
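If you want to confirm that the freshly installed library can actually reach the Docker daemon, a small check like the following can help. This is only a sketch and not part of the Checkmk agent plug-in: the function names `summarize_containers` and `main` are made up for this example, and it assumes the user running it is allowed to access the Docker socket.

```python
def summarize_containers(client):
    """Return (name, short_id, status) for every container, running or not."""
    return [(c.name, c.short_id, c.status) for c in client.containers.list(all=True)]


def main():
    try:
        import docker  # provided by `pip3 install docker`
    except ImportError:
        print("The Docker library for Python is not installed.")
        return False
    try:
        client = docker.from_env()  # honours DOCKER_HOST, else the local default
        client.ping()               # raises if the daemon cannot be reached
    except Exception as exc:        # e.g. daemon not running, permission denied
        print(f"Cannot talk to the Docker daemon: {exc}")
        return False
    for name, short_id, status in summarize_containers(client):
        print(f"{name}\t{short_id}\t{status}")
    return True


if __name__ == "__main__":
    main()
```

Run it on the Docker host with python3. If it lists your containers, the library and the daemon connection are working.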
[0:03:27] As said, the agent will update itself, so we can go back to our Checkmk instance. We look for our host and go to the Service configuration.
[0:03:44] And after some moments, if we do a rescan, we will find the Docker services. All right, there are our services. Now we can see the containers running on the server, the build cache, and the storage used for containers, for images, and for Docker volumes.
[0:04:05] And we get a service that tells us about the general state of the Docker node, whether everything is healthy, and on which host it is running. So, I'm just going to hit accept all. That adds the services to our host.
[0:04:20] And now I'm going to activate the changes, so they actually become active and will be monitored. So, now we can take another look. It will take a few moments for the information to be fetched and we already saw that in the preview.
[0:04:38] So, there's two things which I want to look at right now. One thing is the DCD configuration, I mentioned that earlier, it's an enterprise feature.
[0:04:50] The DCD is the dynamic configuration daemon; in the Setup menu, it's just called Dynamic host management.
[0:04:59] And what that does is it takes the information that the Docker host provides about the containers and creates hosts for the containers in Checkmk itself. So, let's take a look at how that is done.
[0:05:13] I'm going to add a connection here. I'm just going to give it a descriptive name. I want to go with Docker because that's quite good enough. I'm not going to explain everything in detail.
[0:05:24] We're using piggyback data here. We want to add a new element and this tells us where the hosts will be created and how they will be configured.
[0:05:33] So, in this case, it's a very simple site. I'm simply creating the host in my main directory because it really doesn't matter here. Of course, in a production environment, you would do that in a dedicated folder.
[0:05:45] And we can use this default configuration, which really only tells Checkmk not to try to query an agent, make SNMP calls, or even ping the container.
[0:05:54] Because obviously, not every container is accessible through the IP network. So, in this case, we only use the data that is coming from the Docker host to evaluate the state of the container.
[0:06:07] We want to automatically delete hosts without piggyback data. That means if a container gets deleted, that's probably something that you intended to do, so you want it removed from the monitoring. That's this option.
[0:06:20] We could also restrict which hosts are added. If your containers follow a certain naming concept, for example, it would be possible to filter here and to only add certain containers, but we don't want to use that here.
[0:06:34] Of course, we want to do a service discovery during creation. That means we not only get the host, but we also get all the services that are relevant for the container on the container.
[0:06:45] And the last thing we want to do which is quite important, we want to restrict the source hosts. Because otherwise Checkmk would use all piggyback data that's coming to the Checkmk server. And it wouldn't matter if it's a container host, if it's a vSphere server, anything, really.
[0:07:00] So, here I want to make sure I restrict this to localhost, which is our Docker host, in this example. So, only that data gets used, and we can leave the rest of the options at default.
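The effect of the two options "restrict source hosts" and "delete hosts without piggyback data" can be pictured with a small Python sketch. This is a conceptual model only, not how the DCD is implemented. The function `plan_hosts`, its parameters, and the sample host names (other than localhost) are hypothetical.

```python
def plan_hosts(piggyback, existing, allowed_sources, delete_vanished=True):
    """Decide which hosts to create and which to delete.

    piggyback:       dict mapping a source host to the set of host names
                     it delivers piggyback data for
    existing:        set of host names this connection created earlier
    allowed_sources: only piggyback data from these source hosts is used
    """
    wanted = set()
    for source, hosts in piggyback.items():
        if source in allowed_sources:
            wanted |= set(hosts)
    to_create = sorted(wanted - existing)
    to_delete = sorted(existing - wanted) if delete_vanished else []
    return to_create, to_delete


piggyback = {
    "localhost": {"web", "db"},  # our Docker host
    "vsphere01": {"vm-17"},      # unrelated piggyback source, ignored below
}
existing = {"db", "old-job"}     # "old-job" no longer has piggyback data

print(plan_hosts(piggyback, existing, allowed_sources={"localhost"}))
# (['web'], ['old-job'])
```

Without the source restriction, "vm-17" would also be picked up by this connection, which is exactly what the restriction is meant to prevent.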
[0:07:15] So, now I save the configuration and I quickly apply the change. And if everything worked, we see this. So, the DCD actually ran directly after I activated the changes there. 
[0:07:29] And it already created our 2 container hosts here. And if we refresh this page, the DCD also automatically enables these changes.
[0:07:38] So, now we already have our Docker containers monitored in Checkmk. So, now let's take a look at what that looks like.
[0:07:48] For Docker, we actually have 2 built-in dashboards. One is the Docker nodes, which refers to the Docker servers. If we take a look there, it's nothing fancy. But it's an easy list to see which hosts are running Docker containers.
[0:08:03] And we also see here the status, the containers, how many are running, paused, stopped, stuff like that. The second built-in dashboard shows us the Docker containers, and there we can see all the containers that we have and the state of their services. And if the data is there, the CPU utilization, the memory used, and the uptime of those containers.
[0:08:27] And if we take a closer look at one of those, we see the services that are created for the container, which are quite basic services: CPU utilization and memory.
[0:08:37] The most important part really here is the Docker container status because that status is also used to determine the host state of the container.
[0:08:44] So, if the container wasn't running, if it was stopped, that would be used to determine that the container is down in Checkmk.
[0:08:54] All right, that's it for Docker container monitoring. It's quite straightforward, and of course, it scales to hundreds of hosts and thousands of containers. And you saw it was really easy to get it up and running.
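As a rough illustration of that idea, the following Python sketch derives a host state from a container status and counts containers per status, similar to what the dashboards show. The mapping is an assumption for this example: only "running" is treated as up, and how Checkmk treats states such as "paused" is not covered in the video. The status names are the ones Docker reports; the container names are made up.

```python
from collections import Counter

# Container states as reported by Docker.
DOCKER_STATUSES = (
    "created", "restarting", "running", "removing", "paused", "exited", "dead",
)


def host_state(container_status):
    """Map a container status to a host state (assumption: only 'running' is UP)."""
    if container_status not in DOCKER_STATUSES:
        raise ValueError(f"unknown container status: {container_status}")
    return "UP" if container_status == "running" else "DOWN"


# Hypothetical containers and their current status.
containers = {"web": "running", "db": "exited", "batch": "paused"}

per_status = Counter(containers.values())
print(per_status["running"], per_status["paused"], per_status["exited"])  # 1 1 1
print({name: host_state(status) for name, status in containers.items()})
# {'web': 'UP', 'db': 'DOWN', 'batch': 'DOWN'}
```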
[0:08:54] So, with that, I'm going to conclude the video. Thank you very much for watching. Make sure to subscribe, so you never miss a future video and I will see you around.
