Ep. 32: Working with the Agent bakery in Checkmk


[0:00:00] Welcome to the Checkmk channel. Today, we are taking a closer look at the Agent bakery.
[0:00:15] The agent bakery is an enterprise feature which is also available in our free edition and it will take care of all your agent packages.
[0:00:23] Specifically, it will create agent packages tailored to the monitored system, so they have all the plugins and configuration that you need on that specific monitored system.
[0:00:34] What that means is, for example, if you are using a host tag to identify a server as a MySQL server, then you can create a rule based on this host tag that will enable the MySQL plugin for all hosts with that host tag.
[0:00:49] The agent bakery will then pick up this configuration and automatically create an installation package which you can then install on your Linux system, and which will have the plug-in and the relevant configuration baked into it, so you don't need to take care of any further configuration there.
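To make the rule idea concrete, here is a minimal sketch in Python. It is not Checkmk code: the `PluginRule` class, the `plugins_for_host` function, and the tag and plugin names in the sample data are all assumptions made up for illustration. It only shows the principle that a rule conditioned on a host tag decides which plugins end up in a host's package.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PluginRule:
    """Hypothetical rule: deploy `plugin` to every host carrying `required_tag`."""
    required_tag: str
    plugin: str


def plugins_for_host(host_tags: set[str], rules: list[PluginRule]) -> list[str]:
    """Return the sorted plugins whose rule condition matches the host's tags."""
    return sorted({rule.plugin for rule in rules if rule.required_tag in host_tags})


# Illustrative data only: tags, hosts, and plugin names are assumptions.
rules = [
    PluginRule(required_tag="mysql", plugin="mk_mysql"),
    PluginRule(required_tag="apache", plugin="apache_status"),
]
hosts = {
    "db01": {"linux", "mysql"},
    "web01": {"linux", "apache"},
    "app01": {"linux"},
}

for name, tags in hosts.items():
    # Each host gets only the plugins that its tags call for.
    print(name, plugins_for_host(tags, rules))
```

A host without a matching tag simply gets an empty plugin list, which corresponds to the plain default agent.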
[0:01:06] And this works for all the plug-ins that Checkmk delivers. So, it's your go-to place to create agent packages tailored to your systems. Of course, if you have to take these packages and install them manually on hundreds or thousands of hosts, that's no fun, really, so we have the automatic agent updates.
[0:01:27] This works in a way that we have a plugin that will be baked into the agent and that will pull updates from the Checkmk server whenever there is a new software version or a change in configuration available.
[0:01:40] This is highly configurable but the general idea is that the agents keep themselves updated, so you don't have to take care of them, and it doesn't matter if you have tens or hundreds or thousands of hosts doing this.
[0:01:52] Of course, you also need to think about security in this regard because if you are installing software on hundreds and thousands of servers, you want to be sure that no arbitrary code gets installed.
[0:02:04] So, what we do is we cryptographically sign the agent installation packages. And when the agent updater plugin pulls these installation packages, it verifies this cryptographic signature and makes sure that it trusts the key and that the signature is valid before it actually installs the package.
[0:02:22] So, if someone tried to inject some arbitrary software there, it wouldn't be installed. Additionally, we strongly recommend performing the agent updates only over HTTPS, so that the transport layer is encrypted as well.
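The verify-before-install gate described here can be sketched in a few lines of Python. This is an illustration under loud assumptions, not the updater's real implementation: HMAC-SHA256 from the standard library stands in for the actual public-key signature (a real setup verifies with a public key and never ships a secret to the client), and the names `sign`, `should_install`, and `trusted_keys` are hypothetical.

```python
import hashlib
import hmac


def sign(package: bytes, key: bytes) -> bytes:
    """Stand-in for signing: HMAC-SHA256 instead of a real asymmetric signature."""
    return hmac.new(key, package, hashlib.sha256).digest()


def should_install(
    package: bytes,
    signature: bytes,
    key_id: str,
    trusted_keys: dict[str, bytes],
) -> bool:
    """Install only if the key is trusted AND the signature matches the package."""
    key = trusted_keys.get(key_id)
    if key is None:
        # Unknown key: refuse, no matter what the signature looks like.
        return False
    # Constant-time comparison of the expected and the received signature.
    return hmac.compare_digest(sign(package, key), signature)


# Illustrative data only.
trusted_keys = {"main_key": b"demo-secret"}
package = b"agent package bytes"
signature = sign(package, trusted_keys["main_key"])

print(should_install(package, signature, "main_key", trusted_keys))
print(should_install(b"tampered bytes", signature, "main_key", trusted_keys))
```

The point of the sketch is the order of checks: first trust in the key, then validity of the signature, and only then installation.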
[0:02:39] All right, so much for the theory. Now let's take a look at how we can configure this. So, the agent bakery can, of course, be found in our Setup menu. And it's actually never called agent bakery within Checkmk, it can be found in this section here.
[0:02:55] So, if we go there, we see the default configuration because this is a fresh site. There are no changes in here, and we can see we could get installation packages for all the different platforms that we support.
[0:03:07] We see here which hosts will get this agent and we will see a hash which is the unique identifier of this agent configuration. So, now that's really everything there is to see in the pure bakery itself.
[0:03:21] If we change the configuration within the site, then we will get different agent packages and we will see that in a few moments after we configured the updater rule. So, let's get started with the automatic updates.
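One way to picture the hash as an identifier of an agent configuration is the following Python sketch. How Checkmk actually computes its hash is not covered here, so the approach (SHA-256 over canonical JSON, shortened to 16 characters) and the names `config_hash` and `group_hosts_by_config` are assumptions for illustration only.

```python
import hashlib
import json


def config_hash(config: dict) -> str:
    """Derive a short, stable identifier from a configuration dictionary."""
    # Sorting keys makes the hash independent of dictionary ordering.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


def group_hosts_by_config(host_configs: dict[str, dict]) -> dict[str, list[str]]:
    """Group hosts that share an identical configuration under one hash."""
    groups: dict[str, list[str]] = {}
    for host, config in host_configs.items():
        groups.setdefault(config_hash(config), []).append(host)
    return groups


# Illustrative data only: two hosts share a configuration, one differs.
host_configs = {
    "localhost": {"updater": {"interval": 60}},
    "web01": {"updater": {"interval": 60}},
    "db01": {"updater": {"interval": 60}, "plugins": ["mk_mysql"]},
}

for digest, members in group_hosts_by_config(host_configs).items():
    print(digest, members)
```

Hosts with the same effective configuration share one package, and any change to the configuration yields a new hash and therefore a new package.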
[0:03:34] We find the relevant configuration under Agents, and there at the bottom we have automatic updates. When we move there, we can see a list of prerequisites that we have to meet before we can actually enable automatic updates. So, let's get started by creating a signature key.
[0:03:55] We can simply add a key here, give it a description. That's the main_key. We need to give it a passphrase. I'm using a stored one here and we, say, create. And this is your typical public-private key pair. So, we simply have a public and a private key.
[0:04:15] The private key is protected by a passphrase and that, as I said, protects your automatic updates from malicious interaction. Because only people that know the passphrase will be able to actually sign agents and roll them out.
[0:04:33] So, now let's get back to the automatic updates overview. There we can see the next step is the configuration of the update plugin. So, let's go ahead and do that.
[0:04:46] This is also rule based, so you could also create several rules for several use cases, for example. Or you could create several rules that only configure parts of the agent updater, that would also be possible.
[0:04:58] But to get started we are just going to create one global rule here. We want to enable that plugin. We need to provide some information about the update server. And the rest of the information is already filled in, so the site name, and in this case, HTTPS.
[0:05:18] Now we need to add the certificate for HTTPS verification. This is necessary if your operating system doesn't know about the root certificate of your PKI, for example, or about the certificate you're using on the Checkmk server.
[0:05:35] Or if you're using a self-signed certificate, especially in that case, you definitely have to provide the certificate here. If it's a certificate from a public CA, for example, or the root certificate of your PKI is known to the operating system, then you can skip this step and you don't need to provide that information there.
[0:05:58] So, here we add our certificate. I already copied it to the clipboard, so I can simply paste it. Now we need to decide on an interval for the update check. The default is 1 hour. For this demo, I'm gonna set it to 1 minute, so we quickly see the updates.
[0:06:10] When you're rolling out your agents initially, something between 3 and 5 minutes would make sense here, so the updates don't take too long.
[0:06:23] For a production environment where the agents are rolled out and up and running and you don't have that many changes in your configuration, it might make sense to go with 1 hour, or maybe 6 hours or 12 hours, depending on how quickly you need your configuration of the agents updated.
[0:06:39] That's really up to you to decide what you want to take there. It shouldn't be too small because that just creates unnecessary load on the Checkmk server.
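A quick back-of-the-envelope calculation shows why a very small interval creates load. The Python sketch below assumes that every host sends exactly one update check per interval and that the checks are spread evenly over time. Both are simplifications, and the function name is hypothetical.

```python
def update_checks_per_second(host_count: int, interval_seconds: int) -> float:
    """Average update checks per second hitting the server (even spread assumed)."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return host_count / interval_seconds


# Compare the demo interval, a rollout interval, and the 1-hour default
# for an assumed fleet of 1000 hosts.
for label, seconds in [("1 minute", 60), ("5 minutes", 300), ("1 hour", 3600)]:
    rate = update_checks_per_second(1000, seconds)
    print(f"{label}: {rate:.2f} checks per second")
```

Going from 1 hour to 1 minute multiplies the request rate by 60, which is why the short interval is only sensible for a demo or during the initial rollout.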
[0:06:48] If you have a proxy between your monitored systems and the Checkmk server, you could configure that here. And last but not least, we have the signature keys that the agent will accept, as I explained earlier.
[0:07:01] And here we have the main key, so I'm going to choose this. And with that, we are done with the configuration. I'm not going to use any conditions here because this is my global rule.
[0:07:11] So, I'm going to save this here. And now we see the overview of what we configured. That's fine for the moment. Let's quickly activate those changes and then we can go ahead and bake our agents for the first time.
[0:07:26] So, here we are. You actually can see two buttons lighting up. Because Checkmk realized that we changed a piece of configuration, specifically the updater rule, and now it tells us, hey, we need to bake the agents or to bake new agents because the configuration changed and some hosts will get a new agent.
[0:07:46] So, let's go ahead with bake and sign. I'm using my stored passphrase here, click on bake and sign. And now new agent packages are created in the background.
[0:07:56] That happens quite quickly. Depending on the size of your installation and the number of hosts, this process will take some seconds, but it's quite efficient and shouldn't take too long.
[0:08:08] So, now we can see a configuration was created. It has the agent updater rule in it with all the configuration. And we see our localhost should get this agent. 
[0:08:18] So, let's just go ahead and install the agent on our localhost on this Checkmk server. What we could do is we could download the package here to our client and upload it again to the server. That would be the manual approach and what you typically would do.
[0:08:33] In my case, because this is the Checkmk server itself, I can simply switch over to the command line and install the agent directly from the site. And now on the Checkmk server, I can simply do a dpkg -i and I will go to my site. So, omd/sites. This is the site. /var/check_mk/agents/. 
[0:09:01] There we have all the agent packages which you already saw on the user interface. As this is an Ubuntu system, I want to have the Linux Debian package.
[0:09:10] There is the packages folder. And if we take a look here, there are 2 agent configurations, we just saw those. 
[0:09:16] So, I'm going to use the one that was created specifically with the agent updater and just go ahead and install it. So, now we see the installation succeeded.
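The installation step just shown can be sketched as a small Python helper that assembles the dpkg call. The site name and the package sub-path below are placeholders, not real values: the actual sub-directory and file name depend on your site and on the baked configuration, so look them up under the agents directory of your own site. The function name `build_install_command` is hypothetical.

```python
from pathlib import PurePosixPath


def build_install_command(site: str, package_subpath: str) -> list[str]:
    """Build the dpkg call for a baked package below the site's agents directory."""
    # Directory as described in the video: <site>/var/check_mk/agents/
    agents_dir = PurePosixPath("/omd/sites") / site / "var/check_mk/agents"
    # package_subpath must be relative; an absolute path would replace agents_dir.
    return ["dpkg", "-i", str(agents_dir / package_subpath)]


# Placeholders only: replace both values with what exists on your server.
command = build_install_command("mysite", "SUBDIRECTORY/PACKAGE_FILE.deb")
print(" ".join(command))
```

To actually install, the resulting list could be handed to `subprocess.run(command, check=True)` by a user with root privileges. The sketch only prints the command so that nothing is changed on the system.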
[0:09:27] The agent is installed. And now there are only 2 steps that we need to take before automatic updates actually work.
[0:09:34] The first thing is we need to register the agent with the Checkmk server, which makes the agent known to the Checkmk server and they can agree on a secret to communicate.
[0:09:45] So, that's what we're going to do here right now. For that, we need the cmk-update-agent, we need the register keyword, and we need to provide the host name as it appears in Checkmk.
[0:09:59] So, if you're using FQDNs, for example, or however you name them, uppercase or lowercase, the important thing here is that the name after the -H has to match the name in Checkmk.
[0:10:10] In my case, that's localhost, so that's fine. And now I only need to provide a user that is allowed to register this host.
[0:10:18] In this environment, I'm just using the cmkadmin. You might have a specific user for that in your environment. But this is the bare minimum that I need to provide here. And now I can hit enter.
[0:10:29] I'm asked for the password of the user. And if I provide that, I can see I successfully registered for agent deployment, which is great. Now from a client side we are done.
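The registration call can be written down as a short Python sketch that builds the command line. The -H option for the host name is the one mentioned in the video. The -U option for the user is an assumption based on how the tool is commonly invoked, so verify it against the help output of cmk-update-agent on your system. The helper names are hypothetical, and the password is deliberately left out so that the tool prompts for it.

```python
def names_match(name_on_cli: str, name_in_checkmk: str) -> bool:
    """The host name must match exactly, including upper and lower case."""
    return name_on_cli == name_in_checkmk


def build_register_command(host_name: str, user: str) -> list[str]:
    """Build the registration call; the password is prompted for interactively."""
    # "-U" is assumed here; "-H" is the option shown in the video.
    return ["cmk-update-agent", "register", "-H", host_name, "-U", user]


# Values from the demo environment in the video.
command = build_register_command("localhost", "cmkadmin")
print(" ".join(command))

# A differently cased name would not be the same host as far as Checkmk is concerned.
print(names_match("Localhost", "localhost"))
```

In a larger environment, a dedicated user with only the permission to register hosts would take the place of cmkadmin.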
[0:10:40] And now there's only one last step that we need to take from the user interface. So, if we go back here, we go to automatic updates one more time, and there we can see nearly all steps have been done. So, we baked and signed agents.
[0:10:58] We just registered one agent for automatic updates. And now we only need to enable the master switch, which, as long as it is off, prevents Checkmk from doing updates altogether.
[0:11:07] And now they are enabled, so all prerequisites are met. And now we can actually do automatic updates. So, if I switch back now to the command line to speed up the process and simply call the agent updater with the verbose flag, then we can see the agent updater talks to the Checkmk server, fetches the fitting agent package for itself, and updates the agent.
[0:11:26] So, that's what we see as a result here. Moving back one more time to the user interface on the automatic update page, we can now go to a report.

So, there are actually several reports here about the state of agent updates. So, if there are issues, errors, something that's not right, you will become aware of it in this view.

[0:11:52] And there's a view on the bottom that shows all the hosts that are registered, no matter which state they are. So, let's take a look here.
[0:12:00] And there we can see, when we remove the filter, this is our host, here is our localhost. We see the state of the host as the first item of information here.
[0:12:12] We see when it last contacted the server, and when the last download was, which is the same at this point because we just ran the command. We see the last error that the agent encountered during installation. We see the status of the agent, so which hash should be installed, and which hash is installed.
[0:12:29] There is a service that's checking for the update status and will tell us in the monitoring if something is wrong. And we get the update check output which is some more details on the update process.  
[0:12:42] So, as you saw, the view just updated. Now we can see that the target agent is the same as the installed agent. That's great.
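The comparison behind this view is simple enough to sketch in Python: a host is up to date when the installed hash equals the target hash. The function name, the state labels, and the sample hash values are made up for illustration and do not reflect how Checkmk names these states internally.

```python
from typing import Optional


def agent_update_state(target_hash: str, installed_hash: Optional[str]) -> str:
    """Classify a host by comparing the hash it should have with the one it has."""
    if installed_hash is None:
        return "no agent reported yet"
    if installed_hash == target_hash:
        return "up to date"
    return "update pending"


# Illustrative hash values only.
print(agent_update_state("a1b2c3", "a1b2c3"))
print(agent_update_state("a1b2c3", "ffee00"))
print(agent_update_state("a1b2c3", None))
```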
[0:12:49] But we still see the last error here and we see a warning here that indicates there was an error. The reason is that the update check in Checkmk is asynchronous and it will take a few moments for the Checkmk server side to realize that the state actually is okay.
[0:13:03] So, if you see issues here right after registering your host, just grab a coffee, for example, wait for 5 minutes, and then you will see that everything has been cleared.
[0:13:14] If you are still not in a fully OK state after that, then there might be something wrong and you would want to investigate. So, now we see the last error has vanished.
[0:13:24] Just as I said, everything is up to date now and the Checkmk server tells you that. 
[0:13:29] So yeah, this concludes the video for today. Thank you so much for watching. Be sure to subscribe and I will see you around.
