Ep. 7 (part 2): Smart rules with Host Tags in Checkmk


[0:00:00] Hello and welcome back to the second part of our trilogy about rules and service parameters. In the first part, we talked about working with rules and setting thresholds. And in this second part, I will take a look at host tags and labels, and how you can use them to create rules for very specific groups of hosts. 
[0:00:19] For example, to target the CPU load of all database servers in a production environment.
[0:00:36] Before we can create these specific rules, we first have to be able to indicate whether a host is, for example, in a production or a test environment. And we do that by assigning it host tags or labels.
[0:00:47] You will learn the difference between the two in this video. But for now, let's focus on host tags.
[0:00:53] So the idea is that when we are creating a host, we immediately assign it certain tags or labels.
[0:01:02] For example, whether it's a test or a production system. Later on, when we create rules, we can say: okay, only apply this rule when the host has a certain tag, for example the one for the test environment. Let me show you how that works.
[0:01:16] As you can see I've added four more hosts to our system.
[0:01:21] These are just dummy hosts for illustration purposes. For now, let's go to Setup > Hosts. As you can see, they all have the same IP address, but as I said, they're for illustration purposes, so that doesn't really matter.
[0:01:36] Now let's edit the first one, 'dbserver1'. We're only interested in this custom attributes section, so let's collapse the rest.
[0:01:47] So as you can see, there are some host tags pre-configured already by Checkmk. For example, this one, Criticality, and you see that there are also some values already pre-configured, like 'Productive system' and 'Test system'.
[0:02:04] We will pick 'Productive system' for now. Every host tag consists of a tag group, which is this first part, and the second part on the right, the value. Now let's save and go back to the folder.
[0:02:21] Now let's do the same for the second one. We're going to make that a productive system as well. As you can see, 'Productive system' is the default value, but we're going to set it explicitly.
[0:02:35] So save and go back to the folder. The other two, 'dbserver3' and 'dbserver4', we want to make test systems. Instead of repeating myself, I will just select both of them and go to Hosts > Edit attributes.
[0:02:52] Now I will pick Criticality, set it to 'Test system', and save. And I've changed both at the same time. Okay, for now, we can just activate the changes.
[0:03:09] And now these tags have been applied to all our database hosts, and we can define some rules based on these host tags. The next thing we're going to do is create a rule, or rather multiple rules, because I don't really care if the CPU utilization on the test systems reaches 100%.
[0:03:30] But on the production systems, I want to get warned when they go over 80%, so I want to leave some headroom. Now let's go to one of our hosts. So, Monitor > All hosts. And let's pick 'dbserver1'.
[0:03:45] We're interested in the CPU utilization, so let's go to the parameters for this service. As you can see, there are no rules set yet, and we're going to create one now. So we are going to create a host-specific rule.
[0:04:05] Under conditions we can remove the explicit host, because we are not interested specifically in dbserver1 but rather in host tags. So we're going to add a host tag, namely the Criticality that we just set, and then we create a condition by clicking on the button 'Add tag condition'. And now it says Criticality is 'Productive system'.
[0:04:36] You can pick another value here, but I'll leave it at 'Productive system' for now. And now we can set the actual values, so under Value we are going to pick levels on total CPU utilization.
[0:04:55] We're picking fixed levels, and we're going to change this to Warning at 80% and Critical at 90%. Now we can save it and create a rule for our test systems. The easiest way to do that is by duplicating this one, and then instead of 80% and 90% we are going to set it to 101% and 101%. Both are levels which cannot be reached, so we should never see a Warning or a Critical state.
[0:05:39] Then the only thing left to do is set this to 'Test system', and then we can save the changes. Now you can see we have 2 rules here. This one is green because we're still working on dbserver1, which is a productive system, so these conditions match.
[0:06:00] And now we can just simply activate the changes. And now we are using our host tags in combination with Service rules.
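To make the logic of these two rules concrete, here is a minimal Python sketch of how a rule with a tag condition could be matched against a host. This is only an illustration and not Checkmk's actual implementation: the names `Rule`, `matching_rule` and `state`, and the tag IDs `prod` and `test`, are assumptions made up for this example. It also assumes that a level counts as reached when the value is at or above the threshold.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    # Tag conditions: tag group ID -> required tag ID (hypothetical IDs).
    conditions: dict
    warn: float  # warning level in percent
    crit: float  # critical level in percent


def matching_rule(host_tags, rules):
    """Return the first rule whose tag conditions all match the host, or None."""
    for rule in rules:
        if all(host_tags.get(group) == value
               for group, value in rule.conditions.items()):
            return rule
    return None


def state(utilization, rule):
    """Map a CPU utilization in percent to a state, given the rule's levels."""
    if rule is None:
        return "OK"  # no rule matched, so no levels are applied in this sketch
    if utilization >= rule.crit:
        return "CRIT"
    if utilization >= rule.warn:
        return "WARN"
    return "OK"


# The two rules from the video: 80/90 for productive systems,
# and unreachable levels of 101/101 for test systems.
RULES = [
    Rule({"criticality": "prod"}, warn=80.0, crit=90.0),
    Rule({"criticality": "test"}, warn=101.0, crit=101.0),
]

dbserver1 = {"criticality": "prod"}
dbserver3 = {"criticality": "test"}

print(state(85.0, matching_rule(dbserver1, RULES)))
print(state(100.0, matching_rule(dbserver3, RULES)))
```

Because the test rule uses levels above 100%, a test host stays OK in this sketch even at full utilization, which is the effect described above.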
[0:06:15] Like I said before, these tag groups like Criticality are more of a pre-configured example in Checkmk. So we recommend that you do not use these but rather create your own, and that's what I'm going to show you next.
[0:06:29] Under Setup, you can find an item called Tags, and here you can find all pre-configured tag groups. Some are built-in, like these 4, but these 2 are more of an example, and that's why you can also delete them. I'm going to add a new tag group.
[0:06:48] Every tag group has a required ID and a title. The ID has to be unique and cannot contain any spaces; the title can be whatever you want it to be.
[0:06:58] What we want to do is create a tag group that tells what kind of application the host runs. So, for example, is it a web server, a database server, or something else. It doesn't cover everything, but it will illustrate what we want to showcase. Okay, so now for the tag group ID we're going to use 'application' (lowercase) and for the title also 'Application' (but with a capital A).
[0:07:31] Now the topic: this determines in which section the tag will appear when you're configuring a host. So I will create my own, 'MyCustomTags'. That way I can easily tell my own custom tags apart from all other tags.
[0:07:55] Now every tag or tag group has one or multiple tag choices. And these are the values which you can pick from when you are configuring a host.
[0:08:06] So every tag also has a tag ID, and these have to be unique within this tag group. I will make three choices: one will be 'web', one will be 'db' for 'database', and one will be 'other'.
[0:08:29] Now the titles will be 'Web Server', 'Database Server' and once again 'Other'. The auxiliary tags we can forget for now.
[0:08:46] You can change the order simply by dragging these up and down, but let's save. So now every host will get a default value for the application tag that we just created.
[0:09:03] And you see here that it would be 'Web Server'. That would mean that every host would be classified as a 'Web Server', so in our case it's probably best to set that to 'Other'. So let's quickly change that. Here we're going to drag 'Other' to the top, and save.
[0:09:23] And now every host will be classified as 'Other', and only if we explicitly set it to something else will it be a 'Web Server' or 'Database Server'.
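The behaviour described here, where the first choice in the list acts as the default for every host that has no explicit value, can be sketched in a few lines of Python. This is a simplified model and not Checkmk code: the class `TagGroup` and the helper `effective_tag` are hypothetical names for this illustration. The checks mirror the constraints mentioned above (no spaces in the group ID, tag IDs unique within the group).

```python
from dataclasses import dataclass


@dataclass
class TagGroup:
    ident: str     # tag group ID, e.g. "application"
    title: str     # free-form title, e.g. "Application"
    topic: str     # section in which the tag is shown on the host page
    choices: list  # list of (tag ID, title) tuples, in display order

    def __post_init__(self):
        if " " in self.ident:
            raise ValueError("tag group ID must not contain spaces")
        ids = [tag_id for tag_id, _ in self.choices]
        if len(ids) != len(set(ids)):
            raise ValueError("tag IDs must be unique within the tag group")

    def default(self):
        """In this sketch the first choice in the list is the default."""
        return self.choices[0][0]

    def move_to_top(self, tag_id):
        """Model dragging a choice to the top of the list."""
        for index, choice in enumerate(self.choices):
            if choice[0] == tag_id:
                self.choices.insert(0, self.choices.pop(index))
                return
        raise KeyError(tag_id)


def effective_tag(explicit_tags, group):
    """Tag ID a host ends up with: its explicit value, else the group default."""
    return explicit_tags.get(group.ident, group.default())


application = TagGroup(
    ident="application",
    title="Application",
    topic="MyCustomTags",
    choices=[("web", "Web Server"), ("db", "Database Server"), ("other", "Other")],
)

application.move_to_top("other")  # make 'Other' the default
print(effective_tag({}, application))
print(effective_tag({"application": "db"}, application))
```

A host without an explicit value falls back to whatever choice is first, which is why the order of the choices matters.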
[0:09:33] Now we've created the tag but we are not actually using it.
[0:09:37] So the first thing we have to do is configure our hosts so they are using the tag. Let's go to Setup > Hosts. We want to change these four database servers, so let's check them all here and edit them all at once: Hosts and then Edit attributes. And now you see this new section here, 'MyCustomTags'. The default value is 'Other', and we want to set that to 'Database Server'.
[0:10:09] So let's save this and activate all our changes. And now all our database servers are classified as such.
[0:10:25] The next thing to do is go back to our rules, look at the two rules we previously created, and edit them in such a way that they will only apply to database servers.
[0:10:34] Let me quickly show you another way to find rules: go to Setup and then Rule search. Then here, under Rules, choose Used rulesets.
[0:10:47] Here you have an overview of all rulesets that have at least one rule in them. Most of them have been pre-configured by Checkmk, so that's why there are already quite a few in here. And of course, if you know what you're looking for by name, you can simply filter on it to find what you need rather quickly. So: CPU. And now we are at CPU utilization.
[0:11:16] So now let's edit both these rules so they will use the new tag we just created. Let's click on edit, and then under conditions we can add a second host tag.
[0:11:29] We will add 'Application', and the value will be 'Database Server'. And we can save, and as you can see, there is now a second condition here.
[0:11:46] So for a host to match, the criticality has to be 'Productive system'.
[0:11:51] And the application has to be 'Database Server'. Now let's apply the same thing to the test system rule. So once again let's pick 'Application'.
[0:12:05] And add the condition 'Database Server', and as you can see, we now have 2 rules, both for database servers: one for the productive environment and one for the test environment. And with that, we have found an elegant way to apply general rules to a large number of hosts.
[0:12:26] So now when you add a new host, you just need to make sure that you set the right values for the tags in order for the right configuration to be applied to it. And that's the advantage of host tags. Now, a similar concept to host tags is 'labels'.
[0:12:42] But in contrast to host tags, labels are freeform. That means you can add any number of labels to a host and then later create rules that apply to these labels.
[0:12:56] The only disadvantage is that there is nothing protecting you from making typos when adding these labels. Labels can also be automatically imported by Checkmk.
[0:13:06] For example, when you monitor Docker or some cloud service, Checkmk can import these labels and apply them to your hosts when you create them.
[0:13:15] So now let's add some labels manually. To add a host label we don't have to pre-configure anything. We can simply go to Setup > Hosts, and let's add one for 'dbserver2', for example.
[0:13:30] Now here, under custom attributes, you have labels, so if we check that we can add a label, for example 'app:mysql', and press Enter. It's important to know that when you add a label, you have to include a colon. Labels are written sort of as a key-value pair, and what comes before the colon, the key, is similar to a tag group when you add a host tag.
[0:13:59] And what comes after the colon is the value.
[0:14:04] And you can only add one label per group, or per key, so if I now were to add 'app:123', that should fail, but I can add 'test:123', for example. Let's save this and go back to the folder.
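As a rough illustration of the key-value format and the one-label-per-key restriction, here is a small Python sketch. It only models the behaviour described in the video; the function `add_label` and its error messages are hypothetical and are not part of Checkmk.

```python
def add_label(labels, label):
    """Add a 'key:value' label to a host's label dict (hypothetical helper).

    Raises ValueError if the colon is missing or the key is already in use.
    """
    key, separator, value = label.partition(":")
    if not separator or not key or not value:
        raise ValueError(f"label {label!r} must have the form 'key:value'")
    if key in labels:
        raise ValueError(f"host already has a label with key {key!r}")
    labels[key] = value


dbserver2_labels = {}
add_label(dbserver2_labels, "app:mysql")   # accepted
add_label(dbserver2_labels, "test:123")    # accepted, different key

try:
    add_label(dbserver2_labels, "app:123")  # same key 'app' again
except ValueError as error:
    print("rejected:", error)
```

Note that nothing in this sketch checks the spelling of a key or value, which matches the point made about labels: a typo such as 'app:mysq' would be accepted as a perfectly valid label.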
[0:14:26] Now we can also let the labels appear here in this view. To do that, you go to Display and check this box, 'Show explicit host labels'. Now you can see the labels here in the last column.
[0:14:42] I can also create rules based on these labels. So, for example, let's edit the rule for CPU utilization. To do that, let's go to All hosts > dbserver2 > CPU utilization. Let's edit that.
[0:15:04] And now edit this rule. Here we can now add a label condition, so we can say the host has a certain label, for example 'app:mysql'. And if I now save it, you see that there is a third condition. So this rule will now only apply when the host is a database server, and it has the criticality tag 'Productive system', and it has the 'app:mysql' host label.
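The combination of two tag conditions and one label condition can be pictured as a simple "all conditions must hold" check. The following Python sketch is an assumption-laden illustration, not Checkmk's rule engine: `rule_applies` is a hypothetical function, and the tag IDs `prod` and `db` stand in for 'Productive system' and 'Database Server'.

```python
def rule_applies(host_tags, host_labels, tag_conditions, label_conditions):
    """Return True only if every tag and every label condition matches."""
    tags_ok = all(host_tags.get(group) == value
                  for group, value in tag_conditions.items())
    labels_ok = all(host_labels.get(key) == value
                    for key, value in label_conditions.items())
    return tags_ok and labels_ok


# Conditions of the edited rule: productive database server with app:mysql.
tag_conditions = {"criticality": "prod", "application": "db"}
label_conditions = {"app": "mysql"}

dbserver1 = {"tags": {"criticality": "prod", "application": "db"}, "labels": {}}
dbserver2 = {"tags": {"criticality": "prod", "application": "db"},
             "labels": {"app": "mysql"}}

for name, host in (("dbserver1", dbserver1), ("dbserver2", dbserver2)):
    applies = rule_applies(host["tags"], host["labels"],
                           tag_conditions, label_conditions)
    print(name, applies)
```

In this model dbserver1 no longer matches the rule, because it carries the right tags but not the label, while dbserver2 satisfies all three conditions.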
[0:15:40] Now you know the difference between host tags and labels. Labels are very easy to apply and you don't need to pre-configure anything, but they have the disadvantage that they do not protect you against typos. So if you have very strict settings, you want to apply rules only to hosts with very specific properties, and you want the safety that colleagues cannot make typos when adding labels, then host tags are probably the right choice for you.
[0:16:09] But if you like living on the edge, or you want to use the labels that Checkmk automatically recognizes and applies for you, then labels can be an interesting choice. And of course, you can use both at the same time, like I just showed you.
[0:16:24] So that was it for this part, and in the next part I'm going to show you folders and how you can use them to group your hosts together. If this video was helpful to you, please drop a like and subscribe to the channel. See you in the next part.
