Ep. 8 (part II): Smart rules with Host Tags
Note: All the videos on our website offered in the German language have English subtitles and transcripts, as given below.
[0:00:10] | We'll now continue with the second part of our trilogy on rules and service parameters. |
[0:00:15] | And now we come to the topic that I have already promised you, namely the so-called Host Tags, and also the labels. Here I want to formulate rules for hosts according to their type. For example, on all of my production database servers I have very specific thresholds for the CPU usage. To formulate such a rule, I first have to ask: A server is productive – what does that actually mean? To do this, I give the server either a host tag or a label. We will see the differences between these in a moment. Labels are a feature that was only introduced in Version 1.6. |
[0:00:50] | Let's start with the Host Tags. The idea is that when I create a host in the configuration, I tell each host what tags it has, and then I can refer to these tags later. For example, I can say for every server that it is a production system or that it is a test system. And later I can say: This rule applies to all test systems. So, let's just take a look at how that works. As you can see, I have now added four more hosts to the monitoring – namely dbserver01, dbserver02, 03 and 04. |
[0:01:23] | What I do now is to say that servers 1 and 2 should be productive servers, and that 3 and 4 should be test systems. |
[0:01:32] | For this I now go into the administration of the hosts, and start with the dbserver01, and go into the properties. |
[0:01:42] | I'm going to close the Basic Settings, Network Address and Data Sources items. What I am interested in are the custom attributes, and you can now find the host tags which are already predefined by Checkmk. |
[0:01:57] | These are only examples however. |
[0:01:59] | I'll take criticality, and here there are various selection options. I can say that it is a production system or a test system. So this is a tag selection, and it is already predefined. |
[0:02:13] | You see, a tag works like this – I have a name for the so-called tag group, like criticality, and there are simply different choices, and each host simply has one of these values. |
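The tag group idea described here can be sketched in a few lines of Python. This is only an illustrative model and not Checkmk's internal data format; the names `TagGroup` and `set_tag` and the tag IDs `prod` and `test` are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagGroup:
    group_id: str
    choices: tuple[str, ...]  # the tag IDs a host may pick from


# Hypothetical tag IDs; the video only shows the titles of the choices.
criticality = TagGroup("criticality", ("prod", "test"))


def set_tag(host_tags: dict[str, str], group: TagGroup, tag_id: str) -> None:
    """Assign a tag; a later call replaces the earlier value for the same group."""
    if tag_id not in group.choices:
        raise ValueError(f"{tag_id!r} is not a choice of {group.group_id!r}")
    host_tags[group.group_id] = tag_id


dbserver03: dict[str, str] = {}
set_tag(dbserver03, criticality, "prod")
set_tag(dbserver03, criticality, "test")  # replaces "prod"
print(dbserver03)  # {'criticality': 'test'}
```

Because the host stores one value per group ID, it can never be a production system and a test system at the same time.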
[0:02:24] | I now choose production system here, then I go to the second, and also say production system. By the way, this is the default value, as you can see here in brackets, so I really shouldn't have to do anything for the production systems. |
[0:02:40] | Nevertheless, I just set it explicitly, then I go to the third one, and I say that this is a test system. And I do the same with the fourth one - so. By the way, where we are at the moment, there is also the possibility of performing an action on several hosts at the same time. |
[0:03:10] | For example, I can activate the checkboxes here, and could have said, here, with three and four, I would like to edit them at the same time, and I can now say here that they should both now be the test systems. I would have saved myself a bit of work. So, as soon as I have activated the change, these hosts will be marked with these properties, but I could have saved myself that as well, because no further action follows from these host tags. |
[0:03:43] | The monitoring runs the same as before, but what we can do now is to define new rules based on these host tags. We will do that now. |
[0:03:51] | The next step that we will take is to now create one rule – or more specifically, several rules – because I want to specify now that the CPU usage on the test system doesn't really matter, or I set it at 100%, which is permitted. |
[0:04:07] | In production systems, I set the threshold to 80% because I say that I just want to have a little more headroom. |
[0:04:14] | Well, I choose to start, as seen in the last episode, via one of the database servers, for example, and just go straight to the service – that's the easiest way – let's look at CPU utilization – which is a percentage utilization. |
[0:04:29] | This has nothing to do with the load, and I now go back here via the parameters. |
[0:04:36] | Here we can see that there is no threshold yet, there is no parameter at all, so I create a rule, go back to Create host specific rule, but this time in the conditions I now clean up a bit. So what we have is the explicit hosts. I can take that out right away, and instead I go into the Host Tags, select the tag group Criticality, and add a condition. |
[0:05:13] | By the way, from Version 1.6 this mask now looks a bit different, so if you're working with Version 1.5 please don't be surprised. With 1.5 it is a bit easier, but Version 1.6 is much more flexible. |
[0:05:27] | So I choose criticality and say that I want to add a condition for it. And now this new line appears, and here is Criticality is Productive system. Here I could also just choose the other options. |
[0:05:41] | I will leave it as productive, and set the thresholds for the productive systems. Now go back to Value, and go to total CPU utilization. You can now see a large number of different options. |
[0:06:00] | I simply set fixed levels at 80% and 90%. These are now the threshold values for my productive systems. |
[0:06:13] | And now I'm going to create a second rule. I can do this, for example, by simply copying this rule here, changing the thresholds to, let's say 101%, which can never be achieved, and for the condition I just change that into the test system. Save. |
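How a pair of fixed levels turns a measured value into a state can be sketched as follows. This is a minimal illustration, assuming that the first level is the warning threshold, the second is the critical threshold, and that a value equal to a level already triggers it; the function name `cpu_state` is made up for the example.

```python
def cpu_state(utilization: float, warn: float, crit: float) -> str:
    """Map a CPU utilization in percent to a state using upper levels."""
    if utilization >= crit:
        return "CRIT"
    if utilization >= warn:
        return "WARN"
    return "OK"


# Production systems: levels at 80% and 90%.
print(cpu_state(75.0, 80.0, 90.0))  # OK
print(cpu_state(85.0, 80.0, 90.0))  # WARN
print(cpu_state(95.0, 80.0, 90.0))  # CRIT

# Test systems: levels at 101% can never be reached.
print(cpu_state(100.0, 101.0, 101.0))  # OK
```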
[0:06:35] | Now to the question of the order of the rules. Well, that doesn't matter here, because the conditions are mutually exclusive – a host can only ever be either a test system or a production system. |
[0:06:47] | That's why it doesn't matter in which order I do this. By the way, we are still in the rule editor via the host dbserver01, and since we have already determined that it is a production system, the first rule is marked green. |
[0:07:02] | You can also see the conditions here: Host: Criticality is Productive system. And this is where the threshold values of 80% and 90% come from. And the rest is as before – I activate the changes, and with this I have completed the configuration and used Host Tags for the first time. |
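Why the order of these two rules does not matter can be shown with a small model of rule matching. This is a sketch under assumptions: it assumes that the first matching rule supplies the levels, and the tag IDs `prod` and `test` as well as the helper `levels_for` are hypothetical.

```python
from typing import Optional

# Each rule: (tag conditions, (warn, crit) levels). Tag IDs are hypothetical.
RULES = [
    ({"criticality": "prod"}, (80.0, 90.0)),
    ({"criticality": "test"}, (101.0, 101.0)),
]


def levels_for(host_tags: dict, rules: list) -> Optional[tuple]:
    """Return the levels of the first rule whose tag conditions all match."""
    for conditions, levels in rules:
        if all(host_tags.get(group) == tag for group, tag in conditions.items()):
            return levels
    return None


dbserver01 = {"criticality": "prod"}
dbserver03 = {"criticality": "test"}

# No host can match both rules, so reversing the order changes nothing.
for host in (dbserver01, dbserver03):
    assert levels_for(host, RULES) == levels_for(host, RULES[::-1])

print(levels_for(dbserver01, RULES))  # (80.0, 90.0)
print(levels_for(dbserver03, RULES))  # (101.0, 101.0)
```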
[0:07:19] | So, we have now used a predefined host tag group called Criticality. This is actually only in there as an example. |
[0:07:27] | We recommend not using these sample groups, but instead creating your own groups, and for this there is the Tags WATO module. In this module you can now see the predefined tag groups. Some of these are classified as built-in. The first two, as I said, are only examples. You can also delete them, which is why they are not declared as built-in. I'm now going to create a completely new tag group. Each group has an internal ID, which must be unique and must not contain any spaces, so a typical ID field, and a title, which you can name as you like. |
[0:08:11] | I now want to create a group for the type of application. I simply want to say that there are web servers, database servers, and other things. This is not a very realistic example, but I think it illustrates well enough what this is about. |
[0:08:25] | That is why I am entering here in the Tag group ID – 'application'. In Title – 'Application' – I use a capital 'A' so that you can see the difference. |
[0:08:38] | The topic is actually only there to decide the location where this tag group is to be visible later in the properties of the host. If you want to, you can create your own Topic, which I could call 'My Own Tags', for example, so that we can then recognize that we have defined them ourselves. So, each tag group needs at least one possible selection. So here I select 'Add tag choice' – I do that three times. |
[0:09:17] | The tags themselves must also have IDs, and these must also be unique across the tag groups. So, here I enter, for example, Tag ID 'web', Tag ID 'db', Tag ID 'Other', and for Titles enter 'Webserver', 'Database Server', and for the third I enter 'Other'. The Auxiliary tags – just forget them, it is very unlikely that you will need them, and you can also alter the sequence using these crosses here, but that shouldn't matter to us for now. I save the whole thing, and have now just defined my own tag group. |
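The ID rules mentioned so far (group IDs without spaces, at least one choice per group, tag IDs unique across all groups) can be written down as a small check. This is only an illustration of the rules as stated in the video, not how Checkmk validates its configuration; `validate_tag_groups` is a hypothetical helper.

```python
def validate_tag_groups(groups: dict[str, list[str]]) -> list[str]:
    """Return a list of problems; an empty list means the definition is valid."""
    problems = []
    seen: dict[str, str] = {}  # tag ID -> group that first used it
    for group_id, tag_ids in groups.items():
        if not group_id or any(ch.isspace() for ch in group_id):
            problems.append(f"group ID {group_id!r} is empty or contains whitespace")
        if not tag_ids:
            problems.append(f"group {group_id!r} has no tag choices")
        for tag_id in tag_ids:
            if tag_id in seen:
                problems.append(
                    f"tag ID {tag_id!r} is used in {seen[tag_id]!r} and {group_id!r}"
                )
            else:
                seen[tag_id] = group_id
    return problems


# The group from the video passes the check.
print(validate_tag_groups({"application": ["web", "db", "Other"]}))  # []

# A group ID with a space and a tag ID reused from another group are reported.
for problem in validate_tag_groups(
    {"criticality": ["prod", "test"], "my app": ["web", "test"]}
):
    print(problem)
```

Group IDs are dictionary keys here, so their uniqueness is guaranteed by the data structure itself.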
[0:10:06] | There is another rule: a host always gets the first entry from every tag group, unless you explicitly specify a different one. That means that all hosts that are now present are declared as web servers, and this will apply until I assign a different tag to them. |
[0:10:25] | That would of course be a reason to say that we are pushing 'Other' upwards, so that we have now defined that if we do not select 'Webserver' or 'Database Server', then 'Other' always applies, which would actually make a bit more sense in this case. |
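The default behaviour described here, where a host without an explicit choice gets the first entry of the group, can be sketched like this. The function `effective_tag` is a hypothetical helper for illustration only.

```python
def effective_tag(host_tags: dict[str, str], group_id: str, choices: list[str]) -> str:
    """Return the explicitly assigned tag, otherwise the first choice of the group."""
    return host_tags.get(group_id, choices[0])


host: dict[str, str] = {}  # no application tag assigned yet

# With 'web' first, every untouched host counts as a web server.
print(effective_tag(host, "application", ["web", "db", "Other"]))  # web

# Moving 'Other' to the top makes it the default instead.
print(effective_tag(host, "application", ["Other", "web", "db"]))  # Other
```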
[0:10:41] | The tag group is now there, each host has a tag from each group, but this action has not yet had any further effects. We can then achieve this by applying rules again. |
[0:10:55] | The next step is that now we look at our hosts and assign the correct tag to each host. To do this, I now go back to my host administration, and I just want to go to these four servers, all four of which are database servers, so I simply want to assign the Tag Database Server. |
[0:11:13] | The easiest way is of course again with the check boxes – then I don't have to select each one individually. |
[0:11:22] | So I select these four items – 1-2-3-4, go here to Edit, and see the topic My Own Tags down here – as I said earlier, I have created my own topic in Topics. |
[0:11:38] | In this case, the topics are these boxes, and I say I want to define an Application – and not as Other, but instead that these are Database Servers. |
[0:11:49] | So, I have edited these four hosts simultaneously. The next step is that I look at my rules again. We simply do the following – the rule that we just created, or the two rules – we will change these so that they only apply to database servers. |
[0:12:07] | I will now show you another way for how you can find rules. |
[0:12:11] | If you go directly to the Host & Service Parameters module there is a button up here that is called 'Used rulesets'. |
[0:12:18] | Here there are all rule sets in which there is at least one rule. A few of them are already filled-in by the new Checkmk system, which is why there are a few more entries in it, but we can find our two rules for CPU utilization relatively quickly. |
[0:12:36] | If you know roughly what the rule is called, it is of course even easier. Then you can simply search on the page. For example, 'Utilization' or 'util' – I mean, of course there can be a whole bunch of hits, but here I can also see relatively quickly that mine is the only rule set where I have two rules, and then I come to the rules that I defined earlier. Another way would of course be to go via the service, as we did before. So now I edit the two rules by simply adding an additional condition to each rule. |
[0:13:11] | So here I now select the tag group – which here is now My Own Tags/Application – add a condition to it, and select Database Server as Application. Save. |
[0:13:30] | In the overview I can see that a line has been added to the Conditions. I do that with the second rule as well. Application is Database Server. Save. |
[0:13:49] | So now we can see these two rules: one rule for productive systems, the other rule for test systems. |
[0:13:54] | And with that we have found a very elegant method for setting general rules for a large number of hosts, so that when you now add a new host to the monitoring, and then assign the correct tags to it, you will automatically get the correct configuration. |
[0:14:09] | This is the advantage with the Host Tags. |
[0:14:13] | A very similar concept to host tags is: Labels. |
[0:14:18] | The labels were introduced with Version 1.6, and have a similar function, but in contrast to the Host Tags, the Labels are freeform – you do not have to predefine anything. You can simply add any labels to any host, and then define conditions later. |
[0:14:36] | It is – so to say, a bit 'quick and dirty' – it is easy, only then of course there is no control of whether you have made a typing error when creating a host, for example. Labels can also be generated automatically if you monitor Docker, for example, or with certain cloud services Checkmk will automatically adopt labels from hosts there. |
[0:14:56] | Let's take a look at how one can manually assign labels. |
[0:15:00] | To give labels to the hosts, you don't have to predefine anything at all, that's the nice thing about labels - it is very simple - I just go back to my Hosts and can simply say, for example, for Database Server 1, here in Custom Attributes, at Labels activate the Checkbox and enter 'app:web' for example. |
[0:15:23] | With the labels it is important to know there must always be a colon. |
[0:15:26] | What is the group and the value in the case of host tags is, with labels, simply the word in front of the colon, which is in effect the group, and the word after it, which is the value. There can only be one value from each group. |
[0:15:38] | In this example it is 'app:web', and if I hit 'Enter' I can add a second label. For example, 'test:123', or whatever I feel like. If I now save here, you will also see in the list of hosts that these labels are visible there as so-called explicit labels, which means that I defined them by hand. |
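The `key:value` shape of a label can be sketched as below. This is an illustrative model only: `add_label` is a hypothetical helper, and the choice that a second label with the same key replaces the first one is an assumption made to express "only one value from each group".

```python
def add_label(labels: dict[str, str], text: str) -> None:
    """Parse 'key:value' and store it; a key can hold only one value."""
    key, sep, value = text.partition(":")
    if not sep or not key or not value:
        raise ValueError(f"label {text!r} must have the form key:value")
    labels[key] = value


labels: dict[str, str] = {}
add_label(labels, "app:web")
add_label(labels, "test:123")
add_label(labels, "app:db")  # same key: replaces 'web' in this sketch
print(labels)  # {'app': 'db', 'test': '123'}
```

Note that nothing here checks the key or value against a predefined list, which is exactly why a typing error in a label goes unnoticed.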
[0:16:01] | And now I can of course formulate conditions for this. |
[0:16:07] | I can say now, for example, if the label test:123 exists, then something special should apply. |
[0:16:14] | So I go back into my rules from earlier, which we have already edited several times, and take, for example, the first one and see that here I can now add a label condition to the host labels, and then say here that it has the label 'test:123'. If you have that, then these rules apply, or I can say, it doesn't have it. |
[0:16:51] | It is important to know that if you assign several conditions here, a logical AND always applies, so that the rule only applies if all of these conditions are actually fulfilled at the same time. |
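That all conditions of a rule must hold at the same time, which is a logical AND, can be sketched as follows, including the negated case "does not have the label". The function `rule_matches` and the tag IDs are hypothetical and only model the behaviour described in the video.

```python
def rule_matches(
    host_tags: dict[str, str],
    host_labels: set[str],
    tag_conds: dict[str, str],
    label_conds: dict[str, bool],
) -> bool:
    """label_conds maps 'key:value' to True (must have) or False (must not have)."""
    tags_ok = all(host_tags.get(group) == tag for group, tag in tag_conds.items())
    labels_ok = all(
        (label in host_labels) == wanted for label, wanted in label_conds.items()
    )
    return tags_ok and labels_ok


tags = {"criticality": "prod", "application": "db"}
labels = {"app:web", "test:123"}

# All three conditions hold, so the rule applies.
print(rule_matches(tags, labels, {"criticality": "prod", "application": "db"}, {"test:123": True}))  # True

# The tag matches, but the host has a label the rule excludes.
print(rule_matches(tags, labels, {"criticality": "prod"}, {"test:123": False}))  # False
```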
[0:17:04] | So, you can already see the difference between labels and tags. Labels are very simple, as I don't need to explain, I can just use them directly, but they have the disadvantage that there is no protection if I make a typing-mistake. |
[0:17:16] | This means that if you need very precise control, where each host can have exactly one of a fixed set of selections and you want to enforce that, if you want to make sure that colleagues cannot get anything wrong when they add hosts, and if you also need a very clear structure, then Host Tags are the right tool. If you want to work quickly and easily, or you want to use the labels that Checkmk recognizes automatically, then the labels are of course an interesting alternative. |
[0:17:44] | And of course you can also use both methods together at the same time. So, now we're done with the second part. |
[0:17:51] | In the third part I will show you how to manage your hosts in folders. |