Ep. 37: Monitoring Windows services with Checkmk
[0:00:00] | Welcome to the Checkmk Channel. Today we're taking a closer look at Windows service monitoring. |
[0:00:15] | If you add a Windows host to Checkmk, you already get a lot of services. |
[0:00:20] | So, you can see the state of your CPU, the utilization of your file systems, and a lot of other services that tell you a lot about the state of your Windows server. But in certain use cases, you need specific information about specific Windows services. |
[0:00:36] | We already have a service summary by default, which tells you if some service failed unexpectedly, so you get the information in the summary service. |
[0:00:47] | But as I said, in special use cases you want to have a closer look at specific services. |
[0:00:53] | For example, if you have a domain controller or an MS SQL database server, you might want to monitor the services that these applications need in order to run. |
[0:01:03] | So, you want to be aware of them individually, and to be able, for example, to alert specifically on that service and not just on the summary. |
[0:01:13] | So, that's the general idea. Let's take a look at how we configure that in Checkmk. So, here we are looking at a typical Windows Server. |
[0:01:21] | You can see all the services I already mentioned, CPU utilization. I actually added some plugins here, so there is some more information than by default. |
[0:01:30] | But if we scroll down a little, we see the service summary over here, and there we can see Autostart services: 58, Stopped services: 2, which is okay because our service summary assumes that if a service is in Autostart, it has to be started. |
[0:01:46] | If it is stopped, then it's okay because a stopped service has been stopped by an administrator or by someone who does that intentionally. |
[0:01:55] | Only if a service is in a failed state will Checkmk show a critical state here and tell you which service failed. So, that's what we have. Now let's take a look at how we get more insight on specific services. |
[0:02:09] | To do that, we go to the Setup menu and search for service disco. You could search for service discovery, but the short form suffices here. |
[0:02:23] | And there we have a rule that says Windows service discovery. There we can add a new rule. |
[0:02:29] | And now, for simplicity, I will simply add a Regular Expression that will add all the Windows services here. |
[0:02:37] | So, of course, you could use several Regular Expressions depending on what you want to monitor. As I said, if there are specific use cases, you will be aware of which services are relevant here. |
[0:02:47] | But I would simply use all of them to just showcase what we can discover here. |
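As a rough illustration of the choice between one catch-all expression and several narrower ones, the Python sketch below matches patterns against a made-up list of service names. The names and the helper `select_services` are hypothetical, and anchoring each match at the start of the name is an assumption about how the rule evaluates expressions, not something stated in the video.

```python
import re

# Hypothetical service names, standing in for what a Windows host might report.
SERVICE_NAMES = [
    "DHCPServer",
    "DNS",
    "DFSR",
    "LanmanServer",
    "LanmanWorkstation",
    "RemoteRegistry",
]


def select_services(patterns, names):
    """Return the names matched by at least one pattern.

    Assumption: a pattern is matched from the start of the name (re.match).
    """
    compiled = [re.compile(p) for p in patterns]
    return [n for n in names if any(c.match(n) for c in compiled)]


# One expression that matches everything, as used in the video.
catch_all = select_services([".*"], SERVICE_NAMES)

# Several narrower expressions for a specific use case.
narrow = select_services(["Lanman", "DNS$"], SERVICE_NAMES)
```

With these inputs, the catch-all pattern selects every name, while the narrower patterns select only the DNS and Lanman entries.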
[0:02:52] | The two next options should be used with caution. The first is Create check if service is in state. |
[0:03:00] | That doesn't mean that we want the services to be running, but that means Checkmk will only discover the services if they are running. |
[0:03:07] | So, if there's a critical service for you, that's stopped at this point and you enable this option here and set it to Running, that stopped service will never be discovered. |
[0:03:17] | So, that's something important to keep in mind. So, this option is there for specific use cases, but, in general, you don't want to check the state of the service here. |
[0:03:28] | The second option, Create check if service is in start mode, might make more sense because the default here is we only want services that are in the start mode Automatic. |
[0:03:38] | Because if we take a look here, there's also Manual and Disabled, but a service that is disabled will never run, so it might not make sense to monitor those, or it might make special sense to monitor them. That depends on the specific use case. |
[0:03:52] | Also, for Manual Services, there are services that are in a manual startup state, but that will be started by an application, for example, and should be running all the time. |
[0:04:03] | In that case, you might want to go with this option. Generally, it might make sense to not set this option at all and just discover all the services that match your regular expression. |
[0:04:15] | In my example here, I'm going to choose the Automatic startup mode because that seems like a sane default: as long as the service is in automatic startup, I would want to monitor it. |
[0:04:27] | I'm not going to limit this to any host because this is my default for all the Windows hosts. Of course, you could create more specific rules. |
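The effect of the two optional filters can be sketched in Python. This is a simplified model of the behavior described above, not Checkmk's implementation: the `WinService` class, the `discover` function, the state and start-mode strings, and the sample services are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WinService:
    name: str
    state: str       # assumed values: "running" or "stopped"
    start_mode: str  # assumed values: "auto", "manual" or "disabled"


def discover(services, pattern, state: Optional[str] = None,
             start_mode: Optional[str] = None):
    """Return names of services the rule would discover.

    A filter left as None is not applied at all.
    """
    regex = re.compile(pattern)
    found = []
    for svc in services:
        if not regex.match(svc.name):
            continue
        if state is not None and svc.state != state:
            continue  # with state="running", a stopped service is never discovered
        if start_mode is not None and svc.start_mode != start_mode:
            continue
        found.append(svc.name)
    return found


# Hypothetical sample data.
SERVICES = [
    WinService("DNS", "running", "auto"),
    WinService("RemoteRegistry", "stopped", "auto"),
    WinService("AppHelper", "running", "manual"),
    WinService("Fax", "stopped", "disabled"),
]

only_running = discover(SERVICES, ".*", state="running")
only_auto = discover(SERVICES, ".*", start_mode="auto")
everything = discover(SERVICES, ".*")
```

In this model, filtering on the running state drops the stopped `RemoteRegistry` service, which is exactly the kind of service you would want to be alerted about, while filtering on the automatic start mode keeps it.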
[0:04:35] | So, let's save that here real quick and let's directly jump to our example Windows host. |
[0:04:43] | And if we now take a look at the Service configuration page, then we can see there are a lot of new services. We discovered 58 new services; that's all the Windows services in autostart on that Windows operating system. |
[0:04:57] | And here you can see a lot of services, for example, there is DFS Replication, there is the DHCP server, which would be one use case, for example, to monitor a DHCP server. |
[0:05:10] | We have the DHCP plugin to monitor the pools, but here we would have the Windows services to be aware of them. |
[0:05:16] | Same goes for the DNS Service, for example, or something really central like the Lanman Services here, which are relevant for Active Directory-joined servers. So, there's really a lot to see. |
[0:05:27] | So, let's just add all of those to the monitoring, just for this example, and activate changes here. |
[0:05:41] | And if we now take a closer look at our Windows host, it takes a few seconds before the services have been checked. So, let's just refresh here real quick. |
[0:05:56] | So, this is our Windows host, and if we now take a look here, we have all the services in monitoring and we can see the details in the outputs here. For example, it is running and the start type is auto. |
[0:06:08] | So, the latter one is what we configured in the rule. We only want to see the services that are in auto start, and here we see this specific state. |
[0:06:15] | And if one of the services were stopped now, we would see that it was critical. Actually, there are some stopped services here. |
[0:06:22] | We can see this RemoteRegistry Service. It says it's stopped, but it's in startup type auto. |
[0:06:28] | So, from my configuration, I'm already made aware of a service that isn't running, although it should be running. |
[0:06:33] | So, I could investigate here. All right, that's how you get the services into monitoring and you see we already have the defaults, so as long as the service is running it's okay. |
[0:06:44] | If it is stopped unexpectedly, then it's in a critical state. And of course, you can configure that. |
[0:06:52] | So, if we go to Parameters for this service here and take a look at this rule set, then we can create a rule and now we can tell Checkmk what to do with the service. |
[0:07:03] | So, the first rule just tells Checkmk to discover the services and to be able to add them to the monitoring. |
[0:07:12] | And this rule now enables us to configure, for example, the Services states. |
[0:07:18] | What is the resulting state for Checkmk? You can see we have the Expected state, we can define the Start type here, and we can define the Resulting state of that combination of sets, so you can, in a very detailed way, configure which state the services should be in. |
[0:07:36] | And there are several more options, like what to do if none of these service state entries match here. |
[0:07:44] | You could add custom icons to the service, for example, or you could use an alternate name for the service if the service name just doesn't mean much to you. |
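The combination of expected state, start type, and resulting state, together with the fallback when nothing matches, can be modeled with a small lookup in Python. This is only a sketch: the function `resulting_state`, the tuple layout, the use of `None` as "any value", and the first-match-wins order are assumptions for illustration and are not taken from Checkmk itself.

```python
# Monitoring states as commonly numbered: 0 = OK, 1 = WARN, 2 = CRIT.
OK, WARN, CRIT = 0, 1, 2


def resulting_state(state, start_type, entries, default=CRIT):
    """Return the monitoring state for a service.

    entries: list of (expected_state, start_type, result) tuples, where None
    in one of the first two positions means "any value" (an assumption).
    The first matching entry wins; if none matches, `default` is returned.
    """
    for expected_state, expected_start, result in entries:
        if expected_state is not None and expected_state != state:
            continue
        if expected_start is not None and expected_start != start_type:
            continue
        return result
    return default


# Hypothetical configuration: running is always fine, and a stopped service
# with manual start type only raises a warning.
ENTRIES = [
    ("running", None, OK),
    ("stopped", "manual", WARN),
]

running_auto = resulting_state("running", "auto", ENTRIES)
stopped_manual = resulting_state("stopped", "manual", ENTRIES)
stopped_auto = resulting_state("stopped", "auto", ENTRIES)
```

Here a stopped service in automatic start mode matches no entry, so it falls through to the default, which this sketch sets to critical.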
[0:07:55] | So, there's a lot of configuration options on how to monitor the services. |
[0:07:58] | I'm not going to add anything here because, as you saw, the defaults are quite sane and should work for most use cases. |
[0:08:04] | So, that concludes the video for today. You learned how to monitor Windows Services. |
[0:08:09] | You saw that it's quite easy. You just need one rule to monitor them and potentially if there is a special use case, you might need another rule to create some different thresholds, but that's really all there is to do for Windows service monitoring. |
[0:08:24] | So, that concludes the video for today. Thank you very much for watching. Be sure to subscribe and I will see you around. |