Ep. 23: Working with the event console in Checkmk
Read Video Transcript
|[0:00:00]||Welcome back to the Checkmk channel. Today, we are going to talk about the event console.|
|[0:00:16]||The event console is used for all kinds of event-based monitoring. As you know, up to now all we did was active monitoring. So, we query systems and get a result.|
|[0:00:28]||In event-based monitoring, some other system sends us an event. That can be a syslog server, that can be SNMP traps, or basically any other protocol which you can then convert to a real event console event.|
|[0:00:46]||As you can see here, the event console runs alongside Checkmk but uses the same Setup.|
|[0:00:54]||So, that means it's a completely different system but it's fully integrated into Checkmk, not only for the setup part but also for the notifications.|
|[0:01:05]||This means from the event console you can even trigger a normal email notification. The event console has even more features. So, you can even define your own actions inside the event console.|
|[0:01:23]||And what I'm going to show you later, you can also define different outcomes with this action. So basically, you can not only open events, you can also close them automatically, by a closing event or by time.|
|[0:01:36]||You can also use events just to trigger an action and close them immediately. After that, you can also automatically close events by time. As you can see on diagram 2, the event console has different interfaces to receive events.|
|[0:01:55]||For example, you can enable receivers for syslog UDP and TCP and for SNMP traps. Also, there is a local interface which you can use to build your own plugins, which enables you to receive any kind of event from every system you want. Then let's look into it.|
|[0:02:18]||We start at the part Setup. The first thing you need to do is enable the full view of all options. Then you're going to find in the part, 'Events', the subpart: Event Console.|
|[0:02:40]||On the next page then, you see the Event Simulator. You can use this then to test your rules. And on the second part, you see the Rule packs. Rule packs are basically a collection of rules.|
|[0:02:58]||You can go with just one pack and use it to collect all of the rules. You can even export these packs if you want to.|
|[0:03:08]||I would recommend creating a pack for every type of rule you want to have. For example, if you want to monitor the events of a, let's say, Cisco switch, I would create a pack for Cisco switch.|
|[0:03:22]||If you have, like, a different system just create one pack for this different system. The next thing we need to do is to enter the rule pack. That works with this icon.|
|[0:03:37]||And on the second page, now we still have our event simulator. I can hide it since we don't need it right now, and there are no rules yet. With the button Add rule, we can add our first rule.|
|[0:03:54]||The first option we always need to set is the Rule ID. This is important because every created event will refer later on to this rule ID so you always know which of your rules created which event.|
|[0:04:12]||I'm really creative so I use test here as a rule ID. Let's switch to the Matching Criteria. I'm just gonna use the important ones here.|
|[0:04:23]||Most of them are related to syslog events but we can also do just a little text match. An event always needs a pattern.|
|[0:04:37]||And it doesn't matter if you're talking about syslog events here or even SNMP traps.|
|[0:04:42]||The pattern can be, very simplified, Error at first, and after that an error text, like Error: Something is wrong.|
|[0:05:01]||Of course, you don't want to create one rule for every possible error. So, we're gonna need a regular expression to match everything that comes after Error. It's like (.*).|
|[0:05:22]||Yeah, but this is a really simple example. In practice the regex would be a bit more complicated.|
|[0:05:34]||Then the next problem with this event is, when we receive it, it's created, but when should the event be closed again? That can be automatically after a time.|
|[0:05:48]||It can also be closed immediately, because you can also assign actions. I'll show this later. Or you can also receive a clearing trap or a clear message. For that, you can use text to cancel events.|
|[0:06:06]||And now it gets a bit complicated since usually it's, like, you get the OK notification. And after that, you normally have the original error message. So, you need to build the same match also inside this field.|
|[0:06:33]||The background here is, if you get an error one and an error two and then you get an OK, you would not know which one of the errors you want to close. Is it the first one or is it the second one?|
|[0:06:46]||But if you get error one, error two, and then later OK two, you would know you need to close the second one, the one that had two in the error message.|
|[0:06:57]||Therefore, these brackets need to match the original error message to close the correct event. And that is already enough that we can test it.|
|[0:07:12]||I save now. I activate my changes. And now let's use the Event Simulator to test. So, we go back to Setup, Event Console.|
|[0:07:38]||I open the event simulator and what I need is the message text. I said I have something with error. Let's say something wrong again.|
|[0:07:56]||I can enter some more source information but that's not important for our test now.|
|[0:08:01]||At first, I can go with Try out. With that I can show that my rule pack is matching even when I'm inside the rule pack.|
|[0:08:12]||With this click here and I go Try out. I can see that my test rule is matching here.|
|[0:08:22]||But when I go to Generate event, I see here in my sidebar that an event is created. I click inside it.|
|[0:08:33]||And see here, for myhost089, Foobar-Daemon, Error: Something wrong. Then we want to close it automatically. I use browser back in this case.|
|[0:08:51]||I change to OK. I generate the event. And a second later here, the sidebar isn't refreshed yet but I can see it here in the overview, the event is gone.|
|[0:09:10]||And then another example about our regular expressions. I'm gonna create two events. Error: Something wrong 1. Error: Something wrong 2.|
|[0:09:28]||Now I have these two events here and I just want to close one of them. So, let's go with the first one, with 1 here. Generate. And we just have one left.|
|[0:09:49]||The next thing we can do, we can make it look a bit more, let's say, clean. So, we can rewrite the message field and some of the other fields.|
|[0:09:58]||For that, we go back to the configuration, we go back inside the rule. I'm here in the rule pack and here I have my test rule. So, I use the pencil. I scroll down.|
|[0:10:15]||And at the very end of the page we have the rewriting part, where we can rewrite the message text or information like hostname and stuff like that. And it's basically really simple.|
|[0:10:28]||You go with the matching patterns from before. Here we have one pattern.|
|[0:10:33]||If I had a bit more complicated text like error and something else and then a second of this, I would have two matching groups.|
|[0:10:50]||So here, I would be able to refer to the first one and refer to a second one in another field.|
|[0:11:00]||In our example, I keep it simple. So, just one. So, we don't have a number two and let's put something there. It's like, say, Hello World.|
|[0:11:18]||Then we want to have the error. We save it again. I activate it. Then let's create the next event. Setup, Event Console.|
|[0:11:34]||Instead of Error: Something wrong 1, now we're gonna have, wait for it, Hello World: Something wrong. So, the part error is gone because it wasn't part of the matching group, and we have the nice rewritten text here.|
|[0:11:55]||Finally, I want to show some more of the options for our rules. Back to Setup. Back to the Event Console. I enter into my rule pack and I enter just in this rule.|
|[0:12:12]||Besides the matching criteria, you can define the outcome. For example, one rule can be that the rule pack is skipped or that the event is just dropped.|
|[0:12:29]||This makes sense if you have a lot of events in a short time and you don't want to have millions of events processed by all of your rules, so you can sort out events at any time.|
|[0:12:47]||Next point, you can change the State. So, normally it's set by syslog but especially if you work with SNMP traps, you need to decide is it critical, is it okay, is it unknown, is it warning.|
|[0:13:03]||Also, you can assign contact groups if you want to use the rights management in Checkmk. And finally, you can assign some actions.|
|[0:13:14]||The simplest action is to send the monitoring notification, so that you can use the rule-based notification system of Checkmk.|
|[0:13:22]||Otherwise, you can also define your custom actions inside the Event Console. But it's not the part I'm gonna show today.|
|[0:13:35]||Then you also have two types of actions. You can have an action the moment that the event appears, but you can also have an action the moment when the event is cancelled.|
|[0:13:50]||If you don't have the cancelled event, you can also automatically delete the event after the action is done. For example, you send the monitoring notification and then you just delete the event.|
|[0:14:07]||But sometimes you get a lot of messages. For this, you can use Counting & Timing.|
|[0:14:13]||So, you can also decide to count messages in an interval and say you want to wait for, let's say, 5 events before you create something.|
|[0:14:26]||So, you would need, for example, 5 SNMP traps of the same type before you create the alert in the Event Console. And then you can also check for a heartbeat.|
|[0:14:42]||So, you can say you want to have the event if you don't receive anything. So, if you don't get the message every, what is the default here, every hour, you get the error message. Or what you can do is delay the event creation.|
|[0:15:05]||So, you can say, I want to delay it for 15 minutes. And if you get the cancelling event before these 15 minutes are over, there will not be an event in the Event Console.|
|[0:15:16]||And one really really important option here is the limit event lifetime. Use this please every time when you test something so that events are automatically deleted. Because the Event Console is not an archive of some type.|
|[0:15:40]||It's just there to give you notifications if you have any errors. So, if you would fill the Event Console at some point, it's going to break.|
|[0:15:45]||One last thing, the Checkmk Event Console can also help you to receive events because the console has integrated receivers for syslog and SNMP traps.|
|[0:16:03]||To set them up, you need to go to the site settings. So, we go to the Setup, the Global settings. Then we can search for Site Management, Event Console. Or I prefer putting Event Console to the filter and directly getting it here as a result.|
|[0:16:38]||By default, it's only the local processing but you can enable, for my example here, the SNMP traps, and save that and activate that. After that, Checkmk will automatically listen on this port.|
|[0:17:00]||The only thing you need to care about is that only one site can listen on one of the ports, so it's not possible to have multiple sites receiving these kinds of traps.|
|[0:17:10]||I also often get the question whether we can handle SNMP MIB files here for the traps. Yes, it's possible.|
|[0:17:21]||I'll show you how it's done. Here in Setup, then Event Console. Then in event console, you have a part SNMP MIBs. There you can upload new MIBs.|
|[0:17:43]||And second of all, you also need to enable the handling. For that, in the settings you have the option to enable the trap translation.|
|[0:18:04]||But I would not recommend doing that because there is no real benefit to it.|
|[0:18:11]||You still need to do everything by hand, so you need to define your patterns, and it's easier to define a pattern out of numeric OIDs and real error messages than out of just text.|
|[0:18:30]||So, it's easier for you to catch, like, your pattern from a combination of OIDs and text.|
|[0:18:36]||Also, of course, if you use the MIB files, it's going to be slower than without them, because Checkmk would need to parse every incoming SNMP trap against all of the uploaded MIB files.|
|[0:18:36]||I hope this little insight into this powerful system was helpful. Thanks for watching and please subscribe.|
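The matching and cancelling behaviour described between 0:04:37 and 0:09:28 can be illustrated with a small Python sketch. This is not Checkmk code: the `process` function, the list of open events, and the cancel text `OK: (.*)` are assumptions made for illustration. It only shows the idea that a cancel message closes the event whose match group is identical.

```python
import re

# Hypothetical patterns modelled on the example rule in the video.
MATCH_PATTERN = re.compile(r"Error: (.*)")
CANCEL_PATTERN = re.compile(r"OK: (.*)")  # assumed form of the cancel text


def process(message, open_events):
    """Return the updated list of open events after one incoming message.

    open_events is a list of match-group tuples, one tuple per open event.
    """
    cancel = CANCEL_PATTERN.search(message)
    if cancel:
        # Close only the events whose match groups equal those of the cancel text.
        return [groups for groups in open_events if groups != cancel.groups()]
    match = MATCH_PATTERN.search(message)
    if match:
        # Open a new event and remember what the brackets matched.
        return open_events + [match.groups()]
    return open_events  # no pattern matched, nothing changes


events = []
for text in ["Error: Something wrong 1", "Error: Something wrong 2", "OK: Something wrong 1"]:
    events = process(text, events)
print(events)  # [('Something wrong 2',)]
```

Because the match group of the cancel message has to equal the match group of the original error, only the event ending in 1 is closed and the one ending in 2 stays open.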
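The rewriting step shown between 0:10:15 and 0:11:34 refers back to the match groups of the pattern. A minimal Python sketch of that idea follows; the `rewrite` helper is hypothetical, and it uses Python's own `\1`, `\2` template syntax rather than anything taken from the Checkmk source.

```python
import re


def rewrite(message, pattern, template):
    """Rewrite a message using the match groups of the pattern.

    Returns the message unchanged if the pattern does not match.
    """
    match = re.search(pattern, message)
    if match is None:
        return message
    # \1, \2, ... in the template are replaced by the matching groups.
    return match.expand(template)


# One match group, as in the video example.
print(rewrite("Error: Something wrong", r"Error: (.*)", r"Hello World: \1"))
# Hello World: Something wrong

# Two match groups, referenced in a different order (made-up message).
print(rewrite("Error: disk full on server1", r"Error: (.*) on (.*)", r"\2 reports \1"))
# server1 reports disk full
```

The word Error disappears from the result because it sits outside the brackets and is therefore not part of any match group.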
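The counting option mentioned at 0:14:13 (wait for 5 messages within an interval before creating an event) can be pictured as a sliding time window. The sketch below is an assumption about the general idea only: the `CountingRule` class, its parameter names, and the sliding-window approach are made up for illustration and may differ from how the Event Console counts internally.

```python
from collections import deque


class CountingRule:
    """Open an event only after `threshold` messages within `interval` seconds."""

    def __init__(self, threshold=5, interval=60.0):
        self.threshold = threshold
        self.interval = interval
        self.timestamps = deque()

    def receive(self, now):
        """Record one matching message at time `now`; return True if an event opens."""
        self.timestamps.append(now)
        # Forget messages that are older than the interval.
        while self.timestamps and now - self.timestamps[0] > self.interval:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.threshold:
            self.timestamps.clear()  # start counting again for the next event
            return True
        return False


rule = CountingRule(threshold=5, interval=60.0)
results = [rule.receive(t) for t in (0, 10, 20, 30, 40)]
print(results)  # [False, False, False, False, True]
```

If the same five messages arrive too far apart, for example one every 100 seconds, the window never holds five entries and no event is created.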
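To test a rule from outside instead of using the Event Simulator, a message can be sent to the syslog UDP receiver mentioned at 0:01:55. The following Python sketch builds a classic BSD-style syslog line and sends it as one UDP datagram. It assumes that the syslog UDP receiver is enabled and listens on the standard syslog port 514; the server name in the commented call is a placeholder, and both helper functions are made up for this example.

```python
import socket
from datetime import datetime


def build_syslog_line(host, application, text, facility=1, severity=3, when=None):
    """Format a classic BSD-style (RFC 3164) syslog line."""
    when = when or datetime.now()
    priority = facility * 8 + severity  # facility 1 = user, severity 3 = error
    # RFC 3164 pads single-digit days with a space, not a zero.
    timestamp = f"{when.strftime('%b')} {when.day:2d} {when.strftime('%H:%M:%S')}"
    return f"<{priority}>{timestamp} {host} {application}: {text}"


def send_udp(line, server, port=514):
    """Send one syslog line as a single UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (server, port))


line = build_syslog_line("myhost089", "Foobar-Daemon", "Error: Something wrong")
print(line)
# send_udp(line, "monitoring.example.com")  # placeholder server name, uncomment to send
```

The host name, application, and text are the same values that appear in the simulator example in the video, so a rule with the pattern Error: (.*) should match the message text.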
More Checkmk Videos
Ep. 1: Installing Checkmk 2.0 and monitoring your first host
In this video, Baris explains how to get started with Checkmk and start monitoring your first host within a few minutes.
Ep. 2: The Checkmk 2.0 user interface
In this video, Baris takes you through the new user interface in Checkmk 2.0. He explains the various components of the user interface such as the new navigation menus, the sidebar, main dashboard, tactical overview, how to switch between the Checkmk interface themes and much more.
Ep. 3: Using SNMP to monitor network devices in Checkmk 2.0
In this episode, Baris explains how to monitor network devices with Checkmk. SNMP is a protocol that many switches, routers, printers, UPSs, hardware sensors and other devices have implemented with the purpose of being able to monitor them easily.
Ep. 4: Monitoring Windows in Checkmk
In this video of our Getting started with Checkmk series, Baris explains how to install a Checkmk agent on a Windows host system and add that into your monitoring environment.
Ep. 5: Using metrics and graphs in Checkmk 2.0
In the 5th episode of the Getting started with Checkmk series, Baris explains using various metrics that you can monitor in Checkmk such as CPU utilization, CPU load etc. You can also see graph visualizations for these metrics or create and customize your own as per your requirements.
Ep. 6: Updating Checkmk 2.0 and using multiple instances
In this video, Baris explains how to update your Checkmk instance. It is very easy and can be done within minutes. You can run multiple Checkmk instances with different versions on the same system. This gives you the flexibility to test the new version before using it in production.
Ep. 7 (part 1): Working with rules and setting thresholds in Checkmk
In the following three-part videos series, Baris explains rule-based monitoring with Checkmk. In the first part, he shows you how you can work with rules and set threshold values. Rule-based configuration is one of the key features for Checkmk which helps you to scale your monitoring easily within minutes.
Ep. 7 (part 2): Smart rules with Host Tags in Checkmk
In the second part of this video series, Baris explains using smart rules with host tags in Checkmk. These are features that you can use to build your rules even more intelligently and to better organize your monitoring.
Ep. 7 (part 3): Managing Hosts in Folder in Checkmk
In this final part of our episode on Rule-based monitoring in Checkmk, Baris demonstrates how to manage hosts in folders in Checkmk. This helps you to apply your monitoring configurations at scale and organize your hosts according to your needs.
Ep. 8: Working with Host and Service Groups in Checkmk
In this video, Baris demonstrates how to create host and service groups in Checkmk, so you can perform actions on an entire group instead of configuring each of them individually.
Ep. 9: Using the Quicksearch function in Checkmk
In this episode of the Checkmk tutorials, Baris shows how you can use the Quicksearch function in Checkmk. You can use it to easily find and manage certain hosts or services. He also explains some examples of filters to you. In Checkmk 2.0 you can use the same syntax in the Search function found in the monitor menu to get identical results.
Ep. 10: Detecting configuration errors with the Analyze Configuration feature
With the Analyze Configuration feature, you can check if there are any configuration errors in your installation. Checkmk controls a number of possible security risks or potential performance restrictions and indicates if there are any problems.
Ep. 11: View creation and customization in Checkmk
In this video, Baris demonstrates how to customize headers, columns, and more in Views in Checkmk for yourself or other users. He also explains how to create custom views and add desired information to these views.
Ep. 12: Acknowledging problems in Checkmk
In this video, Baris explains how you can acknowledge problems in Checkmk. This function helps you to qualify the states of hosts and services. This allows you to keep track of messages in the main dashboard and, for example, you can add comments to problems.
Ep. 13: Scheduling downtimes in Checkmk
In this episode of our Getting started with Checkmk series, Baris explains how you can manage the maintenance times of your systems in Checkmk. Such scheduled downtimes prevent your monitoring from sending false alarms when a host or service goes to WARN or CRIT during maintenance work. You can also inform the users concerned about the maintenance via Checkmk.
Ep. 14: Distributed monitoring with Checkmk
In this video, Baris explains how you can connect several Checkmk instances to a monitoring system and then manage it.
Ep. 15: MKPs and Plugins in Checkmk
In the 15th episode of our Getting started with Checkmk tutorial series, Baris explains what Checkmk Extension Packages (MKPs) are and how easy it is to integrate them into your Checkmk monitoring environment. MKPs are the preferred format when you make your own extensions as they make it easy to share with other users or deploy in distributed environments.
Ep. 16: Working with 'Bulk Actions' in Checkmk
In this episode of our Checkmk tutorials series, Baris explains how you can save a lot of time with bulk actions. With this feature you can perform various tasks such as deleting, renaming, service discovery etc. on a large number of hosts simultaneously.
Ep. 17: Working with network topologies in Checkmk
In this video of our Getting started with Checkmk series, Baris explains how to map network topologies in Checkmk. This feature is quite helpful to manage your network and prevent any unnecessary notifications from the devices in your network.
Ep. 18: Creating and customizing dashboards in Checkmk
In this video of our Getting started with Checkmk series, Mathias explains how you can create and customize dashboards in Checkmk 2.0, so you can get insights into your monitoring according to your requirements. Find out more in this video.
Ep. 19: Monitoring websites and their certificates with Checkmk
In this episode, Bastian demonstrates how to monitor a website and its certificate with Checkmk. You can also monitor specific web pages with Checkmk by using the several options that will suit your use case. Learn more in this video.
Ep. 20: Configuring dashboard elements in Checkmk
Learn how to add data visualization elements of the various metrics into your Checkmk Dashboard. In this video, Mathias explains how you can configure these elements and create a dashboard as per your requirements.
Ep. 21: Setting up notifications in Checkmk
Learn how to set up notifications in Checkmk and assign relevant contacts and contact groups to be notified for various events. Later in this video, our presenter Bastian also demonstrates how you can set up rule-based notifications according to different conditions for hosts and services.
Ep. 22: Monitoring logfiles with Checkmk
Monitor your logfiles with Checkmk using its Logwatch plugin. It is very useful when you want to monitor your logfiles regardless of whether you are using a UNIX/Linux or a Windows-based system. Learn more in this video.
Ep. 24: 3 Rules for efficient network monitoring
In this video, Bastian demonstrates 3 rules that will help you to efficiently monitor your network interfaces. With Checkmk 2.0, with just three rules, you can set up an efficient network monitoring that will not only monitor all of your network interfaces but also simultaneously provide a detailed overview of all of your ports.
Ep. 25: New UX and security improvements in Checkmk 2.1
Checkmk 2.1 comes with many UX improvements such as pre-built dashboards for Linux and Windows, faster core performance and much more. Security features such as two-factor authentication were also added in this new version. Watch this video to learn how to use these new features and enhancements in Checkmk.
Ep. 28: Working with InfluxDB integration in Checkmk
Learn how to send data to InfluxDB from Checkmk. As InfluxDB introduced a new protocol to send data to it, a new connector was developed with Checkmk to talk natively with it. Learn more about it in this video.
Ep. 29: New agent architecture in Checkmk 2.1
With Checkmk 2.1, the agent architecture was modified to enable performance improvements and add new features such as TLS encryption, data compression, and the reversal of direction of communication from the agent. This will enable push mode and pull mode.
Ep. 30: Clustering the Checkmk appliance
In this video, Robin demonstrates how you can cluster your Checkmk appliance to make it resilient against hardware failures. If you are using the Checkmk hardware appliance, it may be helpful to cluster your appliance to maintain high availability.
Ep. 32: Working with the Agent bakery in Checkmk
In this video, Robin demonstrates how to roll out agent packages with the required configuration for different monitored systems using the agent bakery in Checkmk. The "Automatic agent update" is quite a helpful feature as it pulls the latest configurations for an agent automatically and you don't need to manually update all of your agents deployed on different systems.
Ep. 33: Monitoring Docker containers with Checkmk
Learn how to monitor Docker containers with Checkmk. In this video, Robin demonstrates the process of setting up a rule to configure the Docker plugin and bake an agent with the desired settings for the Docker host.
Ep. 34: Introduction to Checkmk Ansible collection
Last year the Checkmk Ansible collection was created to interact with the Checkmk REST API. In this video, Robin demonstrates how you can use this Ansible collection to automate your monitoring with Checkmk.
Ep. 35: Monitoring SQL databases with Checkmk
In this video, Robin demonstrates how you can configure your Checkmk site to monitor your SQL databases. Although there are many flavours of SQL databases, the process is mostly the same.