Werk #8275: Alert Handlers - execute automatic actions upon state changes of hosts and services

Component The Checkmk Micro Core
Title Alert Handlers - execute automatic actions upon state changes of hosts and services
Date Jun 16, 2015
Checkmk Edition Checkmk Enterprise (CEE)
Checkmk Version 1.2.7i2
Level Major Change
Class New Feature
Compatibility Compatible - no manual interaction needed

Check_MK now supports automatic actions (scripts) to be executed upon the state change of a host or service. This is similar to Nagios "Event Handlers" but has a much more flexible configuration and other advantages.

At the beginning there is a state change of a host or service. It does not matter whether this change is "soft" - because the maximum number of check attempts has not been reached. It simply matters that the state has changed from one of OK/WARN/CRIT/UNKNOWN to another.

Whenever this happens a new global rule chain of Alert Handler Rules is being processed. Each rule that matches calls an external script of your choice. Most times people want to restart services, trigger garbage collections of Java machines or do similar stuff.

Please note that some folks insist that monitoring should not try to repair things or by any other means actively change things. Whether you share this opinion or not is your own decision. Alert handlers do not limit you in what you exactly do with them. But you have been warned.

When you compare alert handlers with the Rule Based Notifications (RBN) then here are some important differences:

LI:Notifications are being suppressed during downtimes, outside of the notification period, when the host is down and other situations. Alerts are never suppressed. LI:Notifications cannot be triggered at soft state changes. LI:Alert handlers only work with the Check_MK Micro Core. If you need Nagios or Icinga please use the Event Handlers of those cores. LI:Alert handling rules do not allow cancelling. All matching rules are being executed. LI:As long as no alert rule is defined the alert handling mechanism is deactivated in the core.

Note: this is just the first implementation of the Alert Handlers. Next steps will the introduction of error tracking, notifications tied to alert actions, even more flexible conditions, a system for secure remote execution and much more. Stay tuned!

Setting up Alert handlers

For setting up alert handlers you first need to create a script that should be called. This can be written in any programming language - most people will use a simple BASH script. It must be executable and be installed in ~/local/share/check_mk/alert_handlers and made executable.

The script is provided with all information about the alert with environment variable that being with ALERT_ - very similar to a notification script. A good start for testing is the following script:

/local/share/check_mk/alert_handlers/foo

#!/bin/bash
env | grep ALERT_ | sort > /tmp/alert.out

This will dump all the variable of the alert into the file /tmp/alert.out. When specifying this script in the alert handler rule - simply write foo here.

Useful for debugging is to set Alert handler log level to Full dump of all variables and Logging of the alert processing to on in the global settings. You will find information it ~/var/log/cmc.log and ~/var/log/alerts.log.

To the list of all Werks