Checkmk Conference #6 goes digital. Get your tickets here!
1. Why own checks?
Checkmk already monitors many types of relevant data using a large number of its own standard check plug-ins. Nevertheless, every IT environment is unique, so that often very specialised requirements can arise. With local-checks you have a facility for quickly and easily creating your own services suitable for meeting these requirements.
These local-check plug-ins differ in one significant aspect from other checks: the calculation of a status occurs directly in the host on which the data is also retrieved. In this way the complex creation of checks with plug-ins in Python is not needed and there is thus a completely free choice of coding language for scripts.
2. Writing your own simple scripts
2.1. Preparing the script
A local check can be written in any programming language supported by the target host. The script must be constructed so that each check produces a status line consisting of four parts. Here is an example:
0 myservice myvalue=73;80;90 My output text which may contain spaces
The four parts are separated by blanks and have the following meanings:
|1.||Status||The status of a service is given as a number: 0 for OK, 1 for WARN, 2 for CRIT and 3 for UNKNOWN. As an alternative it is also possible to compute the state dynamically.|
|2.||Service name||The service name as shown in Checkmk. This may not contain blanks.|
|3.||Metrics||Performance values for the data. More information about the construction can be found later below. Alternatively a minus sign can be coded if the check produces no metrics.|
|4.||Status details||Details for the status as they will be shown in Checkmk. This part can also contain blanks.|
There must always be a blank character between the individual parts of the output and the first text of the status detail. Everything following will then count as status details, which is why blank characters are allowed.
If there is uncertainty about a possible output, it can be simply tested by writing a small script with the echo command. Insert the output to be tested into the echo command:
#!/bin/sh echo "0 myservice - OK: This is my custom output"
For Windows hosts such a script will look very similar to this:
@echo off echo 0 myservice - OK: This is my custom output
By the way – you can write any number of outputs in a script. Each output line will have its own service created. How it can be checked whether the local script will be correctly invoked by agents can be seen in the Error analysis.
2.2. Distributing scripts
Once the script has been written it can be distributed to the appropriate hosts. The datapath used will depend on the operating system. A list of paths can be found below.
Don't forget to make the script executable on Unix-based systems. The path shown in this example is for Linux:
root@linux# chmod +x /usr/lib/check_mk_agent/local/mylocalcheck
2.3. Adding the service to the monitoring
At every invocation of the Checkmk agent the local check will also be executed and appended to the agent's output. The Service Discovery also functions automatically like with other services:
Once the changes have been activated, the implementation of the self-created service with the aid of a local check will be complete. Should a problem arise during the Service Discovery, the Error diagnosis covered later below can be of help.
3. Extended functions
3.1. Using metrics
With a simple local script metrics can also be transferred. The syntax is as follows:
The values are separated with a semicolon. If a value is not required just leave the field blank:
Please note that the values for min and max in the
In principle, values up to the 'value' data itself are optional and can be omitted.
Multiple metrics can be produced by a local check – in such a case they are separated by a | (pipe):
A complete output with multiple metrics will look like this:
root@linux# /usr/lib/check_mk_agent/local/mycustomscript 0 myservice count1=42|count2=21;23;27|count3=73 OK - This is my custom output
The graphs will subsequently be generated automatically in Checkmk:
3.2. Multi-line outputs
The option to spread an output over multiple lines is also available. Because Checkmk runs under Linux you can work with the Escape Sequence '\n' in order to force a line-break. Even if due to the scripting language the backslash itself needs to be masked, it will be correctly interpreted by Checkmk:
root@linux# /usr/lib/check_mk_agent/local/mycustomscript 2 myservice - CRIT - This is my custom output\\nThis is some detailed information\\nAnd another line with details
In the service's details this additional line will be visible:
3.3. Caching outputs
Local checks can be cached like normal plug-ins. This can be necessary if a script has a longer processing time. They will then only be executed according to a defined interval, buffered, and then this cache will be appended to the agent's output. By the way, under Linux or another Unix-based system every cached plug-in can be executed asynchronously. For this create a subdirectory whose name matches the number of seconds the local check's output is to be cached. In the following example the local check will be executed only every 10 minutes (600 seconds):
root@linux# /usr/lib/check_mk_agent/local/600/mylocalcheck 1 myservice count=4 WARN - Some output of a long time running script
Under Windows a local check will be handled exactly like other plug-ins: Enter the cache_age for the local check into the check_mk.ini:
[local] cache_age mylocalcheck = 3600
Alternatively, under Windows the caching can also be configured in the Agent Bakery.
3.4. Calculating status dynamically
As seen above, with metrics the thresholds can also be displayed in the graphs. Could these thresholds also be used for a dynamic calculation of service status? Checkmk provides exactly these options for extending a local check.
If instead of a number, the letter ‘P’ is passed, the service's status will be calculated on the basis of the threshold as provided. An output will then look like this:
root@linux# /usr/lib/check_mk_agent/local/mycustomscript P myservice count=40;30;50 Result is computed from two values P myservice2 - Result is computed with no values
The output in Checkmk differs in two points from the output that we saw earlier:
- The individual metrics, as they are seen in the views will be appended to the output, separated by commas. Thus it can always be seen if a status has had a value calculated for it.
- If no metrics have been passed the service's status will always be OK.
Here's how the output from the example shown above looks in a service view:
Upper and lower thresholds
Some data has not only an upper threshold but also a lower threshold. An example is humidity. For such cases the local check has the option of providing two WARN/CRIT values – these are separated by a colon and represent the upper and lower thresholds:
4. Distribution via the Agent Bakery
If you want to distribute a local check to multiple hosts, and you already use the Agent Bakery, the bakery can also be used for distributing scripts. As the instance user create the directory custom on the Checkmk-Server in ~/local/share/check_mk/agents/. For every group of local checks a subdirectory will be created in this directory:
OMD[mysite]:~$ cd ~/local/share/check_mk/agents OMD[mysite]:~/local/share/check_mk/agents$ mkdir -p custom/mycustomgroup/lib/local/
The lib-directory flags the script as a plug-in or as a local check. The following directory then allocates the file explicitly. You can then also save the local check in this.
Important: You can use the asynchronous execution in Linux as you know this from the regular plugins. In Windows the configuration is made as always in the check_mk.ini.
Thereafter mycustomgroup will be shown as an option in WATO. Using the Host & Service Parameters ➳ Monitoring Agents ➳ Generic Options ➳ Deploy custom files with agent in WATO create a new rule and select the newly-created group:
Checkmk will then autonomously integrate the local check correctly into the installation packet for the appropriate operating system. After the changes have been activated and the agent baked, the configuration will be complete. Now the agents only need to be distributed.
5. Error analyses
5.1. Testing a script
If you run into problems with a self-written script, the following potential error sources can be checked:
- Is the script executable, and are the access permissions correct?
This is especially relevant if you are running the agent or script and you are not the root/system user.
5.2. Testing the agent's output
On the target host
If the script itself is correct, the agent can be run on the host. With Unix-based operating systems such as Linux, BSD, etc., the command below is available. With the -A option the number of additional lines to be displayed following a success can be specified. This number can be customised to suit the number of expected outputs:
root@linux# check_mk_agent | grep -v grep | grep -A 3 "<<<local>>>" <<<local>>> 0 myservice count1=42|count2=21;23;27|count3=73 OK - This is my custom output P myservice2 - Result is computed with no values P myservice3 humidity=27;40:60;30:70 Result has upper and lower thresholds
Under Windows the output can be diverted to a text file, in which the expected outputs can likewise be searched for in the local-section using Notepad, for example. As appropriate, substitute the path shown below for the installation path used for your own Checkmk installation:
C:\Program Files (x86)\check_mk\> check_mk_agent.exe test > out.txt
On the Checkmk server
As a last step the the processing of the script output can also be tested on the Checkmk server. Once for the service discovery:
OMD[mysite]:~$ cmk -IIv --debug --checks=local myserver123 Discovering services on myserver123: myserver123: 3 local
And also the processing of the service output with a similar command:
OMD[mysite]:~$ cmk -nv --debug --checks=local myserver123 Check_MK version 1.4.0p15 myservice OK - This is my custom output myservice2 OK - Result is computed with no values myservice3 CRIT - Result has upper and lower thresholds, humidity 27.0 < 30.0 (!!)
If there are errors in a local check, Checkmk will identify them in a service output. This applies as well for erroneus metrics, for false or incomplete information in the script output, or an invalid status. These error messages should aid in quickly identifying errors in a script.
6. Files and directories
6.1. Script directories on the host
|/usr/lib/check_mk_agent/local/||Linux, Solaris, OpenBSD and OpenWRT|
|%PROGRAMFILES(X86)%\check_mk\local||Windows (Agent until version 1.5.0)|
|%PROGRAMFDATA%\check_mk\local||Windows (Agent starting with version 1.6.0)|
6.2. Cache directories on the host
|/var/lib/check_mk_agent/cache/||Linux and Solaris|