In recent years the container concept has taken the IT world by storm, which has naturally raised questions about how to monitor such containers. From version 1.5.0 Checkmk can monitor Docker containers directly via the Linux agent. Checkmk monitors not only the general status of the daemon and the containers, but also the containers themselves. A full list of the elements that can currently be monitored can be found in the Catalogue of Check Plug-ins.
Alongside the status and inventory information that Checkmk can determine via the node (Docker jargon for 'the host on which the containers are running'), Checkmk can also determine detailed status information for the containers. For this, every container that is to be monitored has to be added as a separate host in Checkmk. Its data is piggybacked from the node to this host.
From 1.6.0 of the
2.1. Installation of agent and plug-ins
To be able to monitor a Docker node with Checkmk, it must first be monitored with the normal Linux agent. As usual, this gives you basic monitoring of the host system, but no information about the Docker daemon or the containers.
You will need the following Agent-Plug-ins:
- In version 1.5.0 the two plug-ins mk_docker_node and mk_docker_container_piggybacked
- From version 1.6.0 only the mk_docker.py plug-in is needed
Install the plug-in or plug-ins as usual to /usr/lib/check_mk_agent/plugins.
- In version 1.5.0 the Docker node and Piggybacked Docker containers rule sets
- From version 1.6.0 the Docker node and containers rule set
Please note, starting with 1.6.0 the docker Python library is required (not docker-py). At least version 2.0.0 is necessary (you can easily check this by entering python on the command line):
root@linux# python
Python 2.7.16 (default, Sep 24 2019, 22:49:21)
[GCC 8.2.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import docker
>>> docker.version
'4.0.2'
If required you can install the library with pip:
root@linux# pip install docker
Attention: The packages docker-py and python-docker-py must not be installed. They provide an outdated and incompatible version of the Docker library under the same namespace. If docker-py (or both variants) has been installed, a single uninstall is not enough because pip cannot repair the namespace. To ensure that the correct version is installed, execute the following commands in this case:
root@linux# pip uninstall docker-py docker
root@linux# pip install docker
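The version requirement described above can also be checked non-interactively. The following is only a minimal sketch, assuming Python 3 and that the library exposes its version as a string (docker.version in the session shown earlier, docker.__version__ in other releases). The helper names are hypothetical and are not part of Checkmk or of the Docker library.

```python
import re

MINIMUM_VERSION = (2, 0, 0)  # minimum library version stated in the text


def parse_version(version_string):
    """Turn a version string such as '4.0.2' into a tuple of three integers."""
    # Only the numeric part before any '-' suffix is considered.
    parts = [int(n) for n in re.findall(r"\d+", version_string.split("-")[0])]
    # Pad short versions such as '2.0' so that tuple comparison works.
    return tuple((parts + [0, 0, 0])[:3])


def is_supported(version_string, minimum=MINIMUM_VERSION):
    """Return True if the given library version meets the minimum."""
    return parse_version(version_string) >= minimum


def installed_docker_version():
    """Return the installed docker library version, or None if unavailable."""
    try:
        import docker
    except ImportError:
        return None
    # Assumption: one of these attributes holds the version as a string.
    for attribute in ("__version__", "version"):
        value = getattr(docker, attribute, None)
        if isinstance(value, str):
            return value
    return None


if __name__ == "__main__":
    found = installed_docker_version()
    if found is None:
        print("docker library not found")
    elif is_supported(found):
        print("docker library %s is new enough" % found)
    else:
        print("docker library %s is too old" % found)
```

Note that this check only looks at the version number. It cannot tell whether the namespace has been damaged by a previously installed docker-py, so the reinstall procedure above still applies in that case.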
If you now perform service discovery in WATO and activate the changes, you should find some new services that affect the Docker node itself (here from version 1.6.0):
2.2. Fine-tuning the plug-in
As of 1.6.0 you can configure various parameters of the plug-in. For example, you can save resources by deactivating unnecessary sections or, if required, customize the Docker Engine API endpoint (the default is the Unix socket unix://var/run/docker.sock).
As usual, create the configuration file /etc/check_mk/docker.cfg. A template with detailed explanations can be found in the Checkmk directory share/check_mk/agents/cfg_examples/docker.cfg.
2.3. Monitoring the container
Creating the container hosts
Of course the interesting aspect is the monitoring of the Docker containers. This happens automatically once the plug-ins are installed. However, the services are not assigned to the Docker node; instead Checkmk assumes a separate host for each Docker container.
The mechanism used here is called piggyback: the plug-in or special agent carries the data of other hosts along with its own output, 'piggybacked' so to speak. Checkmk places this data in the tmp/check_mk/piggyback directory. All you have to do in WATO is create hosts with the correct names, and the services will then be assigned to them automatically.
From version 1.6.0 of the
- The host name must exactly match the directory created in tmp/check_mk/piggyback. By default, this is the 12-digit short ID of the container (for example, 2ed23056480f)
- If the containers do not have their own IP addresses (which is usually the case), set IP-Address-Family to No IP.
- For Data sources be sure to set Check_MK Agent to No agent.
- You can set the Parent field to the host name of the Docker node.
- It is also important that the Docker node and its container are monitored from the same Checkmk instance.
Once the container hosts have been created, and after performing a service discovery, new services appear on these.
If you have a Linux Agent installed in the container, it will be executed automatically. However since many services monitored by the agent within the containers actually show information from the node (for example, CPU load, temperature and many other operating system parameters), these were removed with version 1.6.0.
Alternative names for container hosts
By default – as mentioned above – the 12-digit short ID for the container is used as the name for the container host.
This can optionally be configured differently. To do this, in the configuration file set the container_id option to long in order to use the complete container ID as the name, or to name in order to use the container name.
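The effect of the three naming variants can be illustrated with a short sketch. It is not the plug-in's own code: the function name is hypothetical, the value short for the default behaviour is an assumption (the text only names long and name), and the container ID in the usage example is made up.

```python
def container_host_name(full_id, name, container_id_mode="short"):
    """Return the host name expected in Checkmk for one container.

    full_id is the complete container ID, name is the container name.
    The mode values mirror the container_id option; 'short' is an
    assumed label for the default 12-character short ID.
    """
    if container_id_mode == "short":
        return full_id[:12]
    if container_id_mode == "long":
        return full_id
    if container_id_mode == "name":
        return name
    raise ValueError("unknown container_id mode: %r" % container_id_mode)


if __name__ == "__main__":
    # Made-up example ID that starts with the short ID used in the text.
    example_id = "2ed23056480f" + "0" * 52
    for mode in ("short", "long", "name"):
        print(mode, "->", container_host_name(example_id, "webserver", mode))
```

Whichever variant is chosen, the host created in WATO has to carry exactly the resulting name, otherwise the piggyback data is not assigned to it.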
Additionally, with the rule set Access to agents ➳ General settings ➳ Hostname translation for piggybacked hosts you can define fairly flexible rules that convert the host names contained in the piggyback data into better host names for Checkmk. With this method you can also solve the problem of having containers with the same name on two different Docker nodes, for example. Using appropriate translation rules you could then, e.g., add a prefix to the names to make them unique.
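The result such a prefix rule aims for can be sketched as follows. This only illustrates the naming outcome, not how Checkmk evaluates translation rules; the function name, the separator and the node names are hypothetical.

```python
def add_node_prefix(node_name, container_names, separator="-"):
    """Map each container name to a host name prefixed with its node."""
    return {name: node_name + separator + name for name in container_names}


if __name__ == "__main__":
    # Two hypothetical nodes that both run a container called 'web'.
    first = add_node_prefix("dockernode1", ["web", "db"])
    second = add_node_prefix("dockernode2", ["web"])
    # Without the prefix both containers would map to the same host 'web'.
    print(first["web"], second["web"])
```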
Monitoring the host's status
Since a container's host status cannot really be verified using TCP packets or ICMP, it must be determined in another way. The Docker container status service is suited to this: it always checks whether the container is running, and can therefore be used as a reliable means of detecting the host's status. Define a rule in the Host Check Command rule set for this purpose, and set the Use the status of the service option to the service mentioned. Don't forget to set the conditions so that only containers are affected. In our example all containers are located in a folder with the same name:
Operating the agent directly in the container
To monitor details inside the container itself (e.g., running processes, databases, log files, etc.), the Checkmk agent must be executed in the container itself. This is especially true for the rollout of agent plug-ins. If you do not have an agent installed in the container, then up to version 1.5.0 of Checkmk the agent installed on the node automatically executes an agent in the container as soon as you monitor the node with Checkmk.
Since this method did not perform well, from version 1.6.0 the normal Checkmk agent must be installed directly in the container to get more detailed monitoring of the container. The three plug-ins mem, cpu and diskstat (Disk I/O) work without an agent in the container and are calculated by the Checkmk agent on the node itself.
Especially for self-created Docker images you might want to roll out the agent itself into the container. In this case the data is no longer calculated by the agent of the Docker node, as described above. Instead, a separate agent runs in each container. The call to this agent is, however, still bundled in a piggyback procedure via the Docker node.
However, the agent installed in the container only works if all necessary commands are also present in the container. Especially with minimally built containers based on Alpine Linux, basic tools such as bash may well be missing. In such a situation you should monitor the container from the Docker node.
3. Diagnostic options
3.1. Diagnosis of a Docker node
Should the setup not be successful, there are a number of options for analysing the problem. The Checkmk-Agent supports Docker monitoring from Version 1.5.0. Verify therefore that an agent with at least this or a later version is installed on the host.
If the version of the agent on the host is high enough, next check if the data is present in the output. The output can be downloaded as text data using the Download agent output option of the Host Dropdown menu in the GUI:
Alternatively, you could search the agent cache directly. For clarity the output in the following example is abbreviated to the output for the node:
OMD[mysite]:~$ strings tmp/check_mk/cache/mydockerhost | grep "<<<docker"
<<<docker_node_info>>>
<<<docker_node_disk_usage:sep(44)>>>
<<<docker_node_images>>>
<<<docker_node_network:sep(0)>>>
If the sections are not shown here, the Docker installation has not been recognised. In version 1.5.0 the following command is used for the Docker node info service. This command must be executable in exactly this form on the host system. If necessary, check your Docker installation:
root@linux# docker info 2>&1
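The search for Docker sections in the agent output can also be scripted, for instance to check several hosts in one go. The following is a minimal sketch, assuming Python 3 and the section header format visible in the example output above. The function names are hypothetical and are not part of Checkmk.

```python
import re

# A section header looks like <<<name>>> or <<<name:option(...)>>>.
SECTION_HEADER = re.compile(r"^<<<(docker[^:>]*)(?::[^>]*)?>>>$")


def docker_sections(agent_output):
    """Return the names of all docker sections found in agent output text."""
    names = []
    for line in agent_output.splitlines():
        match = SECTION_HEADER.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


def docker_sections_in_file(path):
    """Read a cached agent output file and return its docker section names."""
    # Undecodable bytes are replaced, similar in spirit to using 'strings'.
    with open(path, encoding="utf-8", errors="replace") as handle:
        return docker_sections(handle.read())


if __name__ == "__main__":
    sample = "<<<check_mk>>>\n<<<docker_node_info>>>\n<<<docker_node_network:sep(0)>>>\n"
    print(docker_sections(sample))
```

An empty result for a node corresponds to the case described above, in which no Docker sections are delivered by the agent.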
3.2. Diagnosis for a container host
If the container host receives no data, or no services are detected, first check whether piggyback data is available for this host. The host's name must be identical to the ID of the container. Alternatively, you can perform a mapping manually using the Hostname translation for piggybacked hosts rule set. Here, however, only the Explicit hostname mapping option is available:
To verify whether piggyback data will be created for an ID, you can check the following directory:
OMD[mysite]:~$ ls -l tmp/check_mk/piggyback/
76adfc5a7794
f0bced2c8c96
bf9b3b853834
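The same check can be scripted to compare the container hosts you have created with the piggyback data that is actually present. The sketch below assumes Python 3 and the directory layout described in the files and directories table (one subfolder per piggybacked host, containing one file per host that supplied the data). The function names are hypothetical.

```python
import os


def piggyback_sources(piggyback_dir):
    """Map each piggybacked host name to the hosts that supplied its data."""
    sources = {}
    for host_name in sorted(os.listdir(piggyback_dir)):
        host_dir = os.path.join(piggyback_dir, host_name)
        if os.path.isdir(host_dir):
            # Each file in the subfolder is named after the supplying host.
            sources[host_name] = sorted(os.listdir(host_dir))
    return sources


def hosts_without_piggyback_data(piggyback_dir, container_hosts):
    """Return the container hosts for which no piggyback data exists."""
    available = piggyback_sources(piggyback_dir)
    return [host for host in container_hosts if not available.get(host)]


if __name__ == "__main__":
    # Run as the site user from the site directory; the path is relative.
    directory = "tmp/check_mk/piggyback"
    if os.path.isdir(directory):
        for host, suppliers in piggyback_sources(directory).items():
            print(host, "<-", ", ".join(suppliers))
```

A host name returned by hosts_without_piggyback_data usually points to a name mismatch between the host in WATO and the directory created for the container.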
4. Host labels
From version 1.6.0 of Checkmk there are so-called host labels. The redesigned Docker monitoring automatically sets the three labels cmk/docker_image, cmk/docker_image_name and cmk/docker_image_version. You can use these labels, e.g. in conditions for your rules, to make your monitoring configuration dependent on the image used in a container.
5. Files and directories
|tmp/check_mk/piggyback/||WATO stores the piggyback data here. For each host a subfolder with the host's name will be generated. This contains a text file with the host's data. The filename is the host that supplied the data.|
|tmp/check_mk/cache/||The most recent agent output from all hosts is saved here temporarily. The contents of a host's file are identical to the output of the cmk -d myserver123 command.|