With Checkmk you can monitor ESXi hosts and their VMs. On the host, for example, you can query disk I/O, datastore performance, the status of physical network interfaces, various hardware sensors, and much more. Checkmk likewise offers a series of check plug-ins for the VMs. A comprehensive list of these can be found in the Catalog of check plug-ins, in the "VMWare ESX" section.
Using the piggyback technique, VM data is displayed directly at its associated host. The VM-related data is thus found right where it is actually needed, and where it can be compared with the data reported by the VM's OS.
Access to this data is achieved via the HTTP-based vSphere API, not via the normal agent or SNMP. This means that no agent or other software needs to be installed on the ESXi hosts, and that access is very simple to set up. Older systems, from version 4.1 onward, are also supported.
2. Setting up
2.1. Setting up via the ESXi host system
The initial setup for monitoring an ESXi server is very simple and can be completed in less than five minutes. Before you can set up the access, however, the following prerequisites must be satisfied:
- You must have defined a user on the ESXi server. Read access is sufficient for this user.
- You must have defined the ESXi server as a host in Checkmk, and configured it to use the Checkmk agent. Tip: Select the host name so that it is the same as the name known to the server itself.
Once the prerequisites have been satisfied, you can create a Check state of VMWare ESX via vSphere rule. Assign this rule to the defined host, so that the special agent for VMware monitoring is used to retrieve the data instead of the standard agent.
Enter the user's name and password as they have been defined on the ESXi server. The condition for the rule must be set to the host defined in Checkmk. This completes the initial setup, and Checkmk can retrieve the data from the server.
Finally, go back to the host configuration, execute a Service discovery, then activate the changes as usual. If no services are identified, you can search for errors in the configuration with the Diagnostic options described below.
2.2. Setting up using vCenter
If a vCenter is available, you can also retrieve much of your ESXi environment's data through it. This method has advantages and disadvantages:
|Advantages||Disadvantages|
|Simple application in situations where VMs are assigned dynamically using vMotion.||No monitoring if vCenter is unavailable.|
|Monitoring of a cluster's total RAM usage is possible.||No monitoring of hardware-specific data in the cluster's nodes.|
A combination of both methods can also be utilised – then you can have the best of both worlds.
Configuring the vCenter
The prerequisites are similar to those for configuring a single ESXi server:
- A user with read access is present on the vCenter
- The vCenter has been defined as a host in Checkmk and configured to use the Checkmk agent
- If the ESXi servers have already been configured in Checkmk and you wish to combine the monitoring, their names in vCenter must be the same as their host names in Checkmk
As described earlier, create a rule for the VMware monitoring special agent, select the vCenter under Type of Query, and set the condition to the appropriate host as defined in Checkmk.
Retrieving from ESXi-Hosts and vCenter
In order to avoid duplicated data retrieval when using a combination of both configuration methods, the rule for the vCenter can be configured to retrieve only specific data. One possibility is to access the Datastores and the Virtual Machines over the vCenter, and the other data directly from the ESXi-hosts. The license usage can be fetched in both configurations as the vCenter reports an overall status.
If you have already configured the ESXi hosts, adapt their rules accordingly. Here only access to the Host Systems and Performance Counters is offered, since these belong unalterably to a particular ESXi server. The license status is applicable only to the accessed ESXi server.
2.3. Monitoring the VMs
By default, only the status of each VM is created as a service and assigned to the ESXi host or the vCenter respectively. More information is available from these VMs, however: RAM usage or snapshots, for example. This data is retrieved via the ESXi/vCenter and stored as piggyback data.
In order to make this data visible, the VM must be defined as a host in Checkmk. You can of course install the Checkmk agents on the VM and take full advantage of their functions. The piggyback data will simply be added to that already available.
Renaming the piggyback data
The host name in Checkmk must be identical to the VM name so that the data is assigned correctly. By default, the piggyback data's name is the same as the VM's name on the ESXi. If these names do not match, there are several options in Checkmk to make the piggyback names conform. In the configuration rule itself the following are available:
- From Version 1.4.0i1 it is possible to use the host name of the OS on the VM, if this can be accessed via the vSphere-API
- If the VM's name includes blank characters, the name will be truncated after the first blank. Alternatively, the blanks can be replaced with underscores
If the host's name in Checkmk is entirely different, an explicit assignment can be made with the Hostname translation for piggybacked hosts rule.
If the host is configured in Checkmk and the names conform, you can activate the Display VM power state on check box in the configuration rule and select whether and where the data is to be made available. Select The Virtual Machine here.
With a service discovery on the host(s) the new services will now be identified and can be activated. Be aware that the information from these services can differ: the ESXi server, for example, sees a virtual machine's RAM usage differently from how the machine's own OS reports it.
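The two options for VM names containing blanks can be illustrated with a minimal sketch. This is not Checkmk's implementation; the function name `piggyback_name` and the mode names `"cut"` and `"underscore"` are hypothetical and only mirror the behaviour described in the list above.

```python
def piggyback_name(vm_name: str, spaces: str = "cut") -> str:
    """Derive a piggyback host name from a VM name (illustrative only).

    spaces="cut":        keep only the part before the first blank
    spaces="underscore": replace every blank with an underscore
    """
    if spaces == "cut":
        # Everything from the first blank onward is dropped.
        return vm_name.split(" ", 1)[0]
    if spaces == "underscore":
        return vm_name.replace(" ", "_")
    raise ValueError(f"unknown mode: {spaces!r}")


if __name__ == "__main__":
    for mode in ("cut", "underscore"):
        print(mode, "->", piggyback_name("my VM 123", mode))
```

Note that truncating after the first blank can map several VMs to the same piggyback name, which is one reason the underscore variant or an explicit translation may be preferable.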
3. Diagnostic options
When searching for the source of an error there are a number of 'ports of call'. Since the data comes from the ESXi/vCenter server, this is a logical place to start searching for the error. After that, the data must reach the Checkmk server and be correctly processed and displayed there.
For problems with an ESXi-/vCenter-Server configuration:
With the curl command you can verify whether the server is accessible to the monitoring:
OMD[mysite]:~$ curl -Ik https://myESXhost.my-domain.net
HTTP/1.1 200 OK
Date: Fri, 4 Nov 2016 14:29:31 GMT
Connection: Keep-Alive
Content-Type: text/html
X-Frame-Options: DENY
Content-Length: 5426
Whether the access data has been entered correctly, and whether Checkmk can access the host, can be tested on the console with the special agent. Use the --help/-h options to receive a complete list of the available options. In the example, grep limits the output to a specific section and the first four lines following it. You can omit this part in order to receive the complete output, or filter for another section:
OMD[mysite]:~$ share/check_mk/agents/special/agent_vsphere --debug --user myesxuser --secret myesxpassword -D myESXhost | grep -A4 esx_vsphere_objects
<<<esx_vsphere_objects:sep(9)>>>
hostsystem myESXhost poweredOn
hostsystem myESXhost2 poweredOn
virtualmachine myVM123 myESXhost poweredOn
virtualmachine myVM126 myESXhost poweredOn
Whether Checkmk can access the host can be verified on the console. Here the output is also limited to five lines:
OMD[mysite]:~$ cmk -d myESXhost | grep -A4 esx_vsphere_objects
<<<esx_vsphere_objects:sep(9)>>>
hostsystem myESXhost poweredOn
hostsystem myESXhost2 poweredOn
virtualmachine myVM123 myESXhost poweredOn
virtualmachine myVM126 myESXhost poweredOn
Alternatively, the test can also be performed in WATO.
If everything works up to this point, the output should have been saved to a temporary directory. Whether such a file has been produced, and whether its content is correct, can be determined with the following:
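When the output of the special agent or of cmk -d has been saved to a file, the esx_vsphere_objects section can also be evaluated programmatically. The following is a minimal sketch, not part of Checkmk. It assumes that the columns are separated by tabs, as the sep(9) option in the section header indicates, and that host system lines may carry an empty column where VM lines name their ESXi host. The names `VSphereObject` and `parse_vsphere_objects` are hypothetical.

```python
from typing import NamedTuple


class VSphereObject(NamedTuple):
    kind: str      # "hostsystem" or "virtualmachine"
    name: str
    esx_host: str  # ESXi host running the VM; empty for host systems
    state: str     # for example "poweredOn"


def parse_vsphere_objects(agent_output: str) -> list[VSphereObject]:
    """Extract the rows of the esx_vsphere_objects section."""
    objects = []
    in_section = False
    for line in agent_output.splitlines():
        if line.startswith("<<<"):
            # Any header line starts or ends the section of interest.
            in_section = line.startswith("<<<esx_vsphere_objects")
            continue
        if not in_section or not line.strip():
            continue
        fields = line.split("\t")  # sep(9) means ASCII 9, the tab character
        if len(fields) < 3:
            continue  # not a row this sketch understands
        esx_host = fields[2] if len(fields) >= 4 else ""
        objects.append(VSphereObject(fields[0], fields[1], esx_host, fields[-1]))
    return objects


if __name__ == "__main__":
    sample = (
        "<<<esx_vsphere_objects:sep(9)>>>\n"
        "hostsystem\tmyESXhost\t\tpoweredOn\n"
        "virtualmachine\tmyVM123\tmyESXhost\tpoweredOn\n"
    )
    for obj in parse_vsphere_objects(sample):
        print(obj)
```

A list like this makes it easy to spot, for example, VMs that are not powered on before looking for their missing services.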
OMD[mysite]:~$ ll tmp/check_mk/cache/myESXhost
-rw-r--r-- 1 mysite mysite 17703 Nov 4 15:42 myESXhost
OMD[mysite]:~$ head -n5 tmp/check_mk/cache/myESXhost
<<<esx_systeminfo>>>
Version: 6.0
AgentOS: VMware ESXi
<<<esx_systeminfo>>>
vendor VMware, Inc.
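To judge quickly whether a cached file has plausible content, it can help to list which sections it contains and how many lines each one holds. The following is a rough sketch under stated assumptions: the function name `count_section_lines` is hypothetical, section headers are taken to have the form `<<<name>>>` or `<<<name:options>>>`, and everything between a `<<<<hostname>>>>` header and the closing `<<<<>>>>` is treated as piggyback data and skipped.

```python
import re
from collections import Counter

# Matches "<<<name>>>" and "<<<name:options>>>", but not "<<<<host>>>>".
SECTION_RE = re.compile(r"^<<<([^<>:]+)(:[^<>]*)?>>>$")


def count_section_lines(agent_output: str) -> Counter:
    """Count the data lines per section of the host's own agent output."""
    counts: Counter = Counter()
    current = None
    in_piggyback = False
    for raw in agent_output.splitlines():
        line = raw.strip()
        if line.startswith("<<<<"):
            # "<<<<>>>>" ends a piggyback block, any other name starts one.
            in_piggyback = line != "<<<<>>>>"
            current = None
            continue
        if in_piggyback:
            continue
        match = SECTION_RE.match(line)
        if match:
            current = match.group(1)
            counts[current] += 0  # register sections that have no data lines
            continue
        if current is not None and line:
            counts[current] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (path is an example): python3 sections.py tmp/check_mk/cache/myESXhost
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for name, lines in sorted(count_section_lines(handle.read()).items()):
            print(f"{name}: {lines}")
```

A section that is missing entirely, or present with zero lines, is a hint to check the data retrieval settings in the configuration rule.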
Problems with piggyback data:
Checkmk creates a directory for each host, containing a text file. This text file holds the data that is to be assigned to that host.
OMD[mysite]:~$ ll tmp/check_mk/piggyback/
total 0
drwxr-xr-x 2 mysite mysite 60 Nov 4 15:51 myVM123/
drwxr-xr-x 2 mysite mysite 60 Nov 4 15:51 myVM124/
drwxr-xr-x 2 mysite mysite 60 Nov 4 15:51 myVM126/
drwxr-xr-x 2 mysite mysite 60 Nov 4 15:51 myESXhost2/
OMD[mysite]:~$ ll tmp/check_mk/piggyback/myVM123/
-rw-r--r-- 1 mysite mysite 1050 Nov 4 15:51 myESXhost
If these directories or files are absent, the special agent has not created them. You can check whether the VM's data is included in the agent's output. If it is not, look in the configuration rule for the ESXi/vCenter host to see whether the data retrieval has been activated.
OMD[mysite]:~$ grep "<<<<myVM123>>>>" tmp/check_mk/cache/myESXhost
<<<<myVM123>>>>
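Instead of searching for one VM at a time, all piggyback headers in a cached agent output can be collected in one pass. The following is a minimal sketch; the function name `piggyback_hosts` is hypothetical. It relies on the header form `<<<<hostname>>>>` shown in the grep example, and it ignores the empty `<<<<>>>>` marker that closes a piggyback block.

```python
import re

# Four angle brackets with a non-empty name in between.
PIGGYBACK_RE = re.compile(r"^<<<<(.+)>>>>$")


def piggyback_hosts(agent_output: str) -> list[str]:
    """Return the piggyback host names in order of first appearance."""
    hosts = []
    for line in agent_output.splitlines():
        match = PIGGYBACK_RE.match(line.strip())
        if match and match.group(1) not in hosts:
            hosts.append(match.group(1))
    return hosts


if __name__ == "__main__":
    import sys

    # Usage (path is an example): python3 piggy.py tmp/check_mk/cache/myESXhost
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        print("\n".join(piggyback_hosts(handle.read())))
```

If a VM is missing from this list, the special agent did not deliver any data for it, so the cause lies in the rule configuration rather than in the name assignment.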
In the case of a very large number of such directories for piggyback data it can be very difficult to find those that have no allocation to a host. Here we provide a script with which unassigned piggyback hosts can easily be found:
OMD[mysite]:~$ share/doc/check_mk/treasures/find_piggy_orphans
myESXhost2
From the script output it can be seen that Checkmk cannot find a host with the same name to which it can assign the data. The piggyback names can, however, be altered in a number of ways, as described under Renaming the piggyback data.
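The idea behind such an orphan search can be sketched in a few lines. This is not the find_piggy_orphans script itself, whose internals are not shown here. The function name `find_piggyback_orphans` is hypothetical, and the set of host names known to Checkmk is assumed to be supplied by the caller.

```python
from pathlib import Path


def find_piggyback_orphans(piggyback_dir, known_hosts):
    """Return piggyback directory names with no host of the same name.

    piggyback_dir: directory with one subdirectory per piggyback host,
                   for example tmp/check_mk/piggyback in the site directory
    known_hosts:   collection of host names configured in Checkmk
                   (assumed to be provided by the caller)
    """
    root = Path(piggyback_dir)
    if not root.is_dir():
        return []
    known = set(known_hosts)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and entry.name not in known
    )


if __name__ == "__main__":
    # The host names below are taken from the examples in this article.
    hosts = {"myESXhost", "myVM123", "myVM124", "myVM126"}
    for name in find_piggyback_orphans("tmp/check_mk/piggyback", hosts):
        print(name)
```

Each name returned needs either a host of the same name in Checkmk or a translation rule that maps it to an existing host.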
4. Files and directories
|File path||Function|
|tmp/check_mk/piggyback/||Checkmk saves the piggyback data here. For each host a subfolder is created with the host's name. This subfolder contains a text file with the host's data. The filename is the name of the host providing the data.|
|tmp/check_mk/cache/||Here the most recent agent output from each host is temporarily saved. The content of a host's file is identical to the output of the cmk -d myhost command.|
|share/check_mk/agents/special/agent_vsphere||The special agent for querying ESXi and vCenter servers. This script can also be executed manually for testing purposes.|
|share/doc/check_mk/treasures/find_piggy_orphans||A script for finding piggyback data that is not assigned to a host.|