You can find the new documentation here; it is gradually replacing this one.
This article is therefore obsolete and may no longer be valid. However, its replacement is not finished yet!
1. How to handle errors in checks
How a program handles unexpected situations is a crucial aspect of good software. Error handling should not only make sure that all errors are handled correctly, but also keep the actual code as simple and readable as possible.
For that reason, Checkmk makes use of Python exceptions. The general rule is:
- No error handling is done in the checks.
Or, to put it another way:
- The check should assume that the agent is always sending valid data.
For example, if the agent is expected to send the three values name, size and usage in each line of output, the check should not try to validate this but simply do:
name, size, usage = line
If the agent now sends an invalid line with too few or too many values, Python raises an exception. Checkmk handles that exception in a sensible way: it makes the check return UNKNOWN, adds a corresponding error message to the check output, and logs a detailed error message (if debug_log is set to a filename in main.mk).
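The following minimal sketch illustrates what happens at the Python level when a line has the wrong number of values. The function name parse_line and the sample data are hypothetical and not part of Checkmk. The try/except is only there to make the exception visible; in a real check you would leave the exception uncaught so that Checkmk can handle it.

```python
def parse_line(line):
    # Hypothetical helper: unpack without any validation, as the rule recommends.
    name, size, usage = line
    return name, int(size), int(usage)


good_line = ["fs_root", "1024", "512"]
bad_line = ["fs_root", "1024"]  # one value is missing

print(parse_line(good_line))

try:
    parse_line(bad_line)
except ValueError as exc:
    # Shown only for demonstration; a real check would not catch this.
    print("unpacking failed:", exc)
```

Python raises a ValueError for too few as well as for too many values, so both kinds of broken lines surface in the same way.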
Please note: If a check expects the agent to send broken output in some cases and can nevertheless execute correctly, it should handle those cases itself. This is sometimes true of SNMP-based checks: some devices do not support all OIDs, so the corresponding values are missing or empty in the info table provided to the check.
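As a rough illustration of such an expected gap, the sketch below treats an empty speed value as a known case while leaving everything else unvalidated. The function describe_port, the row layout and the output text are assumptions made up for this example; they do not come from a real Checkmk check.

```python
def describe_port(row):
    # Hypothetical info-table row: [port name, speed in bit/s, error counter].
    name, speed, errors = row  # the row length is still not validated

    # Assumption: some devices are known to leave the speed OID empty,
    # so this one case is handled explicitly.
    if speed == "":
        speed_text = "speed unknown"
    else:
        speed_text = "speed %d bit/s" % int(speed)

    return "%s: %s, %d errors" % (name, speed_text, int(errors))


print(describe_port(["port1", "1000000000", "3"]))
print(describe_port(["port2", "", "0"]))
```

Only the value that is known to be missing on some devices gets special treatment. Any other malformed data still raises an exception and is reported by Checkmk as described above.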
2. saveint and savefloat
Numbers are a special case, for which two helper functions are available. If you convert a string into a number and the string is empty, the Python functions int and float raise an exception.
The functions saveint and savefloat simply ignore such exceptions and return 0 and 0.0, respectively, in that case.
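The behavior described here can be sketched in plain Python as follows. The names sketch_saveint and sketch_savefloat are hypothetical stand-ins written for this example; they are not Checkmk's actual implementation, and the set of exceptions they catch is an assumption.

```python
def sketch_saveint(value):
    # Stand-in for saveint: return 0 if the conversion fails.
    try:
        return int(value)
    except (ValueError, TypeError):
        return 0


def sketch_savefloat(value):
    # Stand-in for savefloat: return 0.0 if the conversion fails.
    try:
        return float(value)
    except (ValueError, TypeError):
        return 0.0


print(sketch_saveint("42"), sketch_saveint(""))
print(sketch_savefloat("3.5"), sketch_savefloat(""))
```

Keep in mind that a result of 0 is then indistinguishable from a genuine 0 sent by the agent, so such helpers fit best where an empty value is expected and harmless.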