Werk #15417: mk_logwatch: Correctly handle UNIX paths with non-UTF-8 characters

Component Checks & agents
Title mk_logwatch: Correctly handle UNIX paths with non-UTF-8 characters
Date Mar 3, 2023
Level Trivial Change
Class Bug Fix
Compatibility Incompatible - Manual interaction might be required
Checkmk versions & editions
2.3.0b1 (not yet released) Checkmk Raw (CRE), Checkmk Enterprise (CEE), Checkmk Cloud (CCE), Checkmk MSP (CME)
2.2.0b1 Checkmk Raw (CRE), Checkmk Enterprise (CEE), Checkmk Cloud (CCE), Checkmk MSP (CME)

On UNIX, filenames may consist of arbitrary bytes.

To display filenames in Checkmk, the mk_logwatch plugin assumes a UTF-8 encoding and previously replaced each non-UTF-8 (hence non-printable) byte with the replacement character �.
This led to several distinct filenames being represented by the same name in Checkmk.
As a result, the affected logfiles were displayed as a single service, with their contents merged into one log view.

In order to handle this situation, mk_logwatch now uses a backslash escape sequence instead of the replacement character.
Non-UTF-8 bytes are now represented by their hexadecimal value with a \x prefix.
For example, monitoring the two files 'my'$'\300\201''file' and 'my'$'\300\200''file' (shown as ls prints them, i.e., with octal escape sequences) now yields the two services Log my\xc0\x80file and Log my\xc0\x81file, whereas it previously yielded a single service Log my��file.
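
The difference between the two representations can be reproduced with Python's built-in decoding error handlers. The following is a minimal sketch, not the actual mk_logwatch code: the function names and the RAW_NAMES list are hypothetical, and it assumes that the "replace" and "backslashreplace" handlers behave like the old and new plugin behavior described above.

```python
# Illustrative sketch only; not taken from mk_logwatch.
# The two filenames from the example above, as raw bytes
# (octal \300\200 == hex \xc0\x80, octal \300\201 == hex \xc0\x81).
RAW_NAMES = [b"my\xc0\x80file", b"my\xc0\x81file"]


def decode_with_replacement(raw: bytes) -> str:
    """Decode as UTF-8, turning every invalid byte into U+FFFD."""
    return raw.decode("utf-8", errors="replace")


def decode_with_escapes(raw: bytes) -> str:
    """Decode as UTF-8, turning every invalid byte into a \\xNN escape."""
    return raw.decode("utf-8", errors="backslashreplace")


if __name__ == "__main__":
    for raw in RAW_NAMES:
        # The replacement form loses the byte values; the escaped form keeps them.
        print(decode_with_replacement(raw), "|", decode_with_escapes(raw))
```

With the replacement handler both names decode to the same string, so they cannot be told apart. With backslash escapes the original byte values remain visible and the names stay distinct.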

Because the service names change, you have to run a service discovery to pick up the newly named services if you are affected by this change.
