Werk #18278: Agent updater/non-root agent: Lockfile race condition
Component | Agent bakery
Title | Agent updater/non-root agent: Lockfile race condition
Date | Sep 2, 2025
Level | Trivial Change
Class | Bug Fix
Compatibility | Compatible - no manual interaction needed
As mentioned in Werk #18271, the agent updater deployed with the non-root agent runs in two different modes: the update mode, which runs as a plugin under the agent user, and the install mode, which runs under root.
Since the agent updater is not allowed to perform system or network I/O from multiple processes at the same time,
concurrent runs are prevented by holding a lock on a PID file.
When blocked by another process, the agent updater will wait up to 10 seconds on the locked PID file before giving up.
If an agent updater was already running under root, an agent updater running under the agent user did not wait
but crashed immediately, because it could not open the root-owned lockfile.
This led to delayed agent updates, because the agent updater would only retry on the next update cycle.
The agent updater will now instead wait on an existing PID file even if it can't be opened.
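
The waiting behavior described above can be illustrated with a minimal Python sketch. This is not the agent updater's actual code: the function name acquire_pid_lock, its parameters, the polling interval and the use of fcntl.flock are assumptions made for illustration. Only the 10-second wait and the idea of treating an unopenable, existing PID file as "locked" come from this Werk.

```python
import fcntl
import os
import time
from typing import Optional


def acquire_pid_lock(
    path: str,
    timeout: float = 10.0,       # the Werk mentions a wait of up to 10 seconds
    poll_interval: float = 0.1,  # hypothetical polling interval
    opener=os.open,              # injectable for testing; hypothetical parameter
) -> Optional[int]:
    """Return a locked file descriptor, or None if the wait timed out."""
    deadline = time.monotonic() + timeout
    while True:
        fd = None
        try:
            fd = opener(path, os.O_RDWR | os.O_CREAT, 0o644)
        except PermissionError:
            # An existing file that we may not open (e.g. owned by root) is
            # treated like a held lock: keep waiting instead of crashing.
            if not os.path.exists(path):
                raise
        if fd is not None:
            try:
                # Non-blocking exclusive lock; fails if another process holds it.
                fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                return fd
            except BlockingIOError:
                os.close(fd)
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval)
```

A caller would treat a return value of None as "another updater is still busy" and give up until the next update cycle. Closing the returned descriptor releases the lock.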