Werk #16563: Ensure background jobs finish properly on stopping sites

Component Core & setup
Title Ensure background jobs finish properly on stopping sites
Date Jul 9, 2024
Level Prominent Change
Class Bug Fix
Compatibility Compatible - no manual interaction needed
Checkmk versions & editions
2.4.0b1 Checkmk Raw (CRE), Checkmk Enterprise (CEE), Checkmk Cloud (CCE), Checkmk MSP (CME)

Previously running background jobs were not properly cleaned up by omd stop. Those processes were terminated at the end of omd stop command which would clean up the processes in most cases, but lead to failed jobs and leave crash reports behind.

Secondly a few background jobs depend on Redis. In case Redis is stopped while the jobs are still running, the jobs would fail and also leave a crash report behind.

This change aims to solve both issues, by extending what happens during omd stop. The procedure roughly works like this:

  1. First the apache process and site cron are stopped to prevent the start of new background jobs. This is done by the already existing logic.
  2. The new init script background-jobs gives the jobs some time to finish. Ideally all jobs are stopped after 20 seconds.
  3. The stop command does not have to forcefully kill the jobs anymore in this case.
  4. As a last resort omd stop will terminate the still running jobs as before.

To the list of all Werks