At the Checkmk Conference #6, Marcel Schulte shared best practices for setting up your own Checkmk staging environment. We recommend two options for trying out updates before rolling them out in your production environment: the Gold Standard and the Silver Standard.

The Gold Standard is the more precise approach, but it has the disadvantage of being very resource-intensive. You need to build a clone of your production environment, which means that for each Checkmk site there is a corresponding system in the test environment with the same performance data. This option can therefore only be implemented if you have the appropriate hardware resources. In addition, with the Gold Standard both the production and the staging environment monitor the hosts, which puts additional load on the hosts, as Marcel stressed in his presentation.

Marcel (right) during his presentation on staging environments in Checkmk.
Marcel gave five recommendations for avoiding mistakes when testing and changing versions.

The Silver Standard, on the other hand, is suitable for users without sufficient resources. It is not based on a copy of the production environment, but on a simulation of the data. This data can be queried from a single system. According to Marcel, this can be done for both SNMP-based and agent-based hosts. With the SNMP variant, it is also possible to have the simulated data generated by a third-party system. Because the data is simulated, a performance test is not possible. This is purely a function test, for example to try out your own developments, as Marcel further explained.
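To give a rough idea of what simulated data for an agent-based host could look like, here is a minimal Python sketch. It assumes the plain-text agent output format in which each section starts with a `<<<name>>>` header. The function name `render_agent_output` and all sample values are hypothetical and are not taken from Marcel's presentation.

```python
from typing import Dict, List


def render_agent_output(sections: Dict[str, List[str]]) -> str:
    """Build simulated agent output from section names and their lines."""
    lines: List[str] = []
    for name, rows in sections.items():
        # Each section starts with a header of the form <<<name>>>.
        lines.append(f"<<<{name}>>>")
        lines.extend(rows)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample values for a simulated host.
    simulated = render_agent_output(
        {
            "check_mk": ["Version: 2.0.0", "AgentOS: linux"],
            "uptime": ["12345.67 54321.00"],
        }
    )
    print(simulated, end="")
```

Text generated this way could be stored in a file and served to a test site in place of a real agent. As noted above, this only allows function tests, not performance tests.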

The conference participants were delighted that Marcel also presented complete instructions for testing and changing versions, which help to avoid many mistakes. Marcel also showed how scripts can automate many of the tasks. He recommends introducing each major release in a timely manner. According to Marcel, the dynamic development of Checkmk and its constantly growing range of functions mean that an update that skips major releases can be more complex than a regular Checkmk update.

As a Checkmk consultant, Marcel wants users to be as independent as possible and, ideally, to have no problems at all when using Checkmk. Unfortunately, the world of IT is complex and will always present challenges that can only be mastered together. Helping our customers has been Walter Fisch’s job as Head of Support since January 2020. In the subsequent talk, he explained how customer support is set to become easier in the future.

Checkmk 2.0 comes with new support diagnostics

Walter wants users to receive help as quickly as possible in the event of an emergency. We laid the foundations for this last year with the introduction of Jira Service Desk.

In the future, he plans to introduce a knowledge base to save time, especially in the case of recurring problems. The knowledge base should contain, for example, an FAQ, how-to instructions, and troubleshooting documents. In addition, we want to integrate existing information from our Werks and the forum.

Walter Fisch is our Head of Support.
Walter introduced our new support diagnostics to the participants of the Checkmk Conference.

A particularly important element for Walter, however, is our new support diagnostics, which we are introducing with Checkmk 2.0. At present, we spend an enormous amount of effort obtaining basic information for a support case, such as the installed operating system. In the future, we want to collect this automatically in a standard format and make handling much easier. We also plan to significantly reduce requests for additional data from the customer, so that we can start solving the problem sooner. At the same time, it should become easier for the customer to bundle the information about the problem into a data package and attach it to the support ticket.

Selecting data directly in Checkmk

With the new support diagnostics, the user can generate the required information via WATO, which runs a background job to create the data package for the selected site. Alternatively, the information can be created via the CLI. A cleanup job on the sites is then intended to prevent the local file system from filling up with this diagnostic data.
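The idea behind such a cleanup job can be illustrated with a short Python sketch. This is not the cleanup job that ships with Checkmk. The function name `cleanup_old_dumps`, the `*.tar.gz` file pattern, and the age limit are assumptions made purely for illustration.

```python
import time
from pathlib import Path
from typing import List, Optional


def cleanup_old_dumps(
    directory: Path,
    max_age_seconds: float,
    pattern: str = "*.tar.gz",  # assumed file pattern, adjust as needed
    now: Optional[float] = None,
) -> List[Path]:
    """Delete dump files older than max_age_seconds and return their paths."""
    current = time.time() if now is None else now
    removed: List[Path] = []
    for path in sorted(directory.glob(pattern)):
        # Compare the file's last modification time with the age limit.
        if current - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path)
    return removed
```

Calling `cleanup_old_dumps(Path("/path/to/dumps"), 7 * 24 * 3600)` would remove matching files older than one week. The directory shown here is a placeholder, not a real Checkmk path.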

Checkmk does not automatically forward the created data (a tar file) to tribe29, Walter emphasized. The customer or partner must always upload the tar file manually, either to our support portal via an SSL connection or, for larger amounts of data, to our cloud storage. It is also possible for a customer to send the tar file to a support partner. We will delete the submitted diagnostic data as soon as we have completed the ticket.

Of course, we also took the security of your data into account for this workflow. Only users with the ‘cmkadmin’ role are authorized to create and read diagnostic data. Checkmk also offers full transparency about the data collected, which can only be transmitted to us manually via SSL. We are also thinking about how we can mark sensitive data and then mask it.
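Since the data package is an ordinary tar file, you could also review its contents yourself before uploading it. The following Python sketch uses only the standard `tarfile` module. The function name `list_dump_contents` and the example file name are hypothetical, and the actual layout of a diagnostics package may differ.

```python
import tarfile
from typing import List


def list_dump_contents(tar_path: str) -> List[str]:
    """Return the sorted names of all regular files inside a tar archive."""
    # Mode "r:*" lets tarfile detect the compression automatically.
    with tarfile.open(tar_path, "r:*") as archive:
        return sorted(
            member.name for member in archive.getmembers() if member.isfile()
        )


if __name__ == "__main__":
    # Hypothetical file name; replace it with the path of your own package.
    for name in list_dump_contents("diagnostics.tar.gz"):
        print(name)
```

Listing the file names first gives a quick overview of what would be shared with support, without unpacking anything to disk.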

Our next post will be the last one about our Checkmk Conference #6. In it, you will learn everything important about the Checkmk roadmap ahead.