IT monitoring is primarily about reliably detecting critical conditions in IT systems. However, detection is only the first step: an organization must then communicate the incident to the relevant parties and ensure that the appropriate specialists can handle the problem. To achieve the highest possible uptime, a company must not only detect problems promptly and accurately, but also ensure that the right person receives all of the necessary information as quickly as possible so that they can start fixing the problem.
Checkmk detects these states and sends reliable, accurate notifications when systems are not working correctly or are at risk of failing. Most Checkmk users have already customized their monitoring for this purpose and receive alerts about critical conditions via email, SMS or another notification method. Checkmk sends alerts until the issue is resolved or a responsible employee actively acknowledges the alert in Checkmk.
However, most organizations use other tools besides Checkmk that also generate alerts, and these alerts need to reach the right people too. Also, not everyone wants to map their IT service management workflows entirely in Checkmk. That's why in this blog post I want to show you how to use iLert to streamline your alerting processes. The combination of Checkmk and iLert allows you to improve the uptime of your IT systems. iLert is a SaaS solution for alert and on-call management. It improves collaboration between on-call teams and other parties, reducing the time between a problem being reported and being fixed, the so-called mean time to recovery (MTTR).
How iLert works
iLert collects alerts from various sources and streamlines the handling of the resulting incidents. Checkmk is one of many possible alert sources for iLert. All alerts are managed centrally in the iLert web interface.
Checkmk can, for example, send alerts to the respective responsible employees and also map workflows for incident handling. However, managing workflows via iLert is simpler and you can adapt them much more easily, for example, if employees need to swap shifts at short notice. With iLert, you ensure that the right employee will always receive precise instructions and can therefore get to work immediately.
Just like Checkmk, iLert has integrations with outbound tools such as Slack, ServiceNow or Jira, and you can also map workflows for alerts in iLert itself. For example, if an employee on call doesn't respond within 30 minutes, iLert can automatically contact another team or employee. In addition, iLert can manage calls on your customer hotline and route incoming calls to the right contact based on existing on-call schedules and escalation chains.
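To make the escalation idea more concrete, here is a minimal Python sketch of how such a chain could be modeled. It is only an illustration of the concept and not iLert's actual implementation or API: the names `EscalationStep` and `contact_to_notify`, the contact labels and the second 30-minute timeout are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class EscalationStep:
    """One level of a hypothetical escalation chain (not an iLert data type)."""
    contact: str          # assumed label for a person or team
    timeout_minutes: int  # how long to wait for a response before escalating


def contact_to_notify(steps: Sequence[EscalationStep],
                      minutes_unacknowledged: int) -> Optional[str]:
    """Return who is responsible for an alert that has been unacknowledged
    for the given number of minutes, or None once the chain is exhausted."""
    deadline = 0
    for step in steps:
        deadline += step.timeout_minutes
        if minutes_unacknowledged < deadline:
            return step.contact
    return None


# Example chain: the on-call employee gets 30 minutes, then another team takes over.
chain = [
    EscalationStep("on-call employee", 30),
    EscalationStep("backup team", 30),  # assumed second level
]

for minutes in (5, 30, 60):
    print(minutes, contact_to_notify(chain, minutes))
```

In this sketch, an alert that has been open for 5 minutes still belongs to the on-call employee, while one that has gone 30 minutes without a response moves to the backup team. In iLert you would configure such a policy in the web interface rather than in code.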
As requirements for availability and incident handling continue to increase, iLert often replaces homegrown solutions. Because iLert is delivered as software-as-a-service, it is easy to use and takes work off the shoulders of monitoring managers. In addition, iLert comes with important features for large enterprises such as high availability, extensive scalability across distributed environments, and much more.
You can test iLert free of charge and without obligation. Thanks to its native integration, the combination of Checkmk and iLert can be set up in a matter of minutes. In the next blog post, I'll show you how easy it is to integrate Checkmk as an alert source in iLert and give a few examples of how you can use iLert to improve your uptime in practice.