The monitoring platform for every need
Checkmk is a comprehensive platform for monitoring applications, servers, and networks, on-premises or in the cloud. Thanks to its wide range of features, Checkmk effortlessly monitors both the simplest IT operations and complex IT environments.
Automate monitoring to save time
- Add new components with less effort using automatic detection and configuration: Checkmk will recognize them and provide monitoring for all their relevant components, along with their metrics and thresholds.
- Features such as host lifecycle management and auto-registration of hosts help you automate monitoring of dynamic, ephemeral infrastructure: Containers, pods, VMs and more can be added to and removed from monitoring automatically.
- Use modern rule-based 1-to-N configuration, which remains intuitive even in complex environments and results in lower configuration effort than other monitoring solutions.
- Automate the configuration and operation with the Checkmk REST API.
- Centrally manage your agents and automate agent updating with the Agent Bakery.
- Integrate other systems using powerful APIs to automate almost anything imaginable.
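As a minimal sketch of what automating configuration through the REST API can look like, the following Python snippet builds (but does not send) a request that would create a host. The endpoint path, payload fields, and `Bearer` header format are written from memory of the Checkmk REST API and should be checked against the interactive API documentation of your own site; the server name, site name, user, and secret are placeholders.

```python
import json
import urllib.request


def build_create_host_request(base_url, user, secret, host_name,
                              folder="/", ip_address=None):
    """Build a POST request that asks Checkmk to create a host.

    Assumption: the endpoint and payload shape below match the REST API
    of your Checkmk version. Verify them before use.
    """
    attributes = {}
    if ip_address is not None:
        attributes["ipaddress"] = ip_address
    payload = {
        "host_name": host_name,
        "folder": folder,
        "attributes": attributes,
    }
    return urllib.request.Request(
        url=f"{base_url}/domain-types/host_config/collections/all",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {user} {secret}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


# Placeholder server, site ("mysite"), user, and secret.
request = build_create_host_request(
    "https://monitoring.example.com/mysite/check_mk/api/1.0",
    user="automation",
    secret="replace-me",
    host_name="web01",
    ip_address="192.0.2.10",
)
print(request.get_method(), request.full_url)
```

Passing `request` to `urllib.request.urlopen` would send it. Pending changes typically still need to be activated afterwards before the new host is monitored.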
From zero to monitoring in ten minutes
- Fast installation from a single integrated package, without the need to separately install and maintain databases and web servers.
- Available for various Linux platforms, as a Marketplace image for AWS and Azure, as a container image for Docker, and as a virtual or physical appliance.
- Checkmk's auto-discovery detects your hosts and services for you – effortlessly configure your monitoring with all relevant metrics and thresholds.
- Integrate your data: Combine the benefits of agent-based monitoring with those of agentless monitoring via HTTP or SNMP – or connect Checkmk to different applications via APIs.
- Configure everything in a web interface. Fast, easy, and less prone to error.
- Apply your existing role-based access controls (LDAP, AD) to a fine-grained permission model for user and group actions.
Ready for powerful hybrid IT monitoring
- Over 2,000 maintained plug-ins collect metrics from your systems across heterogeneous IT infrastructures.
- Checkmk covers not only the most important use cases in the cloud, but also most on-premises systems thanks to its unique plug-in collection – for powerful hybrid IT monitoring.
- Benefit from actively maintained, regularly updated plug-ins that keep up with your software and hardware changes.
- There are additional plug-ins shared by our community in the Checkmk Exchange to complement our native plug-ins.
Scale monitoring with a performance-optimized, distributed architecture
- Easily monitor thousands of services with one monitoring instance, eliminating the need to maintain and synchronize multiple monitoring instances in a single data center.
- Scale across hundreds of sites and millions of devices. Build a worldwide distributed monitoring network, achieving a scale that is hard to find in monitoring systems.
- Leverage highly efficient, self-contained agents with minimal CPU, RAM and storage utilization. They run on even the smallest servers, without the need for DLLs or libraries.
Modern monitoring concepts for cloud and on-premises
- Ingest data with high enough granularity to handle IT architectures of all kinds – traditional environments and container orchestration platforms included.
- Sample in real-time, with measurement intervals as short as 1 second.
- Take advantage of special features such as host lifecycle management and auto-registration to automatically bring ephemeral hosts and workloads from dynamic cloud and microservice environments into monitoring.
- Identify problems in your IT with a meaningful status ("OK", "WARN", "CRIT") for each monitored component or system – including deeper analysis with one click.
- Analyze the state of your IT systems with just a few clicks and view the health of your IT in the right context with Checkmk.
- Map application dependencies in an overview and monitor complex systems at a glance.
- Tag your data by hand or auto-discover tags and labels to provide relevant context to help you filter – labels offer full flexibility and tags ensure consistency.
- Store metrics in disk-space-efficient long-term storage.
Get detailed insights into your network
- In-depth analysis of your network traffic with the integration of network flows into Checkmk via ntop.
- Traffic dashboards for your network.
- View alerts, characterized by duration, severity and alert type.
- Filter flows in many dimensions to analyze your networks.
- Detailed views for your hosts: traffic, packets, ports, peers.
Easily customize or extend to meet your needs
- Customize or extend the Checkmk source code, written in easy-to-read Python.
- Rely on us and our broad network of partners to customize Checkmk or its plug-ins.
- Program your own plug-ins for Checkmk using the new Check-API, or write local checks in any programming language.
- Learn from the extensive developer documentation.
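To illustrate how small a local check can be, here is a sketch in Python that prints one result line in the local check format as we understand it: status code, quoted service name, metric with warning and critical thresholds, and a free-text detail. The service name, metric name, and threshold values are hypothetical.

```python
def local_check_line(service, metric, value, warn, crit):
    """Format one local check result line.

    Status codes: 0 = OK, 1 = WARN, 2 = CRIT.
    The status is derived here by comparing the value to upper thresholds.
    """
    if value >= crit:
        status = 2
    elif value >= warn:
        status = 1
    else:
        status = 0
    return f'{status} "{service}" {metric}={value};{warn};{crit} {metric} is {value}'


# Hypothetical example: 12 queued jobs, WARN at 10, CRIT at 20.
print(local_check_line("Queue length", "queued_jobs", 12, 10, 20))
```

A script like this would be placed in the agent's local check directory on the monitored host; the exact path depends on the operating system and agent installation.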
Visualize your data with modern, customizable dashboards
- Get full visibility on the state of your IT, thanks to Checkmk's modern and customizable dashboards.
- Out-of-the-box dashboards provide key metrics for AWS and Azure cloud environments, Linux and Windows servers, and Kubernetes clusters.
- Leverage graphic maps and diagrams with live monitoring data.
- Analyze time-series metrics over long time horizons with interactive HTML5 graphs.
- Customize dashboards and views to your specific needs with different dashboard elements to visualize your most important metrics.
- Compare metrics across multiple graphs at a glance.
- Create custom dashboards and views for users or user groups, e.g. vSphere-specific views for VMware admins.
- Customize the side menu according to your preferences: add snap-ins for your most important monitoring information, as well as links to your most relevant functions or reports.
- Alternatively, visualize your data in Grafana using the Grafana Checkmk datasource plug-in or using Checkmk's Graphite exporter for InfluxDB.
Avoid notification overload with smart and granular alerting
- Notify the responsible team quickly – e.g., notify the storage admins for a failing disk – and escalate problems if they are not handled in time.
- Automatically send notifications via email, SMS, Slack or MS Teams.
- Generate tickets automatically for incident handling through integrations with ITSM systems, such as ServiceNow, Jira, PagerDuty, or VictorOps.
- Configure additional alerts or cancel an alert in specific situations.
- Handle alerts centrally – even in distributed environments.
- Use the alert handler to automatically trigger actions upon detection of new problems – e.g., for self-healing.
Combine metrics and log data for fast problem identification and root cause analysis
- Monitor events from sources such as syslog, SNMP traps, Windows event logs, log files, and other applications.
- Filter and forward events, triggering scripts or generating notifications.
- Collapse duplicate entries into a single event (e.g. several failed user logins) to prevent operator overload.
- Filter incoming messages to only show important events – no more manual filtering and information overload.
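The idea of collapsing duplicates can be sketched in a few lines of Python. This is a generic illustration of the concept, not the way Checkmk implements it: identical (host, message) pairs are merged into a single entry with a counter. The sample events are made up.

```python
def collapse_events(events):
    """Merge identical (host, message) events into one entry with a count.

    The order of first occurrence is preserved.
    """
    counts = {}
    for host, message in events:
        key = (host, message)
        counts[key] = counts.get(key, 0) + 1
    return [(host, message, count) for (host, message), count in counts.items()]


# Made-up sample events.
events = [
    ("srv01", "Failed login for user admin"),
    ("srv01", "Failed login for user admin"),
    ("srv02", "Disk quota exceeded"),
    ("srv01", "Failed login for user admin"),
]
for host, message, count in collapse_events(events):
    print(f"{host}: {message} (x{count})")
```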
Predict trends and resource utilization with advanced analytics
- Analyze historical data to identify trends or forecast future resource consumption.
- Use sophisticated predictive monitoring algorithms to dynamically adapt thresholds based on historical events.
- Rely on capacity management that incorporates one-time or seasonal factors in its forecasts.
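As a deliberately simple illustration of trend-based forecasting, the sketch below fits a straight line to past disk usage and estimates when the disk would be full. It is not Checkmk's algorithm and ignores the one-time and seasonal factors mentioned above; the sample numbers are invented.

```python
def linear_fit(xs, ys):
    """Least-squares line through the points; returns (slope, intercept).

    Needs at least two distinct x values.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def day_when_full(days, used_gb, capacity_gb):
    """Estimate the day on which usage reaches capacity, or None if usage is not growing."""
    slope, intercept = linear_fit(days, used_gb)
    if slope <= 0:
        return None
    return (capacity_gb - intercept) / slope


# Invented sample: usage grows by 10 GB per day on a 100 GB disk.
print(day_when_full([0, 1, 2, 3], [10, 20, 30, 40], 100))
```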
Monitor the health of your key business processes
- Monitor business processes by mapping application dependencies into a single overview and view the availability and performance of complex systems at a glance.
- Aggregate various services and hosts into a single state.
- ‘Freeze’ aggregations and compare them to the live state of your infrastructure to visualize changes and receive an explanation for any resulting status changes.
- Review historical states to determine the root cause for degraded performance.
- Deliver more reliable services to customers by maintaining awareness through a completely transparent, easy-to-understand view.
- Support all possible setups – high availability with two or more nodes, HPC, and more – with unprecedented freedom.
- Simulate worst-case scenarios in real time, studying the impact of failing components to determine areas of operational weakness.
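The aggregation and simulation ideas above can be sketched as follows. This is a conceptual illustration under simplifying assumptions, not Checkmk's own logic: the aggregate takes the worst state of its members, and a simulation overrides individual component states to see how the aggregate would react. The component names are hypothetical.

```python
# Severity ranking used for the "worst state" rule.
SEVERITY = {"OK": 0, "WARN": 1, "CRIT": 2}


def aggregate_worst(states):
    """Return the worst state among all components (requires at least one)."""
    return max(states.values(), key=SEVERITY.__getitem__)


def simulate(states, overrides):
    """Aggregate state if the given components had the overridden states."""
    return aggregate_worst({**states, **overrides})


# Hypothetical components of a web shop.
live = {"database": "OK", "webserver": "WARN", "cache": "OK"}
print(aggregate_worst(live))
print(simulate(live, {"database": "CRIT"}))
```

Real setups often need other rules as well, for example treating a cluster as healthy while at least one node is OK.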
Identify all assets within your IT
- Identify and inventory all hardware and software, proactively monitoring changes.
- Integrate regularly updated monitoring data, such as CPU utilization or disk usage, into your CMDB view and add "dynamic" parameters to the health of your assets.
- Combine inventory tables with service or other inventory table data in one unified view.
Proactively keep the business informed with automatically generated reports
- Create branded PDF reports that include all the views you create – ad-hoc or automated at regular intervals.
- Review historical states over any period of time with a single click, calculating availability in real time.
- Clean up availability data: exclude unmonitored times, change the resolution, or ignore short intervals.
- Monitor your complex SLAs to receive notification before you violate your SLA contracts – even if the SLA definition only includes work hours.
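To make the availability clean-up options concrete, here is a small sketch that computes availability from a list of state intervals, skips unmonitored time, and can optionally ignore outages shorter than a given length. The state names, the function, and the sample data are assumptions made for illustration, not Checkmk's internal model.

```python
def availability(intervals, min_outage=0):
    """Availability in percent from (state, seconds) intervals.

    - "UNMONITORED" intervals are excluded entirely.
    - Non-OK intervals shorter than min_outage seconds count as available.
    Returns None if no monitored time remains.
    """
    up = down = 0
    for state, seconds in intervals:
        if state == "UNMONITORED":
            continue
        if state == "OK" or seconds < min_outage:
            up += seconds
        else:
            down += seconds
    total = up + down
    return 100.0 * up / total if total else None


# Invented sample: one hour of monitored time plus ten unmonitored minutes.
history = [("OK", 3500), ("CRIT", 30), ("UNMONITORED", 600), ("CRIT", 70)]
print(round(availability(history), 2))
print(round(availability(history, min_outage=60), 2))
```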
Integrate with major ITOM/ITSM tools to streamline workflows
- Use powerful, well-documented APIs to build deep integrations with popular ITSM and messenger solutions.
- Receive notifications via email, SMS and messenger.
- Streamline your processes by having tickets automatically created in your project management system.
- Interface with standard, off-the-shelf Configuration Management Database (CMDB) software.
- Configure monitoring using existing information from a Configuration Management Database (CMDB) via Checkmk's APIs.
Connect your monitoring to other tools
- Visualize your data from Checkmk along with other data sources by exporting metrics and labels to Grafana / Grafana Cloud.
- Bridge the gap between DevOps and Ops teams with the Prometheus integration and import K8s data and Prometheus alerts into Checkmk.
- Import monitors and events from Datadog into Checkmk to improve communication between Ops and DevOps teams.
- Combine metrics from Checkmk with metrics from other monitoring tools in the TSDB of your choice, such as InfluxDB 2.0, for centralized metrics monitoring.
Note: Some of these features are available only in the commercial Checkmk Editions.