In-Progress
Checkmk's feature set has grown drastically over the last few years, and the product has become more complex as a result. This year we launched a UX project to rethink how you can navigate Checkmk more easily and intuitively.
Our goal is to maintain Checkmk’s flexibility while shortening the learning curve for new and ‘casual’ users.
We are working on
- New navigation
- Sidebar
- In-page context buttons
- Breadcrumbs
- Powerful search
- Clarity
- Simplified pages/forms
- Clarity of naming
- Fresh design and icons
- Basic vs. advanced user settings
We are building new forms of visualization (dashlets) and working on simplifying the creation of dashboards. Dashboards can now be context-sensitive as well, allowing the presentation of specific facts, such as all information for Host X.
We are also planning dashboards tailored to certain applications, for example vSphere, Linux or Windows, in addition to Checkmk itself.
New dashlets will be
- Single metrics
- Bar charts
- Scatter plots
We are improving the existing integration with Grafana, which has been supported since Checkmk 1.6 in the Enterprise Editions.
Furthermore, the Grafana integration will also become available for the Checkmk Raw Edition.
Network monitoring has always been a core Checkmk strength, with a focus on network performance monitoring, metrics, interface states and alerting. We already have a broad set of plug-ins for network monitoring.
Now we go even further. We have partnered with one of the leading providers of flow analysis: ntop.
We are building a deep integration between ntop and Checkmk. This allows for deeper root cause analysis and network flow analysis, in-depth performance monitoring, and support for threat detection.
We are building one integrated view for Dev and Infra Ops teams to jointly prevent and fix problems faster.
We are developing an integration of the top Prometheus ‘Exporters’ for Kubernetes monitoring, as well as support for running and monitoring custom PromQL queries directly from within Checkmk.
Driving scale-up even further is our motto for the current performance improvements we are working on.
We are standardizing and simplifying development with the introduction of several APIs:
- Check-API
- Inventory-API
- Bakery-API
This will simplify check plug-in development in particular, not only for us but also for every user building their own integrations.
So far, the agent updater on the central site has always needed to communicate with every agent directly in order to update it. In segmented networks this might not be feasible, and in large environments it causes unnecessary load on the network.
Distributed agent updates will allow agent updates to be rolled out from within remote sites, enabling their use in segmented networks and also saving bandwidth.
We are extending the popular labels and tags feature. Labels will no longer only be imported from Kubernetes and cloud environments, but will also be discovered and set directly by Checkmk.
Through the labels you can, for example, filter views for specific host types, operating system families and versions, or custom labels. You can also define notification conditions based on these labels.
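To illustrate the idea of filtering by labels, here is a minimal Python sketch. It is not Checkmk code: the host names, the label keys (`os_family`, `role`) and the helper `filter_hosts` are all hypothetical and only show how key/value labels can select a subset of hosts.

```python
from typing import Dict, List

# Hypothetical inventory: host name -> labels (key/value pairs).
# The label keys and values are invented for this illustration.
hosts: Dict[str, Dict[str, str]] = {
    "web01": {"os_family": "linux", "role": "webserver"},
    "db01": {"os_family": "linux", "role": "database"},
    "win01": {"os_family": "windows", "role": "webserver"},
}


def filter_hosts(inventory: Dict[str, Dict[str, str]],
                 required: Dict[str, str]) -> List[str]:
    """Return the names of all hosts that carry every required label."""
    return [
        name
        for name, labels in inventory.items()
        if all(labels.get(key) == value for key, value in required.items())
    ]


# All Linux hosts, then only the Linux web servers.
print(filter_hosts(hosts, {"os_family": "linux"}))
print(filter_hosts(hosts, {"os_family": "linux", "role": "webserver"}))
```

The same matching logic could back a notification condition: a notification rule would apply only when the affected host passes the label check.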
We are building new check plug-ins and enhancing existing check plug-ins for many systems and applications.
Application Monitoring:
Graylog, Elasticsearch, RabbitMQ, Redis, MongoDB, MySQL, Couchbase, Oracle, Jira, Jenkins, JMX, Apache, SAP, Zerto, Kaspersky, and more...
Infrastructure Monitoring:
Huawei, TP-Link, PulseSecure, BeyondTrust, Cisco, Arista, Aruba, Entersekt, Dell EMC ECS, ScaleIO, HPE, NetApp, Fujitsu, QNAP, Ceph, Nutanix, Proxmox, VMware, and more...
To complete our already broad set of notification options, we are building integrations for
- Cisco WebEx Teams
- Microsoft Teams
Planned
The first step was to redesign the most used workflows and rework the entire navigation.
We plan to enhance consistency and ensure intuitive user interfaces across the entire product.
Building dashboards should be as easy and intuitive as possible.
We plan to continue our current efforts to step up dashboard usability.
We also plan further application-specific pre-built dashboards.
We plan to increase the performance of activating changes for very large configurations and for a large number of configuration users.
Several concepts are being considered, e.g. activating changes for individual folders instead of for the entire configuration.
We plan to improve standard notification and report layouts.
Building on the strong foundation of being able to integrate the most important Prometheus exporters and run PromQL queries natively in Checkmk, we plan to improve this integration further:
- Integrating more Prometheus exporters
- Directly connecting to Prometheus exporters
With the technology stack established and the first features done, our goal is completeness of the new REST-API: everything can be done via the API.
We plan to build a special agent for Google Cloud Platform, similar to those for AWS and Azure:
- Ready-built for dynamic configuration
- Checks for standard services
- Cloud Storage
- Compute Engine
- Cloud SQL
- Cloud Load Balancing
Additional checks are possible based on feature requests.
We plan to extend the monitoring for both AWS and Microsoft Azure.
Besides improving the activate changes process, we plan improvements for systems with a large number of users.
Furthermore, we plan to improve the performance of network scans for very large networks.
2FA is good practice for login procedures.
There are multiple ways to do 2FA:
- Hardware (e.g. YubiKey)
- Software (e.g. Google Authenticator)
We currently plan to add optional 2FA to the GUI, supporting U2F using a Checkmk local validation server.
Connecting to other validation servers is under consideration.
We plan to support SAML for Single Sign-On.
The notification spooler optionally forwards notifications from remote to central sites for delivery. We plan to implement end-to-end encryption for this communication channel.
We plan to extend the integrations around the Windows ecosystem, e.g. Office365 monitoring.
We plan to extend the JMX monitoring to enable in-depth monitoring of Java Application Servers.
We plan to further extend our Kubernetes monitoring, especially around usability aspects with pre-built dashboards.
We plan to integrate Datadog as a data source for Checkmk.
Kubernetes will be key to making VMware vSphere 7 much more dynamic and bringing the dev and ops worlds closer together.
Software providers will also increasingly deliver their software via containers, which ops needs to implement and monitor.
This changes how modern hypervisors work and we will ensure that our vSphere integration remains best-in-class.
Under Consideration
Simplifying deployment of Checkmk in the cloud with standard images as well as new mechanisms for a more cloud-native monitoring approach (e.g. push mechanisms for agents, self-registration)
Distributed tracing across microservices, hosts, and containers via integration of major APM tools (e.g. Jaeger)
Provide deep visibility into end-to-end performance of applications, regardless of where applications are running, via integration of major Synthetic Monitoring tools
- Extended visualization of network topologies and discovery of network topologies via ntop, LLDP and more.
- Enhanced monitoring for virtual networks in virtualization platforms.
- Improved monitoring of network hardware, including support information from vendors (e.g. PSIRT advisories) and support for monitoring APIs for network devices, such as streaming telemetry.
Discovering dependencies using information from physical and virtual networks and application configurations (e.g. vSphere, Oracle, Kubernetes) and visualizing these dependencies.
This will help in
- resolving issues quicker
- preventing issues from happening
- reducing unnecessary notifications
Automatic correlation of metrics to find systems which experience similar atypical behaviour and powerful visualizations intelligently highlighting anomalies using large amounts of data to find root causes quicker.
The business intelligence feature of Checkmk is a powerful way to monitor business processes by aggregating your monitoring to a higher level.
The current method of doing this via rules makes it easy to configure these aggregations across your entire monitoring.
Another approach which is under consideration would be to create aggregations via drag & drop.
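As a rough illustration of what aggregating monitoring states to a higher level means, the following Python sketch rolls service states up a small tree using a "worst state wins" function. It is not Checkmk's implementation: the `Node` class, the example tree, and the severity ordering (CRIT above UNKNOWN above WARN above OK) are assumptions made for this example; only the numeric state codes 0 to 3 follow the usual monitoring convention.

```python
from dataclasses import dataclass, field
from typing import List

# Conventional monitoring state codes.
OK, WARN, CRIT, UNKNOWN = 0, 1, 2, 3

# Assumed severity ranking for this sketch: CRIT is treated as worse than UNKNOWN.
_SEVERITY = {OK: 0, WARN: 1, UNKNOWN: 2, CRIT: 3}


@dataclass
class Node:
    """A leaf carries its own state; an inner node derives it from its children."""
    name: str
    state: int = OK
    children: List["Node"] = field(default_factory=list)


def worst_state(node: Node) -> int:
    """Aggregate a subtree by taking the most severe state among its leaves."""
    if not node.children:
        return node.state
    child_states = [worst_state(child) for child in node.children]
    return max(child_states, key=lambda state: _SEVERITY[state])


# Hypothetical business process built from monitored services.
shop = Node("Web shop", children=[
    Node("Frontend", children=[Node("web01 HTTP", OK), Node("web02 HTTP", WARN)]),
    Node("Database", children=[Node("db01 MySQL", CRIT)]),
])

print(worst_state(shop.children[0]))  # state of the "Frontend" branch
print(worst_state(shop))              # state of the whole business process
```

Whether such a tree is produced by rules or assembled via drag & drop, the evaluation step stays the same; only the way the tree is defined changes.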