Complete monitoring for any type of server
Monitoring your servers is a business-critical task. And still, most companies face disruptions in server performance or even permanent damage. Slow and unstable performance of IT infrastructure leads to inefficiency, unsatisfied customers and a loss in revenue. But what is server monitoring and why do these problems occur when it is not set up properly?
The main reasons are blind spots in your server monitoring or the fact that important assets had to be excluded from the monitoring system for other reasons. Without visibility, you cannot gain insights. This lack of detail thus leads to issues that cannot be localized and mitigated.
Physical servers need monitoring just like virtual servers do. The latter also includes cloud assets and virtualized servers. You can very quickly end up with many different types of servers that are set up for various purposes.
Depending on the environment, there are several sources for gathering monitoring information that you should pay attention to. Windows, Linux and other operating systems provide output for monitoring. Vendors have their own interfaces for monitoring bare-metal servers. IPMI and SNMP can also be used to monitor servers.
In the case of virtualized servers, you can get data from the hypervisor and other sources as well. If you are working together with a cloud service provider, you can use their APIs to extract monitoring information and get it into your own monitoring platform.
The whole system is extremely lightweight: 4 CPUs and 4 GB of memory are more than enough for medium-sized environments (~200-300 hosts).
Since it's open source, you can always extend the system or look into the code to find out what it does on the most detailed level.
The main challenge in server monitoring is balancing all of these different monitoring tasks, and having suitable approaches for individual scenarios. Checkmk is the all-in-one monitoring platform, and it provides agents for all established operating systems (Windows, Linux and others), but it is also able to monitor virtualized servers (such as Microsoft Hyper-V or VMware ESX), and servers in cloud environments (for example AWS or Microsoft Azure).
Checkmk discovers your hardware & software automatically, and you can use it for centralized asset management to track changes in your inventory. And with the Raw Edition you can do all that for free.
Advantages of server monitoring with Checkmk
Covering all operating systems
Checkmk comes with more than 1,900 official integrations that are ready for use, including agents for systems such as Linux, Windows, or macOS. It also supports agentless monitoring via SNMP or IPMI.
No dependencies for the future development of your infrastructure
We have plug-ins for all common cloud and virtualization solutions, so that you can adapt your infrastructure knowing your monitoring will adapt with it.
Easy to get started
In just a few minutes you can have your monitoring up and running. Installing Checkmk is easy, and you can manage the complete monitoring through the graphical user interface. If you use Checkmk as an appliance, you do not need any prior knowledge of Linux administration.
Get started: The basics of server monitoring
Now, which values and functions should you track? It doesn't matter whether you observe a virtual web server or a physical Windows server, these four areas have to be part of the monitoring of any server:
- CPU: Is there enough capacity for all operations, and is the computing power efficiently distributed?
- RAM: Do you have enough memory for all applications and the cache?
- Block or object storage: Is there enough storage on the system, and does the data match the expected throughput?
- Network: Who has access to the data on the server? At what speed is the data transported to which areas?
This is just the beginning. The challenge is to grasp the complex interactions of your server landscape. Most infrastructure today is not dependent on physical assets. There are more virtual servers and assets hosted in the cloud.
Monitors everything from network equipment, power supply, hardware, OS, virtualization technology, databases, storage systems and backup to modern stacks like Docker.
Also helps with automated SLA reports and software/hardware inventory.
In addition, container solutions like Docker or Kubernetes allow flexible infrastructure deployment without requiring a central management. This is indispensable for dynamic IT processes, but also comes with a new challenge for monitoring.
Checkmk can monitor all resources in one platform: It can monitor highly agile dynamic environments, but also static assets. The approach is not limited to servers only, but can also monitor network devices, data centers and other assets. This way you can analyse issues in depth.
Finding the right monitoring tool
A first area to consider is the technical maturity of a product. It has to be able to gather detailed information from a host while using as few resources as possible. It also should not interfere with the operations of the server, but rather use passive mechanisms.
Checkmk combines status-based, metric-based and log-based data in one tool, so that you can create warnings and alerts for every server with minimum effort.
From a strategic perspective, a tool should be open to users without prior monitoring experience – this is the only way to avoid silos. Checkmk provides a comprehensive handbook, video tutorials and trainings.
Checkmk additionally comes with its own graphing, dashboarding and reporting engines to create metrics and analyses for different roles. It also integrates with external visualization tools like Grafana. This way you can make the best use of the data – no matter if you are a monitoring veteran or work in a totally different department.
And last, but not least, you should consider the total costs of ownership that go beyond the costs for software licenses: An IT department needs to spend time on administration and management of the monitoring – and should not be overwhelmed with false alerts in day-to-day operations.
Tools with a low usability or high user requirements lead to more expenses as more budget has to be spent on hiring matching talent. There is also a similar risk in case a tool is not able to be scaled. The costs for additional hardware, service providers and consultants can get out of hand if your monitoring is not optimized for larger environments. Checkmk has a highly efficient architecture, and it requires minimal hardware resources. It is easy to implement and can be operated with minimal manpower – even in very large environments.
Checkmk is a very intuitive, easy-to-use and easy-to-set-up monitoring system. It scales perfectly, both vertically and horizontally.
Expert insights into the daily best practices
Which data you should monitor depends on the exact setup. If you deploy your servers onsite, you have to keep an eye on the hardware, of course. If you run a virtualized environment, you still have to monitor the hardware if you provide it yourself.
On top of that, you need to keep an eye on the hypervisor and the virtual machines (VMs) as well. If you run cloud servers you may shift some responsibilities to a service provider, but – or maybe especially because of that – you should still check these assets. In this way, you can check on different service level agreements (SLA). And ultimately it makes sense to monitor your spending on a service provider too.
Here is a practical example to show the importance of having data in depth.
Imagine monitoring a system on which the value for the CPU load stays above the total number of available cores for a longer period. The server is underperforming. But at the same time the CPU usage is not even close to 100 percent, so you probably have an issue within the server.
If you only monitor the CPU utilization, you would not see any issue. If you only focus on the CPU load, you will get an alert, but will probably draw the wrong conclusion and increase the number of cores. It is more likely that many tasks occupy the cores without being processed on time. With Checkmk you have a rich source of data: you could take a closer look at the throughput of the storage or at the RAM and investigate why things are being delayed. This in-depth information helps companies make better decisions.
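The reasoning in this example can be sketched in a few lines of Python. This is only an illustration, not how Checkmk evaluates these metrics: the `CpuSample` type, the `diagnose` function and the 90 percent "busy" threshold are assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class CpuSample:
    load_avg: float         # CPU load averaged over a longer period
    cores: int              # number of available cores
    utilization_pct: float  # CPU utilization in percent (0-100)


def diagnose(sample: CpuSample, busy_pct: float = 90.0) -> str:
    """Combine load and utilization into a rough diagnosis.

    busy_pct is an illustrative threshold, not a recommended value.
    """
    overloaded = sample.load_avg > sample.cores
    busy = sample.utilization_pct >= busy_pct
    if overloaded and busy:
        return "cpu-bound: more cores may help"
    if overloaded:
        return "tasks are waiting on something else: check storage throughput and RAM"
    return "ok"


# Load above the core count, but utilization far from 100 percent.
print(diagnose(CpuSample(load_avg=12.0, cores=8, utilization_pct=35.0)))
```

Looking at only one of the two values would hide the second branch, which is exactly the case described above.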
Agents, SNMP or IPMI: Pros & cons of different approaches
The basic architecture of the monitoring makes a big difference, because it not only decides on the quality of the information you get, but also on the level of security and the system requirements of your monitoring.
Because IT environments can change quickly, a platform should not be limited by its design. If companies adopt new technology, their monitoring should adapt as well.
Over the years Checkmk has developed an extensive set of high-performance integrations (1,900+ plug-ins) for various operating systems. Based on our experience, we suggest using agent-based monitoring as the first go-to approach for servers. The agents require minimal CPU and RAM on the host. Additionally, the workload on the monitoring system itself is kept to an absolute minimum. Their footprint in the network is also smaller compared to other technologies.
Using Checkmk agents is also very secure, because the agent never initiates a connection on its own and only answers queries from the Checkmk server. On the host, the agent works in read-only mode and supports end-to-end encryption.
While there are benefits that speak in favor of using agents, Checkmk also works well with other sources of information. If using software on the host is not an option, you can use the native integration of Checkmk with SNMP or IPMI.
In the daily routine our agents have proven their ability, especially if combined with our agent management tool, the agent bakery. Especially in larger server environments many assets can be handled and adjusted easily and without writing permissions for the hosts. Malicious code injections are impossible thanks to our security-by-design approach. Compared to monitoring via SNMP, WMI or other approaches, the requirements for resources on the host and on the Checkmk server will always be smaller if you rely on our agents.
How to set up a holistic server monitoring within minutes
Getting your monitoring started with Checkmk only takes a couple of minutes. The Checkmk instance itself is deployed as a Linux server. After the deployment, you are good to go. There is no need for additional software or another database. All editions are available on the Checkmk homepage. You can also run Checkmk as a Docker image, and you can use Checkmk as a physical or virtual appliance. So you don’t need to have any experience with Linux to use Checkmk.
After the installation, you can work with the graphical interface or the command line to interact with Checkmk. Nearly any host can be integrated into the monitoring, and there are several ways to do so – including auto-detection mechanisms for integrating large numbers of hosts or dynamic environments automatically into Checkmk. Aspects of each host will be attached in the form of services.
If you use agents to monitor your servers, they listen on TCP port 6556. They become active only when they receive a query from the Checkmk server, and then respond with the required data. Check intervals can be as short as one second. If you are using SNMP, Checkmk sends a UDP packet (port 161) and receives an answer in the same format.
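To make the query-and-respond pattern more tangible, here is a minimal Python sketch. It assumes an agent that answers in plain text on TCP port 6556 and marks its sections with `<<<name>>>` headers; the function names are made up for this illustration. If the agent is configured for encrypted transport, a plain read like this will not return readable sections.

```python
import socket
from typing import Dict, List


def fetch_agent_output(host: str, port: int = 6556, timeout: float = 10.0) -> str:
    """Read everything the agent sends until it closes the connection.

    Assumption: the agent answers in plain text on this port.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def split_sections(output: str) -> Dict[str, List[str]]:
    """Group output lines under their <<<section>>> headers."""
    sections: Dict[str, List[str]] = {}
    current = None
    for line in output.splitlines():
        if line.startswith("<<<") and line.endswith(">>>"):
            # Drop options after the name, as in "<<<df:sep(9)>>>".
            current = line[3:-3].split(":")[0]
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line)
    return sections
```

A call such as `split_sections(fetch_agent_output("myserver.example"))` (hypothetical host name) would return one list of raw lines per section, which is roughly the raw material that check plug-ins work with.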
Agents and ports can be adapted, and Checkmk is able to use any input as long as it conforms with the guidelines for coding check plug-ins. The easiest way to extend agents is with local checks. Alongside the more than 1,900 official plug-ins, you can find extensions written by our Checkmk Community members; most of these are available in the Checkmk Exchange.
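A local check is essentially a small script whose output the agent passes along. The following Python sketch assumes the commonly documented one-line format of state, service name, metrics with warning and critical levels, and a summary text; please verify the exact format in the handbook for your version. The service name `Root_fs_usage`, the metric name `used_pct` and the 80/90 thresholds are hypothetical.

```python
import shutil


def local_check_line(service: str, used_pct: float, warn: float, crit: float) -> str:
    """Build one local-check line: state, service name, metric, summary text."""
    if used_pct >= crit:
        state = 2  # CRIT
    elif used_pct >= warn:
        state = 1  # WARN
    else:
        state = 0  # OK
    metric = f"used_pct={used_pct:.1f};{warn};{crit}"
    return f"{state} {service} {metric} {used_pct:.1f}% of the filesystem is used"


# Report the usage of the root filesystem (thresholds are illustrative).
usage = shutil.disk_usage("/")
print(local_check_line("Root_fs_usage", 100 * usage.used / usage.total, 80, 90))
```

Placed in the agent's local check directory and made executable, a script like this would show up as an additional service after the next service discovery.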
Linux server monitoring
Examples are Red Hat Enterprise Linux, Fedora, CentOS, openSUSE, SLES, Debian, Ubuntu – but there are many more. This is no surprise as Linux is known for running very stably and efficiently.
Checkmk can monitor any Linux server because our agent consists of a simple shell script that routes data to the TCP port 6556.
Checkmk picks up the information and gets it into the monitoring. The agent is not compiled, and all functions are fully transparent. Even if this sounds simple, know-how is important when it comes to monitoring Linux servers.
For example, Linux has its own way of managing its memory, and you should consider that in your monitoring.
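One well-known instance: Linux uses otherwise idle memory for buffers and caches, so "total minus free" overstates the real memory pressure. The sketch below compares that naive figure with one based on the kernel's `MemAvailable` estimate from `/proc/meminfo`. The sample values are invented, and this is not necessarily the calculation Checkmk itself applies.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: value in kB}."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[name.strip()] = int(parts[0])
    return values


def memory_usage_pct(meminfo: dict) -> tuple:
    """Return (naive, effective) memory usage in percent."""
    total = meminfo["MemTotal"]
    naive = 100 * (total - meminfo["MemFree"]) / total
    # MemAvailable already accounts for reclaimable buffers and caches.
    effective = 100 * (total - meminfo["MemAvailable"]) / total
    return naive, effective


# Invented sample values; on a real host, read the text from /proc/meminfo.
sample = """MemTotal:       16000000 kB
MemFree:         1000000 kB
MemAvailable:   12000000 kB
"""
naive, effective = memory_usage_pct(parse_meminfo(sample))
print(f"naive: {naive:.1f}%  effective: {effective:.1f}%")
```

An alert based on the naive value would fire on a perfectly healthy server, which is why thresholds should refer to the effective value.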
We have trusted in Linux for a long time and are happy to share our experience. Follow the link below if you want to read more about monitoring your Linux systems.
Checkmk is easy to configure but also really flexible for all kinds of special requirements.
Windows server monitoring
Windows servers are known to be easy to manage, and they are mostly used in environments containing mainly Windows clients. Organizations usually rely on the Windows Management Instrumentation (WMI) to monitor them, but this approach is quite resource-intensive.
It probably works fine for midsize environments, but it will pretty soon reach its limits in larger server landscapes, because more and more resources have to be provided to make it work. Checkmk takes a different approach, by providing agents in the form of MSI packages.
These do not depend on WMI as the source of information and work a lot more efficiently. This allows monitoring of large Windows environments, and better scalability.
Using a simple executable file without any DLL dependencies makes the operation more stable and secure. Also, by design no data can be injected from the network.
In addition the agent is less than five megabytes in size, and you can compile the source code completely yourself. There are no backdoors, and Checkmk runs completely transparently. The agent can also be expanded at will.
Windows servers are usually part of a larger Microsoft ecosystem. If you follow the link below you can find more details – not just on Windows servers, but also on monitoring Windows environments, including MS Exchange, Active Directory or MS SQL.
We are monitoring IT infrastructure from UPSs over servers, hypervisors, network, SAN and NAS storage up to the operating systems and databases as well as SAP for over 80 customers with Checkmk and have made excellent experiences.
Monitoring of virtualized servers & containers
Server virtualization enables companies to make better use of their hardware resources. If you are monitoring virtual machines (VMs), you need to go beyond the four basic dimensions. You also have to track the virtualization platform: the availability of actual resources and their distribution to the individual VMs. Otherwise you risk allocating capacity that is not actually available.
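The risk of allocating capacity that does not exist can be expressed as a simple ratio between what has been promised to the VMs and what the host physically has. The following Python sketch uses invented numbers, and `overcommit_ratio` is a hypothetical helper, not a Checkmk function.

```python
def overcommit_ratio(allocated_per_vm: list, physical: float) -> float:
    """Ratio of resources promised to VMs versus what the host really has."""
    return sum(allocated_per_vm) / physical


# Invented example: four VMs on a host with 128 GiB of RAM.
vm_memory_gib = [16, 32, 32, 64]
host_memory_gib = 128

ratio = overcommit_ratio(vm_memory_gib, host_memory_gib)
if ratio > 1:
    print(f"memory is overcommitted (ratio {ratio:.3f})")
else:
    print(f"memory allocation fits the host (ratio {ratio:.3f})")
```

A ratio above 1 is not automatically a problem, but it is the point at which the real usage of each VM needs to be watched closely.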
Platforms like VMware vSphere, Citrix XenServer, or Microsoft Hyper-V provide their own tools that deliver information on their VMs via interfaces. Checkmk can use that data and provide further context from other sources, or can compare different data sources.
You will have all analyses and alerts in one tool. The opportunities of virtualization change with innovations like Docker or Kubernetes. Containerization allows for shorter life cycles of infrastructure. They can also be rolled out from any location and do not need approval from a centralized infrastructure department.
Checkmk can automatically add and remove containers and pods, even those with a lifetime of only a few seconds. It is suitable for monitoring Docker, native Kubernetes and OpenShift-Kubernetes. You can also run Checkmk as a Docker container.
Monitoring servers in the cloud
Nearly any organization deploys at least some servers in the cloud. With Checkmk you have all insights into your assets.
Integrations with AWS and Microsoft Azure enable you to integrate all information about your cloud environment into your monitoring and to control your budgets. In addition, you can check whether SLAs and other agreements have been met, and you can still monitor on the operating system level to gain more insights.
Even complex architectures with multiple cloud providers and hybrid environments can be monitored in depth. You are able to monitor services from third parties, load balancers and unconnected devices, and track down any issues related to these systems.
We have only good experiences with Checkmk. We use it in a large environment without any issues.
Centralised asset management for servers
Servers are in pivotal points of digital interaction, but you should also check on what is going on within your servers. In practice, this means creating an inventory of your installed hardware and software. You can systematically check this information and even make comparisons over different time frames with the right monitoring tool.
Monitoring enables you to see broken hardware or changes in the hardware setup. Missing hard drives, broken RAM or malfunctioning memory blocks are made visible immediately.
Without monitoring your inventory, partial malfunctions are usually hard to localize. Memory or storage devices will be listed as fully functional in the OS, even while users experience errors. Analyzing the provided log data will help you find the origin of such events.
You can also track the time and version of a BIOS/UEFI-Update, or changes on the OS. Checkmk allows you to check on updates of applications and look for newly installed applications.
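Comparing two inventory snapshots boils down to a dictionary diff. The Python sketch below illustrates the idea with invented package names and versions; the flat `{name: version}` structure is an assumption for the example and not the actual format of the Checkmk HW/SW Inventory.

```python
def diff_inventory(before: dict, after: dict) -> dict:
    """Compare two {name: version} snapshots taken at different times."""
    return {
        "added": {k: after[k] for k in after.keys() - before.keys()},
        "removed": {k: before[k] for k in before.keys() - after.keys()},
        "changed": {
            k: (before[k], after[k])
            for k in before.keys() & after.keys()
            if before[k] != after[k]
        },
    }


# Invented snapshots of installed software and firmware.
last_week = {"openssl": "3.0.2", "nginx": "1.18.0", "bios": "1.4"}
today = {"openssl": "3.0.13", "bios": "1.4", "htop": "3.0.5"}

for kind, entries in diff_inventory(last_week, today).items():
    print(kind, entries)
```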
Checkmk in just a few minutes after being installed will provide you with deep insight and visibility into your systems and applications.
You can export the information into a license management system to prepare yourself for audits or to find outdated software. Integrate the Checkmk HW/SW Inventory into a configuration-management database (CMDB) to always have an overview on your IT assets.
Have a monitoring that adjusts to the needs of your servers
In some cases you need tailored rules and individual thresholds. Take, for example, a server optimized for quick replies: it should reserve enough CPU and memory for incoming requests. You can address that in the monitoring by lowering the thresholds for alarms and warnings.
This is typically the case for databases. Response times decide on competitiveness in the online world. Requests are highly volatile and even in peak times responses need to be handled quickly. Checkmk is ideally suited for monitoring databases, and it has integrations into all major vendors like:
- Monitor open-source databases like MySQL or MariaDB. Plug-ins for databases and Galera clusters are available.
- Checkmk is ideal for monitoring Microsoft databases, and there are several plug-ins for MS SQL for monitoring databases, their servers and tablespaces.
- Stay on top of your PostgreSQL databases: Monitor connections, sessions and other details with Checkmk.
- For Oracle DBs, Checkmk has several plug-ins at hand to monitor Oracle Clusterware, instances, storage management (ASM), and the Oracle Recovery Manager.
- Monitoring of NoSQL databases is not an issue with Checkmk. Monitor replica sets, clusters, memory and much more in MongoDB.
- Keep track of databases, tablespaces and instances with the plug-ins for IBM Db2.
- Monitor relational database management systems (RDBMS) from IBM. Checkmk provides plug-ins for IBM Informix to ensure your monitoring works effortlessly.
- Checkmk comes with a wide range of plug-ins for monitoring SAP HANA, including checks for database storage and other important metrics.
- Microsoft Azure SQL: There are several plug-ins for monitoring Microsoft Azure, including a dedicated plug-in for Microsoft Azure SQL.
- Amazon Web Services RDS: Checkmk is ideal for monitoring your assets in Amazon Web Services. There are a couple of plug-ins ready to be used for monitoring AWS RDS.
- Monitor & Relax: Check your Couchbase nodes and buckets with detailed metrics thanks to the available plug-ins.
Web server monitoring: Things you should consider
There is almost no company that can survive without digitalizing its products. The more you open yourself to the online world, the closer you should watch your web servers. Make sure your homepage never goes down, and make sure you can detect issues as soon as possible.
For web servers, connections and metrics around data requests are a key focus of the monitoring. High response times indicate problems with the web server or homepage design and need to be reported. In addition, large data packets should be avoided when it comes to response size.
Especially when using apps for mobile devices in combination with slow Internet connections, you quickly create a nightmare for your customers. With the right monitoring in place, you can be sure of avoiding these things.
Checkmk is able to run active checks with HTTP, FTP & SSL, and there are official plug-ins for Apache or Nginx web servers. You can not only monitor the basic functions of the servers, but you are able to pull data from different sources to create precise alerts focused on the vital functions of your web servers.
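To illustrate what an active HTTP check looks at, here is a minimal Python sketch that measures the response time and response size of a page and compares both against limits. It uses only the standard library and is not the implementation of the Checkmk active checks; the function names and the limits of one second and 500,000 bytes are assumptions.

```python
import time
import urllib.request


def measure(url: str, timeout: float = 10.0) -> tuple:
    """Fetch a URL and return (response time in seconds, body size in bytes)."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read()
    return time.perf_counter() - start, len(body)


def evaluate(seconds: float, size_bytes: int,
             max_seconds: float = 1.0, max_bytes: int = 500_000) -> list:
    """Return a list of problems; an empty list means the check passed."""
    problems = []
    if seconds > max_seconds:
        problems.append(f"slow response: {seconds:.2f}s")
    if size_bytes > max_bytes:
        problems.append(f"large response: {size_bytes} bytes")
    return problems
```

Calling `evaluate(*measure("https://www.example.com/"))` (placeholder URL) returns an empty list if the page is both fast and small enough, and a list of findings otherwise.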
Most important facts when monitoring mail servers
Emails are still in the very center of business communications and most companies have mail servers in their networks. The first priority for their operations is reliability, since if mail servers go down, many organizations will not be able to work. If you actually start monitoring your mail servers, you should think about several other challenges too.
One point is the efficiency: Having too many mail servers with unused capacities is not only a waste of your resources, but also adds more workload on IT teams as they have to maintain more servers. In addition, you need to check on data privacy, access rights, backups and security. And finally, companies have started using cloud services, or migrate their email archives.
All that can be addressed in your monitoring to make your mail server infrastructure more secure and reliable. We provide checks for POP3, IMAP and SMTP to make sure your mail servers are fully functional. There are also official Checkmk plug-ins for MS Exchange, Postfix, qmail and other mail applications.
Support of vendors and interfaces
Sometimes monitoring via SNMP or the operating system is not an option. This is the case with bare-metal servers, for example. Among Checkmk's more than 1,900 plug-ins you will find several integrations for specific server vendors. Checkmk also supports IPMI. You can be assured of having lean processes and always using the most efficient way of getting the monitoring data.
Checkmk integrates with systems like the Cisco UCS C Series, or blade servers like Dell PowerEdge, HPE BladeSystems, or Fujitsu Siemens Primergy BX600 Blade. Besides plug-ins for hardware, there are also integrations into vendor-specific tools like Dell OpenManage Server Administrator (OMSA).
Many customers use Checkmk to monitor their data centres. In addition to servers, Checkmk can also monitor power supply units (PSU), uninterruptible power supplies (UPS), temperature sensors, cooling, voltmeters and environmental sensors.
Keep an eye on larger server environments
The Checkmk Enterprise Editions bring several features that are especially beneficial for professional users. IT admins, DevOps teams, data center operators and other specialists find all the information they need in one platform: Dashboards, alerts, reports and much more. Views, access and administration rights can be adjusted to a very detailed level for different user groups or individuals to meet the needs of different roles.
Checkmk additionally provides key features for enterprise environments:
- Include even the most complex and heterogeneous infrastructures in your monitoring. More than 1,900 official plug-ins help you to get started in no time.
- Auto-discovery of services and predefined thresholds allow you to set up your monitoring within minutes.
- Rule-based configuration helps you to adjust your monitoring with just a few basic rules.
- The ability to roll out updates automatically with the Checkmk agent bakery. You can create individual agents with different setups and save them for re-use.
- Low check intervals as short as one check per second to gather information.
- Full scalability in any environment: Monitor onsite assets, hybrid infrastructure or multiple cloud vendors with just one tool.
- The options to deploy Checkmk instantly as a virtual or physical appliance.
- Low total cost of ownership (TCO) thanks to low system requirements, the fair licensing model, and several automation options.
FAQ server monitoring
What can I do when my servers are not on the same network? Can I monitor servers remotely?
Yes, you can. Checkmk features distributed monitoring, so that you can keep an eye on servers in various locations without much additional effort. Read more about distributed monitoring in the official handbook.
Do I have to expect performance issues due to monitoring?
No, Checkmk is extremely lightweight and built for high performance and the efficient use of resources.
Checkmk leaves a small footprint in the network and on the host, especially if you use our agents. If you use SNMP to monitor assets, you have to expect more network traffic.
Does Checkmk support agentless server monitoring?
Yes, Checkmk can work agentless, and can monitor any endpoint without installing additional software on the hosts.
However, you are not forced to work agentless. There are more than 1,900 official plug-ins. Besides the agents, there are integrations for SNMP, IPMI and other interfaces.
Why should I implement an additional tool, if server vendors already provide monitoring tools?
Server monitoring goes beyond tracking the status of your hardware. Performance and availability are so crucial to your business success that it makes sense to gather all available data into one platform. This will allow you to detect critical situations quickly.
By just checking on certain parts or single servers, you will miss relevant details and context – even if these checks are triggered periodically by an IT admin.
Are there free tools for server monitoring?
Yes, the Raw Edition of Checkmk is completely free and fully open source. Several companies use it to monitor their servers and other IT assets.
At which point does it make sense to switch to a commercial platform?
There is no written rule for this. There are great open source solutions available. The major benefit of a commercial version is the improvement in efficiency. If you spend a serious amount of time on monitoring, you should consider changing to a commercial tool, because it will give you back some of your valuable time.
A good platform should also come with professional customer support and improve your ability to analyze data. These are additional points that mainly concern professional users.
Another point with open source platforms is that just the tool itself is free, but trainings and consulting usually require budgets. These costs can be affected by changing to a commercial platform too.