Ep. 26: Monitoring Kubernetes with Checkmk
Read Video Transcript
|[0:00:00]||Hello, welcome back to the Checkmk channel. Today we will talk about Kubernetes monitoring.|
|[0:00:15]||Kubernetes is a great tool. It comes with a lot of nice features and also has self-healing capabilities. But it's not just magic. Things at some point will go wrong, things will fail. And for that, you better have monitoring in place.|
|[0:00:29]||With the all-new Checkmk 2.1, we bring you a completely revamped Kubernetes monitoring with a lot of nice features that will make monitoring your Kubernetes clusters a breeze.|
|[0:00:39]||Today, we will take a look at that. So, this is what our Kubernetes monitoring looks like.|
|[0:00:44]||Before we go into the actual installation and setup, I just want to give you a quick look at the end result you will get after a couple of simple steps.|
|[0:00:55]||This is a cluster dashboard which helps you to visualize all the important metrics and health of your applications and workloads in your cluster.|
|[0:01:05]||It's just the beginning of your trip into visibility into Kubernetes. From there on you can actually dive down into individual dashboards.|
|[0:01:14]||For example, you could go down into the Deployments dashboard for the specific deployment which seems to have a problem, where we can see something clearly in red. Hey, there's something wrong. And we could go down here to see what exactly is wrong.|
|[0:01:29]||We could see the detailed metrics of the deployment, the Pods belonging to the deployment, and the problems of this deployment, all in one view.|
|[0:01:37]||And now let's go into the next step. How to actually get started with that?|
|[0:01:43]||Okay, let's install the Kubernetes collectors, which we need to get the data from your Kubernetes cluster. You can find these Kubernetes collectors in our GitHub repository, tribe29/checkmk_kube_agent.|
|[0:01:57]||Here you just have to go into the folder deploy/charts/checkmk, because there you will also find the documentation on how to install it.|
|[0:02:06]||For the installation, you need a recent Kubernetes version, 1.19 or later. That is quite old by now, so you should be able to fulfill that prerequisite. You also need Helm as a tool.|
|[0:02:17]||Helm is basically a package manager for Kubernetes. First thing we do is we add the repository. I take this command.|
|[0:02:29]||I give the repository a name. I call it tribe29 and it has been added. The next thing I need to do is I need to update my repository. Simple command, and done.|
|[0:02:48]||Okay, so we just updated the Helm repository. And before we go into the next step of actually installing the Kubernetes collectors, we will get a file called values.yaml.|
|[0:03:00]||Yeah, you can find it up here. It helps you to configure your Kubernetes collectors and simplifies that quite a lot. You can specify if you want to have TLS communication enabled.|
|[0:03:14]||You can also specify there if you want to have Pod security policies enabled, and all the other nice things, in a really easy manner.|
|[0:03:22]||And we will do that by downloading this. I already prepared a command for that. Here let's just download it. And now we have it on our system.|
|[0:03:36]||I will change a couple of values in there to expose the Kubernetes collectors to the outside world, because by default the Kubernetes collector which we will install is not accessible from the outside.|
|[0:03:53]||But for Checkmk to be able to actually pull the data from your Kubernetes cluster, we have to do so. For that, I go down here until I end up at the service.|
|[0:04:10]||And there are two options which you can use. You can either expose the Kubernetes service of the cluster collector or you can enable an ingress, depending on how your Kubernetes cluster is set up.|
|[0:04:24]||Here I have a very simple Kubernetes cluster. I don't have any ingress, which is why I'm going for the NodePort option. So, I just have to specify NodePort here and uncomment this line.|
|[0:04:40]||And that's the only change I need to make to expose my cluster collector to the outside world on that specific port on that node.|
|[0:04:53]||We will not go into how you can execute that for now. That is a topic for another video.|
|[0:04:57]||And for now this is only accessible for me in my internal environment, so it's fine for me to have unsecured communication here.|
|[0:05:06]||But for any production cluster, I highly recommend using TLS, or ensuring secure communication by using an ingress with its built-in capabilities.|
|[0:05:19]||Okay, now we have configured this values.yaml and we can go back into the documentation here and copy this command.|
|[0:05:33]||In this command, we first have to specify the Namespace in which the Kubernetes collector will be installed. I use checkmk-monitoring. Let's also give it a release name. You can choose whatever you want; I just choose checkmk.|
|[0:05:49]||And we have to specify the repository and that has to be the same name which I used above, and above I used tribe29.|
|[0:05:58]||And I also pass the values.yaml to overwrite the standard configuration of this Helm chart and expose the cluster collector on NodePort 30035.|
|[0:06:15]||Okay, let's see what happens. Okay, the cluster collector was successfully installed, and the Helm chart also provides us with a lot of useful commands to get started.|
|[0:06:28]||Yeah, these commands are all helpful for you to actually configure the Checkmk connection to the Kubernetes cluster collector.|
|[0:06:37]||The first thing is how you can actually access the cluster collector. This is this part. The second part is the tokens and the certificate for the connection to the cluster collector and the Kubernetes API.|
|[0:06:54]||I will just copy these commands and execute them. And then if I take a look at them, I will be able to see the token which I need for communicating with my cluster, and I will also be able to get the CA certificate.|
|[0:07:14]||These are two things which I then need for configuring my Checkmk Kubernetes collector connection in Checkmk.|
|[0:07:21]||Okay, now we have everything we need to actually configure Checkmk to be able to monitor Kubernetes. I have created a completely fresh site.|
|[0:07:33]||And in this site the first thing which we do is we go to Hosts and add a host for our Kubernetes cluster. Now I call this kube-internal.|
|[0:07:47]||I give it no IP address because this host is just there for collecting the data. It is basically the one host on which all my data will be located.|
|[0:08:02]||The next thing I do, just because I think it's nice, is to create one folder where I put everything. I call it k8s-objects.|
|[0:08:13]||The next thing is to create a password. We use the password store of Checkmk to store the token for the connection to the cluster collector and the Kubernetes API.|
|[0:08:29]||I name it Kubernetes Internal Token. And now I go to my console, copy the token which I just got before, go back to Checkmk and save it.|
|[0:08:46]||This is a neat thing to have because the token is now encrypted on my disk.|
|[0:09:07]||We go here and we add a new CA certificate. We go back to the console. We copy the certificate. Save. And now we have done all the stuff which we need to be able to access it.|
|[0:09:26]||Now you can actually configure the connection. For that, just go to Setup, type in something like Kubernetes, and you will find the rule called Kubernetes under VM, Cloud, Container.|
|[0:09:37]||We can add a rule here. Okay, let's configure the Kubernetes rule. The first thing which we need to do is give our cluster a name. I call it internal.|
|[0:09:49]||Then we have to assign a token. We take the one from the password store which we just specified. Next thing is we have to specify the endpoint of the API server.|
|[0:10:02]||If you don't know the IP address or the FQDN of that, look at the configuration of your kubectl tool and you'll find in the kubeconfig there the address of the API server.|
|[0:10:17]||We obviously want to verify the certificate. And in this next step, we have to enrich the data which we get with the data from the Checkmk cluster collector.|
|[0:10:29]||For that, I use whatever I specified before, when we deployed the cluster collector into our Kubernetes cluster. I selected NodePort. You can obviously also use ingress.|
|[0:10:41]||And for NodePort, I just have to specify the IP address of one node. That's something which every Kubernetes cluster can do. I haven't enabled HTTPS here; in production, please do.|
|[0:10:57]||And now we're almost done. We just have to assign this rule to a host. I assign this rule to the kube-internal host and that's it. Let's save the rule, activate it, and then take a look at our hosts.|
|[0:11:22]||So, we have our kube-internal host, which we just created. The first thing which we do is look into the Checkmk service discovery and edit the services.|
|[0:11:36]||And we can already see, hey, it has discovered a couple of things. It has discovered some Kubernetes services.|
|[0:11:44]||We accept them all. It's receiving metrics, it's receiving the CPU usage of your cluster. The Kubernetes API is live and ready. We see memory metrics. We see how many nodes we have in our cluster and how many pods.|
|[0:12:07]||With that, however, we only have one host in Checkmk now. We obviously want all the pods, all deployments, and all the other things also to appear in Checkmk.|
|[0:12:17]||And for that, we have to activate the Dynamic host management. I create a new connection. I call it Kubernetes. I add here under Piggyback creation options a new element.|
|[0:12:38]||I create the hosts in the folder which I just created. You can choose whatever you want. And I also choose to delete hosts which don't have Piggyback data.|
|[0:12:49]||This Piggyback data mechanism is the way we get the data from the cluster collector into Checkmk.|
|[0:12:57]||And we will use this feature to get the stuff in now. And I will restrict the source hosts to my kube-internal because I only want this dynamic configuration to be applied to that specific host.|
|[0:13:16]||Let's save that, activate the changes. And we can already see all the hosts which are being created.|
|[0:13:25]||So, Checkmk itself now is discovering everything in your Kubernetes cluster. All the pods you can see here are being created.|
|[0:13:34]||All the Namespaces, deployments, the DaemonSets, every object in your Kubernetes cluster is being monitored now. Checkmk will take care of that for you so that you don't have to.|
|[0:13:47]||Okay, we can see up here that we already have 87 hosts in our Checkmk monitoring.|
|[0:13:56]||And the first thing which we can do is we can take a look at the Kubernetes dashboards which I showed in the beginning. For that, you go to Monitor and under Applications, you will find Kubernetes.|
|[0:14:14]||In here you can see now our cluster which we just configured with already some metrics. And we can go now into the cluster, dive deep in, and we see the cluster dashboard.|
|[0:14:26]||It's obviously pretty empty right now because all the data is still being gathered.|
|[0:14:31]||Let's give it a short moment and take a look at what it looks like in a couple of seconds.|
|[0:14:36]||Okay, and just a couple of seconds later, we have our entire Kubernetes monitoring ready and running. We can see immediately which Namespaces we have, we can see our workloads in here.|
|[0:14:53]||We can sort that if we want to see, okay, which workload has a lot of load in there. We can sort them also by memory consumption to identify the top consumers.|
|[0:15:04]||And I'm gonna disable the sidebar so that we can see it in full. The first metrics are already coming in here.|
|[0:15:13]||We can see, okay, this is how the CPU resources look across the cluster, and we can now also go in. We can, for example, say, hey, I want to see everything in my Namespace default.|
|[0:15:24]||And we can, for example, check out the Namespace dashboard. You can see immediately which workloads are running directly in default. And then, for example, we can go further down, here as well into a specific deployment.|
|[0:15:43]||This was the installation and configuration of the Kubernetes monitoring with Checkmk 2.1.|
|[0:15:46]||I hope you liked it. In our next video, I will show you how you can actually do alerting for your Kubernetes cluster.|
|[0:15:46]||Thanks for watching and please like and subscribe.|
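The Helm steps described between 0:02:17 and 0:05:58 (add the repository, update it, install the collectors with a customized values.yaml) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the exact commands shown on screen, since the transcript does not reproduce them. The use of `helm upgrade --install`, the chart reference `tribe29/checkmk` (repository name plus the chart folder name), and the repository URL placeholder are assumptions. The names tribe29, checkmk-monitoring, checkmk, and values.yaml come from the walkthrough.

```python
import shlex
from typing import List


def helm_install_commands(
    repo_name: str,
    repo_url: str,
    namespace: str,
    release: str,
    chart: str,
    values_file: str,
) -> List[List[str]]:
    """Build the three helm invocations from the walkthrough as argument lists.

    Nothing is executed here; the caller decides whether to print the
    commands or hand them to subprocess.run.
    """
    return [
        # Step 1: register the chart repository under a local name.
        ["helm", "repo", "add", repo_name, repo_url],
        # Step 2: refresh the local index of all registered repositories.
        ["helm", "repo", "update"],
        # Step 3: install (or upgrade) the release into its own namespace,
        # overriding chart defaults with the edited values file.
        [
            "helm", "upgrade", "--install",
            "--create-namespace", "--namespace", namespace,
            release, f"{repo_name}/{chart}",
            "--values", values_file,
        ],
    ]


if __name__ == "__main__":
    commands = helm_install_commands(
        repo_name="tribe29",                        # name chosen in the video
        repo_url="https://example.invalid/charts",  # placeholder, NOT the real URL
        namespace="checkmk-monitoring",             # namespace chosen in the video
        release="checkmk",                          # release name chosen in the video
        chart="checkmk",                            # assumed from deploy/charts/checkmk
        values_file="values.yaml",
    )
    for command in commands:
        print(shlex.join(command))
```

Returning argument lists instead of running the commands keeps the sketch free of side effects. Take the real repository URL and the exact install command from the chart's documentation before running anything against a cluster.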