Check manual page of kube_collector_info
Kubernetes: Cluster Collector for Vanilla Kubernetes, AWS Elastic Kubernetes Service (EKS), Azure Kubernetes Service (AKS), Google Kubernetes Engine (GKE), VMware Tanzu
| Included in | All Checkmk editions |
|---|---|
| Source Code License | Open Source |
| Supported Agents | Kubernetes |
The Checkmk service implicitly reports the status of the connection to the Kubernetes API. This service does the same for the connection to the Cluster Collector deployed in the Kubernetes cluster.
The service has three tasks. First, it reports the metadata of the collectors deployed to the Kubernetes cluster, including cache health. Second, it reports on the processing of the collected data and displays additional information if a problem occurs during data handling within the agent. Third, it uses the Kubernetes API to report the status of the DaemonSets belonging to the node collectors. For each node collector, the following are reported:
- the number of Nodes with a Pod that is both available and desired
- the number of Nodes on which a Pod is desired
The DaemonSets are identified by the labels "node-collector=machine-sections" and "node-collector=container-metrics".
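The per-collector figures described above could be derived roughly as in the following minimal sketch. It assumes that the DaemonSets are already available as plain dictionaries shaped like Kubernetes API objects. The helper name `summarize_node_collectors` is hypothetical and is not Checkmk's actual implementation; `numberAvailable` and `desiredNumberScheduled` are the standard DaemonSet status fields.

```python
from typing import Any

# Label key and values taken from the text above.
NODE_COLLECTOR_LABEL = "node-collector"
COLLECTOR_TYPES = ("machine-sections", "container-metrics")


def summarize_node_collectors(
    daemonsets: list[dict[str, Any]],
) -> dict[str, dict[str, int]]:
    """Return available/desired Node counts per node collector (hypothetical helper)."""
    summary: dict[str, dict[str, int]] = {}
    for ds in daemonsets:
        labels = ds.get("metadata", {}).get("labels", {})
        collector_type = labels.get(NODE_COLLECTOR_LABEL)
        if collector_type not in COLLECTOR_TYPES:
            continue  # not a node collector DaemonSet
        status = ds.get("status", {})
        summary[collector_type] = {
            # numberAvailable may be absent in the API response when it is zero
            "available": status.get("numberAvailable", 0),
            "desired": status.get("desiredNumberScheduled", 0),
        }
    return summary


# Illustrative input only; these values are made up for the example.
example = [
    {"metadata": {"labels": {"node-collector": "machine-sections"}},
     "status": {"numberAvailable": 3, "desiredNumberScheduled": 3}},
    {"metadata": {"labels": {"node-collector": "container-metrics"}},
     "status": {"numberAvailable": 2, "desiredNumberScheduled": 3}},
    {"metadata": {"labels": {"app": "unrelated"}}, "status": {}},
]
for name, counts in summarize_node_collectors(example).items():
    print(f"{name}: {counts['available']}/{counts['desired']} Nodes")
```

DaemonSets without one of the two label values are skipped, so unrelated workloads do not affect the result.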
The state the service assumes when machine sections cannot be successfully fetched and processed from the Cluster Collector is configurable. When container metrics cannot be fetched and processed, the service always goes CRIT.
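The difference between the two failure cases can be illustrated with a short sketch. It is not the plug-in's code: the function `collector_state`, its parameters, and the "worst state wins" aggregation are assumptions made for this example. Only the fixed CRIT for container metrics and the configurable state for machine sections come from the description above.

```python
from enum import IntEnum
from typing import Optional


class State(IntEnum):
    OK = 0
    WARN = 1
    CRIT = 2


def collector_state(
    container_metrics_error: Optional[str],
    machine_sections_error: Optional[str],
    machine_sections_failure_state: State,
) -> State:
    """Combine both failure cases into one service state (hypothetical helper)."""
    states = [State.OK]
    if container_metrics_error is not None:
        # Not configurable: a container metrics failure is always CRIT.
        states.append(State.CRIT)
    if machine_sections_error is not None:
        # Configurable: the user-chosen state applies here.
        states.append(machine_sections_failure_state)
    # Assumption for this sketch: the worst individual state wins.
    return max(states)


print(collector_state(None, "fetch failed", State.WARN).name)
print(collector_state("fetch failed", None, State.WARN).name)
```

With the failure state for machine sections set to WARN, the first call yields WARN, while the second call yields CRIT regardless of the configuration.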
Additionally, Cluster Collector cache health is monitored. When the container metrics or machine sections cache exceeds configured thresholds, Checkmk will alert. (In this case, it is likely that you need to increase the maximum cache size of the Cluster Collector, which can be done via the official Helm chart by setting `clusterCollector.cacheMaxsize`.) The thresholds for this check are configurable with the "Kubernetes Collector info" ruleset.
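As a rough illustration of such a threshold check, the sketch below evaluates both caches against one pair of upper levels. It rests on several assumptions that the text above does not confirm: that cache health is expressed as a fill level in percent, that the same warning and critical levels apply to both caches, and that a level counts as reached at equality. The function name `cache_states` and the example numbers are hypothetical.

```python
def cache_states(
    fill_percent: dict[str, float],
    warn: float,
    crit: float,
) -> dict[str, str]:
    """Map each cache to OK, WARN or CRIT based on upper levels (hypothetical helper)."""
    result: dict[str, str] = {}
    for cache_name, value in fill_percent.items():
        if value >= crit:
            result[cache_name] = "CRIT"
        elif value >= warn:
            result[cache_name] = "WARN"
        else:
            result[cache_name] = "OK"
    return result


# Made-up fill levels and thresholds, for illustration only.
states = cache_states(
    {"container_metrics": 97.0, "machine_sections": 40.0},
    warn=80.0,
    crit=95.0,
)
for cache_name, state in states.items():
    print(f"{cache_name}: {state}")
```

If a cache is reported as WARN or CRIT over a longer period, raising `clusterCollector.cacheMaxsize` as described above is the likely remedy.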
Note that the information displayed by this service changes if the corresponding collector is changed.
Discovery
One service is created if the option `Use data from Checkmk Cluster Collector` is set.