According to a survey among our users, Amazon Web Services is currently the most important provider of cloud-based services, so it goes without saying that Checkmk must provide excellent monitoring for it.
Checkmk includes comprehensive AWS monitoring, consisting of a connector to AWS and a large collection of check plug-ins that retrieve and evaluate various metrics and states. Because there are so many check plug-ins, only a selection is listed here to show which AWS web services Checkmk can currently monitor:
- AWS EBS Summary
- AWS EC2 Instance Status
- AWS ELB Statistics
- AWS ELB Application Statistics
- AWS ELB Network Statistics
- AWS RDS Database Info
- AWS S3 Summary
- AWS Glacier Summary (beginning with 1.7.0)
- AWS Cloudwatch Alarms
- AWS Costs and Usage Summary
For a complete, up-to-date list of all available plug-ins, see the Check plug-ins Catalog.
2. Concrete Implementation of AWS Monitoring
2.1. Hosts and services
In Checkmk all objects to be monitored are arranged in a hierarchical structure of hosts and services. The concept of hosts does not really apply to cloud-based services. To retain the simplicity and consistency of Checkmk, we nevertheless map AWS objects onto the host/service schema.
How this works is best illustrated by an example: several EC2 instances have been configured in one region, and an EC2 instance usually has EBS volumes assigned to it. In Checkmk this constellation looks like this:
- There is a host that matches the AWS account. This host gives an overview of all EC2 instances and their status as a service.
- The EC2 instances themselves are their own hosts.
- On these EC2 hosts you can find services with the actual metrics.
- EBS volumes are interpreted as a type of hard disk and accordingly provide I/O metrics (e.g., the number of bytes read or written). For this purpose there are separate services in Checkmk named EBS Disk IO, one per EBS volume, which are assigned to the EC2 instance.
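The mapping described above can be pictured as a small data model. The following is only an illustrative sketch, not Checkmk's implementation: the class `Ec2Instance`, the function `build_hosts`, and the service names used here are hypothetical and chosen for readability.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Ec2Instance:
    private_dns_name: str  # used as the host name of the instance
    ebs_volumes: List[str] = field(default_factory=list)


def build_hosts(account_host: str, instances: List[Ec2Instance]) -> Dict[str, List[str]]:
    """Return a mapping of host name -> service names (illustrative names only)."""
    # The account host carries one service that summarizes all EC2 instances.
    hosts: Dict[str, List[str]] = {account_host: ["EC2 Summary"]}
    for inst in instances:
        # Every EC2 instance is a host of its own with its own services.
        services = ["EC2 Instance Status"]
        # Each EBS volume shows up as a disk I/O service on that instance.
        services += [f"EBS Disk IO {vol}" for vol in inst.ebs_volumes]
        hosts[inst.private_dns_name] = services
    return hosts
```

Calling `build_hosts("aws-account", [...])` with one instance and one volume yields two hosts: the account host with its summary service, and the instance host with its status and disk I/O services.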
2.2. Access to AWS
AWS does not allow the installation of a Checkmk agent. This is not necessary, however, because AWS provides an HTTP-based API through which the monitoring data is available.
3. Preparing AWS
3.1. Creating the user
To enable monitoring via Checkmk, it is best to create a dedicated AWS user under your root account. Log in to AWS as the root user and navigate to Security, Identity, & Compliance ➳ IAM (Identity and Access Management). Go to Users and create a new user with Add User. As the user name choose, for example, check-mk.
It is important that you select Programmatic Access as the Access Type.
The monitoring user should under no circumstances receive permissions to change anything in AWS. You can simply assign the user check-mk the single policy ReadOnlyAccess (or take the trouble to restrict the account with more detailed policies).
3.3. Access to billing information
If you want Checkmk to have read access for the billing information (in order to perform the Costs and Usage global check), you need another policy for your AWS user – a policy you must first define yourself.
Under Security, Identity, & Compliance ➳ IAM ➳ Policies select the Create Policy button. Select the Billing service from Select a Service ➳ Service ➳ Choose a Service. Under Actions tick the Read checkbox. Click Review to go to step two. Enter BillingViewAccess as the Name and save with the Create policy button.
You must now add this new policy to the user. Go again to Security, Identity, & Compliance ➳ IAM ➳ Policies, look for BillingViewAccess in the Filter Policies search box, select it by clicking the circle next to it, and then go to Policy actions ➳ Attach. Here you will find your check-mk user. Select it and confirm with Attach policy. A confirmation message is displayed once this has been executed successfully.
After completing the user creation an access key will be generated automatically for you. Attention: the secret of the key is displayed only once, directly after the creation. Be sure to copy the key and save it, for example, in the Checkmk password store. Alternatively, specify it in plain text in the rule (see below). For Checkmk you need the Access Key ID in addition to the secret. The name of the user (in our example check-mk) does not matter here.
If for some reason you lose the secret, you can create a new access key for the user and get a new secret.
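The console steps for the read-only user can also be scripted. The sketch below is an alternative to the procedure described here and not part of Checkmk. It assumes a boto3 IAM client is passed in as `iam` (for example `boto3.client("iam")`) and that the credentials behind it are allowed to manage IAM users. The function name `create_monitoring_user` is hypothetical.

```python
# ARN of the AWS managed policy ReadOnlyAccess.
READ_ONLY_POLICY_ARN = "arn:aws:iam::aws:policy/ReadOnlyAccess"


def create_monitoring_user(iam, user_name="check-mk"):
    """Create a read-only user and return (access_key_id, secret).

    `iam` is assumed to be a boto3 IAM client (an assumption of this sketch).
    """
    iam.create_user(UserName=user_name)
    # Read-only access only: the monitoring user must not be able to change anything.
    iam.attach_user_policy(UserName=user_name, PolicyArn=READ_ONLY_POLICY_ARN)
    key = iam.create_access_key(UserName=user_name)["AccessKey"]
    # The secret is only returned by this one call, so store it right away.
    return key["AccessKeyId"], key["SecretAccessKey"]
```

The returned pair corresponds to the Access Key ID and the secret that are entered in Checkmk. The self-defined BillingViewAccess policy is not covered by this sketch.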
4. Configuring monitoring in Checkmk
4.1. Create a host for AWS in Checkmk
Now create a host to monitor AWS in Checkmk. You can choose the host name freely. Important: because AWS as a service has no IP address or DNS name (access is handled by the special agent itself), you need to set the IP Address Family to No IP.
4.2. Create a rule for AWS agents
AWS cannot be queried through the regular Checkmk agent. Set up the AWS special agent now. To do so, under Host & Service Parameters ➳ Datasource Programs ➳ Amazon Web Services (AWS) add a rule whose conditions apply only to the AWS host you have just created.
In the rule itself you will first find the login information. Enter the Access Key ID of the AWS user check-mk here. Also choose which global data you want to monitor, i.e., data that is independent of a region. Currently this is only the data on costs.
The really interesting data is assigned to regions, so select your AWS region(s) here.
Under Service by region to monitor you specify which information you want to retrieve from these regions. By default all AWS web services and the monitoring of their limits are activated. In the following screenshot, all but one are deactivated to give a better overview:
You can now restrict the fetched data per web service or globally with Restrict monitoring services by one of these tags. The global restriction is overridden if you restrict by web service! You can restrict not only by AWS tags but also by specifying explicit names.
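The precedence between the global and the per-service restriction can be sketched as follows. This only illustrates the rule stated above and is not the special agent's actual code. The names `effective_filter` and `matches` and the filter layout (tag key mapped to a list of accepted values) are assumptions made for the example.

```python
from typing import Dict, List, Optional

TagFilter = Dict[str, List[str]]  # tag key -> accepted values (assumed layout)


def effective_filter(
    global_filter: Optional[TagFilter],
    service_filters: Dict[str, TagFilter],
    service: str,
) -> Optional[TagFilter]:
    """A per-service restriction replaces the global one entirely."""
    if service in service_filters:
        return service_filters[service]
    return global_filter


def matches(tags: Dict[str, str], tag_filter: Optional[TagFilter]) -> bool:
    """True if no filter is set or at least one filter tag matches."""
    if not tag_filter:
        return True
    return any(tags.get(key) in values for key, values in tag_filter.items())
```

With a global filter `{"env": ["prod"]}` and an EC2-specific filter `{"team": ["db"]}`, an EC2 object tagged only `env=prod` is not fetched, because the EC2 filter replaces the global one instead of being combined with it.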
4.3. Services on the AWS host itself
Now go to the service discovery of the newly created AWS host, where WATO should find several services. After you add the services and run Activate Changes, they appear on the AWS host in the monitoring.
4.4. Create hosts for the EC2 instances
Services associated with EC2 instances are not assigned to the AWS host but to so-called piggyback hosts. This works in such a way that data retrieved via the AWS host is distributed to these hosts, which work without their own monitoring agents. Each EC2 instance is assigned to a piggyback host whose name is derived from the private DNS name of the instance.
The piggyback hosts are not created automatically by Checkmk. Create these hosts either manually or, from version 1.6.0, optionally with the new Dynamic Configuration Daemon (DCD). It is important that the names of the hosts exactly match the private DNS names of the EC2 instances. They are also case-sensitive!
By the way, with the auxiliary script find_piggy_orphans from the treasures directory you can find all piggyback hosts for which data exists but which have not yet been created as hosts in Checkmk:
OMD[mysite]:~$ share/doc/check_mk/treasures/find_piggy_orphans
ip-172-31-44-50.eu-central-1.compute.internal
ip-172-31-44-51.eu-central-1.compute.internal
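The idea behind such an orphan check can be sketched in a few lines. This is not the find_piggy_orphans script itself, only an illustration of the exact, case-sensitive name comparison. The function `find_orphans` and its inputs are hypothetical.

```python
from typing import Iterable, List


def find_orphans(piggyback_names: Iterable[str], configured_hosts: Iterable[str]) -> List[str]:
    """Return piggyback names without an exactly matching configured host."""
    configured = set(configured_hosts)
    # Plain string comparison: the match is case-sensitive on purpose.
    return sorted(name for name in set(piggyback_names) if name not in configured)
```

A host created as `IP-172-31-44-50.eu-central-1.compute.internal` would therefore still leave `ip-172-31-44-50.eu-central-1.compute.internal` reported as an orphan.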
Configure the EC2 hosts without IP addresses (analogous to the AWS host), and select No Agent as the agent.
4.5. Hosts for the ELB (Classic Load Balancer)
4.6. Monitoring limits
Some AWS web services have limits, and Checkmk is able to monitor them. Here are some examples:
- AWS EBS Limits
- AWS EC2 Limits
- AWS ELB Limits
- AWS Application and Network Limits
- AWS Glacier Limits
- AWS RDS Limits
- AWS S3 Limits
- AWS Cloudwatch Alarm Limits
As soon as such a check plug-in has created services and checks them, the special agent always fetches all elements of the web service for which limits monitoring is activated. Only then can Checkmk reasonably compute the utilization and check it against the thresholds. This applies even if you restrict the fetched data by tags or names.
Limit checking is activated by default for each monitored web service. If you restrict the fetched data in the special agent rule in order to reduce the amount of transferred data, you also need to deactivate the monitoring of the limits.
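Why a limit check needs the complete set of elements becomes clear from the utilization arithmetic. The following is a minimal sketch under assumed values: the function `check_limit` and the thresholds of 80 % and 90 % are hypothetical and are not Checkmk defaults taken from this document.

```python
def check_limit(used: int, limit: int, warn_pct: float = 80.0, crit_pct: float = 90.0):
    """Return (state, utilization in percent) for a single limit."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    utilization = 100.0 * used / limit
    if utilization >= crit_pct:
        state = "CRIT"
    elif utilization >= warn_pct:
        state = "WARN"
    else:
        state = "OK"
    return state, utilization
```

If an account uses 18 of 20 allowed elements, `check_limit(18, 20)` reports 90 % utilization. If a tag filter reduced the fetched elements to 5, the same check would see only 25 % and miss the problem, which is why the agent fetches everything while limits monitoring is active.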
4.7. Further services
The other services in AWS are assigned as follows:
|Service|Description|Assignment|
|CE|Costs & Usage|At the AWS host|
|EBS|Block storage|Appended to the EC2 instance if it belongs to the instance, otherwise to the AWS host|
|S3|Simple storage|At the AWS host|
|RDS|Relational databases|At the AWS host|
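The assignment rules from this table can be written down as a small lookup. This is an illustrative sketch only. The function `target_host` is hypothetical, and the abbreviation RDS is used here for the relational databases.

```python
from typing import Optional

# Services that always end up on the AWS host according to the table.
AWS_HOST_SERVICES = {"CE", "S3", "RDS"}


def target_host(service: str, aws_host: str, attached_instance: Optional[str] = None) -> str:
    """Return the Checkmk host that a service of the given type is assigned to."""
    if service == "EBS":
        # EBS goes to its EC2 instance if it belongs to one, else to the AWS host.
        return attached_instance if attached_instance else aws_host
    if service in AWS_HOST_SERVICES:
        return aws_host
    raise ValueError(f"no assignment rule known for service {service!r}")
```

For example, `target_host("EBS", "aws-account")` returns the AWS host, while passing the private DNS name of an instance as `attached_instance` returns that instance's host instead.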