Mastering AWS monitoring for optimal performance

Proper monitoring of your AWS infrastructure is essential for ensuring availability, detecting issues early, and optimizing costs. Discover how AWS monitoring is done in practice, and how it helps you achieve maximum efficiency and reliability.

What is AWS cloud monitoring?

AWS monitoring is about observing the AWS-native solutions, resources, and services hosted on the AWS cloud. Monitoring these means being able to collect, aggregate, and analyze metrics that inform administrators and cloud architects about the health and performance of applications running on the AWS platform. AWS monitoring is the part of the broader field of cloud monitoring that focuses on AWS environments.

As a cloud service, AWS is vast and covers dozens of use cases. AWS monitoring can therefore become quite a complex task for IT administrators, depending on how many services are used and how they interact with each other.

AWS cloud offers a large range of services for companies that want their IT infrastructure to be performant and easily scalable. Databases, storage, virtual networks, computing environments, load balancers, virtual machines and much more can be hosted on the AWS cloud. These cloud services are suitable to host your IT infrastructure exclusively or to create a hybrid setup together with a local infrastructure. AWS monitoring is thus not isolated from local monitoring efforts, and often a single monitoring solution is implemented to monitor it all.

Given the breadth of what AWS offers, it is useful to go through the specific areas of this cloud platform, and to look at what can be monitored in each and how.

AWS EC2 dashboard in Checkmk

What areas need to be monitored on an AWS cloud?

AWS clouds are composed of multiple areas, dealing with different resource and computational needs. Depending on what you are using, only some of them may need to be monitored. For instance, using serverless applications or microservices will probably call for setting up an AWS computing monitoring strategy. Storage monitoring, on the other hand, will focus on monitoring S3, which is the AWS service for storage. Or, if you are using a load balancer, AWS load balancing monitoring is another area to look into. It largely depends on how you use your public or private AWS cloud in practice.

AWS performance monitoring and AWS security monitoring are broader areas that influence all of AWS cloud services and resources. They need to be the focus of any AWS monitoring tool, as they impact every type of AWS infrastructure setup.

At a bare minimum, AWS application performance monitoring should be implemented in any cloud environment. Along with general performance and security monitoring, it will ensure that AWS applications are running optimally.

AWS performance monitoring

Performance is always a key area to monitor. AWS clouds have lots of parts that influence the overall performance of an application, and therefore more than one tool exists for AWS performance monitoring. AWS itself comes with a few tools for application performance monitoring. Trace data, for example, is provided by AWS X-Ray. It tracks the execution of an application and reports a large set of technical information on it for debugging purposes. AWS Config provides an inventory of the configuration of all AWS resources. It is a helpful tool not only for its primary purpose, identifying misconfigurations, but also for noticing how configuration changes may impact performance.

AWS performance monitoring is mainly about checking the real-time metrics provided by AWS on its resources. EC2 instances, RDS databases, S3 buckets and others have individual signals for how well they are working. The main AWS tool to monitor these metrics is AWS CloudWatch, which we will address later when talking about AWS service monitoring. For small applications and workloads, CloudWatch has a clear-cut dashboard that can help administrators identify performance bottlenecks on their AWS cloud. For more complicated tasks, a dedicated AWS monitoring tool that offers more customization and a wider range of collectable metrics is often preferable.
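As an illustration of how such metrics can be read programmatically, here is a minimal Python sketch that asks CloudWatch for the average CPU utilization of a single EC2 instance. It assumes the boto3 SDK is installed and AWS credentials are configured; the function names and the instance ID used in the usage note are hypothetical, and the time window and period are arbitrary example values.

```python
from datetime import datetime, timedelta, timezone


def build_cpu_query(instance_id, minutes=60, period=300):
    """Build keyword arguments for CloudWatch's get_metric_statistics call."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": period,  # seconds covered by each datapoint
        "Statistics": ["Average"],
    }


def peak_average(datapoints):
    """Return the highest 'Average' value among the datapoints, or None."""
    values = [point["Average"] for point in datapoints]
    return max(values) if values else None


def fetch_peak_cpu(instance_id):
    """Query CloudWatch; requires boto3 and configured AWS credentials."""
    import boto3  # imported lazily so the helpers above work without it

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(**build_cpu_query(instance_id))
    return peak_average(response["Datapoints"])
```

Calling fetch_peak_cpu("i-0123456789abcdef0") with a real instance ID would return the highest five-minute average of the last hour, or None if CloudWatch has no datapoints for that window.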

AWS performance monitoring is as much about AWS application performance monitoring as resource monitoring. Both need to be checked to have a complete view of how your AWS environment is performing.

AWS network monitoring

In both full-cloud and hybrid infrastructures, there is always more than one network operating at any time. The networks can span multiple data centers, both private and public, which may increase packet latency. Even though AWS clouds are heavily optimized to reduce network issues, it is still essential to implement AWS network monitoring.

It is especially vital to be able to monitor both on-premises and cloud networks. Nowadays, it is rare to use only one exclusively, so your monitoring solution has to be able to monitor both to avoid an inconsistent monitoring and management experience. AWS has a few integrated tools that are helpful to get a view of your AWS networks. AWS Transit Gateway Network Manager can monitor your cloud networks as well as your on-premises ones. It can respond to connectivity problems and has a unified interface to identify any issues with a single glance.

To access the actual metrics related to AWS networks, a good starting point is monitoring the various Elastic Network Interfaces (ENI). These are virtual network cards in a VPC (Virtual Private Cloud) on AWS, exposing basic characteristics of a real network card, such as IP addresses (both public and private), MAC address, security groups, source/destination flags, and a description. However, this is only basic information. For more holistic monitoring, you should collect more metrics, for example through the AWS APIs or by using a third-party AWS cloud monitoring solution.

Network logs are also helpful. Those of the VPCs and load balancers in particular can give you a lot of insight into how, or if, an AWS network works. Checking them is a rather manual process, so a local monitoring solution that can export more metrics from AWS through a custom agent is highly preferable.
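The ENI attributes listed above can be read through the EC2 API. The following Python sketch, which assumes boto3 and configured credentials, reduces each entry returned by describe_network_interfaces to those basic fields. The function names and the output dictionary layout are hypothetical choices made for this example.

```python
def summarize_eni(eni):
    """Pick the basic attributes out of one describe_network_interfaces entry."""
    return {
        "id": eni["NetworkInterfaceId"],
        "private_ip": eni.get("PrivateIpAddress"),
        # A public address is only present when one is associated with the ENI.
        "public_ip": eni.get("Association", {}).get("PublicIp"),
        "mac": eni.get("MacAddress"),
        "security_groups": [g["GroupName"] for g in eni.get("Groups", [])],
        "source_dest_check": eni.get("SourceDestCheck"),
        "description": eni.get("Description", ""),
    }


def list_enis():
    """Yield a summary for every ENI in the current region (needs boto3)."""
    import boto3  # imported lazily so summarize_eni works without it

    ec2 = boto3.client("ec2")
    paginator = ec2.get_paginator("describe_network_interfaces")
    for page in paginator.paginate():
        for eni in page["NetworkInterfaces"]:
            yield summarize_eni(eni)
```

As the text notes, this is inventory data rather than traffic data; throughput and error figures have to come from other sources such as CloudWatch metrics or flow logs.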

AWS security monitoring

Security is an important factor in any organization, especially if part of your IT infrastructure is delegated to a third-party service. Therefore, it is vital to be able to monitor the security of your AWS environment. AWS itself has a handful of tools that can help with AWS security monitoring.

AWS CloudTrail is the AWS service that enables operational and risk auditing, compliance, and governance for the whole AWS account. It records a series of events that can be monitored in order to ensure that your resources and data were accessed by authorized actors only, helping you to identify and respond to unusual activity. CloudTrail works with Amazon GuardDuty, which is another AWS service that analyzes events from CloudTrail along with different types of logs to identify potentially malicious or unauthorized activity across the whole AWS account. Both are fundamental for basic AWS security monitoring. They can easily be replaced by third-party AWS cloud monitoring solutions, though.
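To give an idea of what monitoring CloudTrail events can look like, the Python sketch below looks up recent ConsoleLogin events and keeps the failed ones. It assumes boto3 and configured credentials; the function names are hypothetical, and the sketch only reads the first page of results (at most 50 events).

```python
import json


def failed_console_logins(events):
    """Return (username, source IP) pairs for failed console sign-ins."""
    failures = []
    for event in events:
        # The full event record is delivered as a JSON string.
        detail = json.loads(event["CloudTrailEvent"])
        outcome = (detail.get("responseElements") or {}).get("ConsoleLogin")
        if outcome == "Failure":
            failures.append((event.get("Username"), detail.get("sourceIPAddress")))
    return failures


def fetch_failed_console_logins():
    """Look up recent ConsoleLogin events in CloudTrail (needs boto3)."""
    import boto3  # imported lazily so the filter above works without it

    cloudtrail = boto3.client("cloudtrail")
    response = cloudtrail.lookup_events(
        LookupAttributes=[
            {"AttributeKey": "EventName", "AttributeValue": "ConsoleLogin"}
        ],
        MaxResults=50,
    )
    return failed_console_logins(response["Events"])
```

A monitoring check could count the returned pairs per source IP and raise an alert above a threshold of your choosing.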

AWS security monitoring is not only about identifying unauthorized accesses and events. It also includes preventing exploits of software from malicious external agents. AWS monitoring solutions can scan AWS services to discover software vulnerabilities and unintended network exposure. For AWS clouds, Amazon’s Inspector tool performs this task. Similarly, AWS Config can assess and report on the configuration status and changes, easing troubleshooting and compliance audits.

All these tools are specific to AWS, and don’t really offer a unified view. Therefore, using a centralized cloud monitoring solution has the great advantage of visualizing the same or similar features in a single dashboard and reporting system.

AWS Lambda monitoring

A pillar of serverless computing, AWS Lambda is one of the most commonly used AWS services. This makes AWS Lambda monitoring a necessity to ensure that the applications running on AWS are performing optimally, are not under stress, and are not at risk of causing outages. AWS Lambda monitoring consists of tracking the usual metrics related to any application, like CPU usage, memory, disk, and network utilization.

Amazon offers Lambda Insights within the broader AWS CloudWatch service. Each Lambda application exports its key metrics to CloudWatch for analysis and troubleshooting as needed. These same metrics can be monitored externally with monitoring software that supports AWS cloud monitoring.

Lambda applications have their own logs, which are also available to CloudWatch. As with the metrics, these logs can be exported and viewed with an external AWS monitoring solution as well.

AWS Lambda custom dashboard

AWS S3 monitoring

AWS S3 (short for Simple Storage Service) is the AWS cloud service for data storage. This includes not only data for applications, but also backups and long-term storage of documents. This data can be hosted in various locations and in different storage tiers, depending on the desired level of availability.

Ensuring that S3 data is safe and untouched is the task of AWS S3 monitoring. It naturally overlaps with AWS security monitoring, since data is a part of the overall resources of an infrastructure, and needs to be secured from external parties. To guarantee that data is not lost or tampered with, AWS offers a few possibilities. The first option is Access Analyzer for S3, which alerts you to S3 buckets (a bucket is a virtual container for objects stored on S3) that are accessible from the public internet or from outside your organization. For each bucket, information about the level of access and its source is provided. Misconfigured file permissions against an access policy are also reported.
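A simpler, related check can be scripted directly against the S3 API. The Python sketch below is not Access Analyzer itself; it only reads a bucket's Public Access Block settings and reports which of the four protections are switched off. It assumes boto3 and configured credentials, and the function names are hypothetical.

```python
PUBLIC_ACCESS_FLAGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def missing_blocks(config):
    """Return the public-access protections that are not enabled."""
    return [flag for flag in PUBLIC_ACCESS_FLAGS if not config.get(flag, False)]


def check_bucket(bucket_name):
    """Fetch the bucket's Public Access Block settings (needs boto3)."""
    import boto3  # imported lazily so missing_blocks works without it
    from botocore.exceptions import ClientError

    s3 = boto3.client("s3")
    try:
        response = s3.get_public_access_block(Bucket=bucket_name)
    except ClientError as error:
        # No configuration at all means none of the protections is set.
        if error.response["Error"]["Code"] == "NoSuchPublicAccessBlockConfiguration":
            return list(PUBLIC_ACCESS_FLAGS)
        raise
    return missing_blocks(response["PublicAccessBlockConfiguration"])
```

An empty list means all four protections are active for the bucket; anything else is a candidate for closer inspection, not proof that the bucket is actually public.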

Along with the usual availability of logs, it is possible to have a good idea of what files have been accessed, by whom, when, and if changes were made.

To enable actual security of the data, Amazon includes the S3 Inventory tool. It is used to audit and report on the replication and encryption status of your objects. In this regard, AWS S3 monitoring is basically a branch of the broader AWS security monitoring.

AWS S3 dashboard

AWS EC2 monitoring

For cloud workloads, and for basically having a virtual server available for your computational needs, AWS comes with EC2 (Elastic Compute Cloud). Many applications are run on EC2 instances, making AWS EC2 monitoring an important aspect of AWS monitoring. Unsurprisingly, by now, both Amazon and third-party solutions can monitor EC2 to check on your virtual servers and detect any problems.

AWS provides some metrics to inform about the health of EC2 instances. These metrics include CPU, network, and disk utilization, as well as disk-specific performance metrics such as overall reads/writes, disk space, page file, and swap utilization, along with memory usage. Some of them, such as memory and disk space usage, are only reported once the CloudWatch agent is installed on the instance. They are all necessary in order to know if your applications have enough resources to operate.

AWS CloudWatch also includes AWS EC2 monitoring and combines the metrics mentioned above in its dashboard. Similarly, plenty of other cloud monitoring systems, such as Checkmk, collect these metrics as well, and offer you a holistic view of your whole infrastructure no matter where it is located.

Services of an EC2 instance in Checkmk

AWS RDS monitoring

Databases are part of every infrastructure, and an AWS cloud is no exception. As the repository for key operational data, database monitoring is naturally an important part of any company’s monitoring efforts. On AWS, the RDS (Relational Database Service) supports a vast choice of databases (MySQL, PostgreSQL, MariaDB, Oracle and SQL Server) to fit any need.

AWS RDS monitoring is the branch of AWS monitoring that takes care of monitoring all types of databases. AWS collects a series of key metrics to inform you on how well the databases are performing. The main ones are the number of connections, the number of read and write operations, the amount of storage, memory, and CPU used by each database, and the network traffic directed to the database. As with other AWS services, RDS logs are available for further insights and analysis when monitoring your AWS databases.

All these metrics and logs are available to CloudWatch and to the more specific Amazon RDS Performance Insights tool. External agents, like those used by Checkmk, can collect both metrics and logs to work with complete cloud monitoring solutions without relying on Amazon’s proprietary tools.
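The metrics named above map onto CloudWatch metrics in the AWS/RDS namespace. The Python sketch below builds one batch request for them with get_metric_data. It assumes boto3 and configured credentials; the selection of metric names, the function names, and the one-hour window are choices made for this example, not a complete list.

```python
from datetime import datetime, timedelta, timezone

# CloudWatch metric names roughly matching the metrics described in the text.
RDS_METRICS = [
    "DatabaseConnections",
    "ReadIOPS",
    "WriteIOPS",
    "FreeStorageSpace",
    "FreeableMemory",
    "CPUUtilization",
    "NetworkReceiveThroughput",
]


def build_rds_queries(db_instance_id, period=300):
    """Build one MetricDataQuery per metric for a single DB instance."""
    dimensions = [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}]
    return [
        {
            "Id": f"m{index}",  # query IDs must start with a lowercase letter
            "Label": name,
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/RDS",
                    "MetricName": name,
                    "Dimensions": dimensions,
                },
                "Period": period,
                "Stat": "Average",
            },
        }
        for index, name in enumerate(RDS_METRICS)
    ]


def fetch_rds_metrics(db_instance_id):
    """Return {metric name: list of values} for the last hour (needs boto3)."""
    import boto3  # imported lazily so build_rds_queries works without it

    end = datetime.now(timezone.utc)
    response = boto3.client("cloudwatch").get_metric_data(
        MetricDataQueries=build_rds_queries(db_instance_id),
        StartTime=end - timedelta(hours=1),
        EndTime=end,
    )
    return {r["Label"]: r["Values"] for r in response["MetricDataResults"]}
```

Fetching all metrics in one call keeps the number of API requests low, which matters once many databases are polled at short intervals.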

AWS load balancing monitoring

Load balancers are services that automatically distribute incoming traffic across multiple targets. For instance, EC2 instances, containers, and a set of IP addresses can be put behind a load balancer to distribute requests. Every time a new request comes in, the load balancer works out which of the resources it manages has the most free capacity and can thus take charge of the request. A load balancer is a key element of cloud scalability, and naturally AWS has its own.

On AWS, it is called Elastic Load Balancer, and it is the component that needs to be monitored when doing AWS load balancing monitoring. It may only be a small part of an infrastructure, but it is an extremely important one nonetheless. If the load balancer fails or misbehaves, some resources will see a large increase in workload, while other resources are left idle.

AWS Elastic Load Balancer exposes, to CloudWatch and externally, metrics such as the total number of TCP connections active from clients, the number of non-compliant network requests, the number of redirects, the number of targets that are healthy (thus considered open to accept requests), and many more. Along with its logs, AWS load balancing monitoring can easily be done with either CloudWatch or an external cloud monitoring tool.
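How such metrics turn into a monitoring state can be sketched in a few lines of Python. The example below takes the healthy and unhealthy target counts (published for Application Load Balancers as HealthyHostCount and UnHealthyHostCount) and maps them to a state. The function name, the state labels, and the threshold values are hypothetical and would need tuning for a real environment.

```python
def classify_target_health(healthy, unhealthy, warn_ratio=0.25, crit_ratio=0.5):
    """Map healthy/unhealthy target counts to OK, WARN, or CRIT."""
    total = healthy + unhealthy
    # No healthy target at all means the load balancer cannot serve requests.
    if total == 0 or healthy == 0:
        return "CRIT"
    unhealthy_ratio = unhealthy / total
    if unhealthy_ratio >= crit_ratio:
        return "CRIT"
    if unhealthy_ratio >= warn_ratio:
        return "WARN"
    return "OK"
```

With the default thresholds, four healthy targets and none unhealthy yield OK, three healthy and one unhealthy yield WARN, and an even split or worse yields CRIT.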

Monitoring AWS resources and costs

The last area to monitor is AWS global resource usage and its relative impact on costs. Any cloud service comes with costs that need to be budgeted. The resource usage is directly related to cost, with some free tiers available before fees become mandatory.

AWS offers some tools to keep costs under control. AWS Cost Explorer, once enabled, produces reports on the current and forecasted costs of your AWS cloud service utilization, refreshed about every 24 hours. It is part of the broader AWS Billing and Cost Management service, which includes usage reports. Third-party AWS monitoring tools support checking your cost and resource utilization on AWS, and are usually a better choice since their reports are more complete and contain customizable monitoring dashboards for visualizing AWS costs.

In case budgeting is needed to better plan your AWS cloud usage, AWS Budgets is another service from Amazon that helps you set up different budgets, and alerts you when you go over budget and where. If that happens, there is yet another tool that can help you identify why you exceeded your intended expenditure on the AWS cloud: AWS Cost Anomaly Detection. It is a monitoring system just for AWS costs that can analyze seasonal and past usage patterns, set thresholds, and send alerts to help identify the cause of overspending.

Clearly, taking control of your expenses on the AWS cloud is not an easy task. Multiple tools exist to facilitate it, but external AWS monitoring solutions generally offer more features than Amazon’s tools.
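Cost data can also be pulled through the Cost Explorer API, which is what external tools typically build on. The Python sketch below requests daily unblended costs grouped by service and sums them per service. It assumes boto3, configured credentials, and an account with Cost Explorer enabled; the function names are hypothetical, and the sketch reads only the first page of results.

```python
from collections import defaultdict
from decimal import Decimal


def cost_per_service(results_by_time):
    """Sum the UnblendedCost amounts per service across all returned days."""
    totals = defaultdict(Decimal)
    for day in results_by_time:
        for group in day.get("Groups", []):
            service = group["Keys"][0]
            # Amounts arrive as strings; Decimal avoids float rounding errors.
            totals[service] += Decimal(group["Metrics"]["UnblendedCost"]["Amount"])
    return dict(totals)


def fetch_costs(start, end):
    """Query Cost Explorer; dates are 'YYYY-MM-DD' strings, end is exclusive."""
    import boto3  # imported lazily so cost_per_service works without it

    cost_explorer = boto3.client("ce")
    response = cost_explorer.get_cost_and_usage(
        TimePeriod={"Start": start, "End": end},
        Granularity="DAILY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
    )
    return cost_per_service(response["ResultsByTime"])
```

The resulting dictionary can be fed into a dashboard or compared against a per-service budget. Note that requests to the Cost Explorer API are themselves billed, so polling intervals should be chosen with care.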

AWS cost and health dashboard

Available AWS monitoring tools

In the course of this article, many tools have been named already. However, most of them have different use cases. While some focus on specific areas of AWS cloud monitoring, such as AWS security monitoring, AWS performance monitoring, or AWS application performance monitoring, others are even more specific, and monitor only a particular service on the AWS cloud. Such is the case with RDS monitoring, Lambda monitoring and monitoring AWS-specific services. Others still, like CloudWatch, gather metrics from multiple sources, and focus on providing a holistic view of the health of an AWS cloud.

All the services above are offered by AWS and can monitor only AWS clouds. And even though CloudWatch can also partially monitor on-premises services, it is unable to monitor other cloud services, like GCP or Azure.

For moderate needs, such as those of a small company, the AWS monitoring services are sufficient. But as soon as the infrastructure grows or there’s a need for more customizability and flexibility, an external AWS monitoring tool becomes essential. These are third-party tools that can be installed on AWS itself or used from local servers. They can monitor a wide range of AWS services, while also merging the data from on-premises and cloud environments into a single view.

Such is the case with Checkmk: it can be used both from a local installation, or deployed via the AWS Marketplace (from version 2.2 onward). The collection of metrics is possible through interfacing with the AWS APIs or by using installed monitoring agents on the AWS cloud itself.

Large infrastructures can be monitored more easily with a unified tool that can collect all the metrics from a large variety of sources. This includes not only AWS services but also other clouds, as an environment usually contains more than one cloud provider.

Why it is important to monitor an AWS cloud

An AWS cloud is a complex ecosystem made of thousands of services and resources. It is easy for any of them to go awry and cause issues for the others. Outages, reduced performance, and increased costs are only a small part of what a misconfigured or malfunctioning AWS service may cause.

AWS monitoring solutions aim to prevent all this, or, in the worst case, give the IT administrators enough information to understand what went wrong and how to fix it. Either way, without an AWS cloud monitoring system in place, none of the above would be possible.

AWS recognized the need for monitoring its cloud platform and now offers multiple tools for this task. There are also plenty of third-party solutions that can give you greater insight into what your AWS cloud is doing and how. Checkmk is only one example among many.

Whatever solution you choose for your cloud, monitoring AWS services, performance, security, and resources is vital to ensure that it works smoothly, and to prevent outages. The amount of data that is exchanged and hosted on an AWS cloud is large and important enough to warrant setting up an AWS cloud monitoring solution. Simple or advanced, not monitoring such a vital part of your IT infrastructure is like welcoming a disaster.

In modern infrastructure, cloud services are a key element. Monitoring them is therefore a key task for any IT administrator that has to manage a hybrid or fully cloud-based environment.

FAQ

What is AWS SNS?

AWS SNS (Simple Notification Service) is an AWS service for setting up and sending notifications from the cloud. It allows administrators to send notifications through SMS messages or mobile push notifications, informing users or administrators of events related to their AWS cloud services or account.
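As a rough illustration, a monitoring script could publish an alert to an SNS topic as in the following Python sketch. It assumes boto3, configured credentials, and an existing topic; the function names, the subject format, and the topic ARN in the usage note are hypothetical.

```python
def build_alert(topic_arn, service, state, detail):
    """Build the keyword arguments for an SNS publish call."""
    # SNS limits subjects to 100 characters, so longer ones are cut off.
    subject = f"[{state}] {service}"[:100]
    return {"TopicArn": topic_arn, "Subject": subject, "Message": detail}


def send_alert(topic_arn, service, state, detail):
    """Publish the alert to the topic's subscribers (needs boto3)."""
    import boto3  # imported lazily so build_alert works without it

    sns = boto3.client("sns")
    return sns.publish(**build_alert(topic_arn, service, state, detail))
```

For example, send_alert("arn:aws:sns:eu-central-1:123456789012:example-topic", "rds-prod", "CRIT", "Free storage below 5 GB") would deliver the message to every endpoint subscribed to that topic.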