What is load balancer monitoring?

Load balancer monitoring covers the processes and practices that verify the efficiency and health of load balancers. Load balancers are key components behind the high availability and scalability promised by cloud vendors: they manage the distribution of workloads across cloud resources so that none is idle or overloaded. They are like small traffic lights that decide what goes where and when. In the next section we will show the main implementations of load balancers in detail. Load balancer monitoring is not a large part of cloud monitoring, but it is nonetheless an essential one.

All of the major cloud vendors, i.e. AWS, Azure, and GCP, make use of load balancers. These are mainly, but not exclusively, network load balancers that deliver traffic to the right resource. Cloud load balancers can take on different roles beyond handling network requests. We will see later how these roles differ from one another and, more importantly, how they can be monitored. For now, let's see how load balancers are implemented and used, along with their operational differences.

How does load balancing work?

Load balancers can be defined as the guardians of workloads. They are placed in front of resources and listen for requests, distributing these across the available resources according to the resources’ current workload. The set of resources over which a load balancer can distribute requests can include databases, subnetworks, applications, and in general anything that provides data or computing power. An application load balancer splits requests for an application, while gateway load balancers distribute traffic to multiple gateways, behind which there are, for instance, virtual appliances or API endpoints.

How a load balancer assigns a request to a resource is determined by an algorithm. In the simplest load balancers, a request is delivered to a resource without considering its load level. More advanced load balancers take into account the status of each resource and avoid giving more tasks to an already burdened one. The former are called static load balancers, the latter dynamic. Other load balancers estimate the completion time of each task and factor this into how they distribute the workloads, for a potentially more accurate distribution of tasks.
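The difference between a static and a dynamic strategy can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular product implements it. Round-robin stands in for the static case and least-connections for the dynamic one; the backend names and connection counts are made up for the example.

```python
from itertools import cycle


def round_robin(backends):
    """Static strategy: hand out backends in a fixed rotation, ignoring load."""
    return cycle(backends)


def least_connections(active_connections):
    """Dynamic strategy: pick the backend with the fewest active connections."""
    return min(active_connections, key=active_connections.get)


# Hypothetical backends and their current number of open connections.
active = {"app-1": 12, "app-2": 3, "app-3": 7}

rotation = round_robin(list(active))
# The rotation wraps around to app-1 regardless of how busy it is.
static_choices = [next(rotation) for _ in range(4)]

# app-2 is chosen because it currently holds the fewest connections.
dynamic_choice = least_connections(active)
```

The static rotation needs no knowledge of the backends, which makes it cheap, while the dynamic choice depends on up-to-date connection counts being available to the load balancer.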

A common use of load balancers is distributing a single service to multiple users. This is the case for web servers, DNS servers, databases, and more. A network load balancer is often implemented in data centers to identify the most efficient network path for communication between two points, optimizing network utilization and avoiding congestion. Private load balancers only balance traffic generated inside your private networks, while public load balancers are intended for handling external traffic reaching you.

Load balancers do not only distribute requests. An important use of load balancers is failover, which avoids service interruptions. A resource is constantly monitored to check its availability and, as soon as a check fails, the load balancer switches to another, standby resource, ensuring continuity of service.
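The failover logic described here can be sketched as follows. It is a simplified model under stated assumptions: the target names are hypothetical, and the health check is a plain callable standing in for a real TCP or HTTP probe.

```python
def pick_target(targets, is_healthy):
    """Return the first target that passes its health check, or None.

    `targets` is ordered by preference: primary first, standby after it.
    `is_healthy` is any callable that returns True for a healthy target.
    """
    for target in targets:
        if is_healthy(target):
            return target
    return None


# Hypothetical health state: the primary has just failed its check.
health = {"primary": False, "standby": True}

chosen = pick_target(["primary", "standby"], lambda name: health[name])
```

When no target is healthy the function returns None, which is exactly the condition a monitoring system should alert on.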


The importance of monitoring load balancers

With the above said, the importance of load balancer monitoring is apparent. More so if we consider that all of the primary cloud services use a form of load balancing to distribute workloads on their systems. You definitely want to make use of their load balancers and know whether these are working efficiently.

Load balancer monitoring can ensure that multiple applications depending on the implemented load balancer are receiving the intended amount of traffic and are not being overwhelmed with requests. The efficiency of your cloud or hybrid infrastructure may well depend on the efficiency of the load balancers.

Monitoring network load balancers can also provide an overview of the trends for traffic being directed to an application. Monitoring is usually done on the application's side, but that tells you only about the requests that have reached the application, not about the requests that the load balancer directed elsewhere. Load balancer monitoring can add a global traffic overview that goes beyond what is achievable by application monitoring alone.

Monitoring load balancers provides you with an important overview of the scalability of your environment. Since often all of the traffic goes through a load balancer, it is easy to use it as an indication of where and when the traffic to your applications goes and whether it is time to scale up or down.

An important feature of network load balancers is SSL offloading. Instead of terminating SSL connections at each of your servers, termination is performed centrally at the load balancer, which then presents the certificate to external requests. If implemented, monitoring the load balancers for certificate expiration is a key task for network administrators.
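A certificate expiration check can be sketched with Python's standard `ssl` module. This is a minimal example rather than a complete check: the hostname in the usage note is a placeholder, and a real check would add error handling and alert thresholds.

```python
import socket
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Convert a certificate's notAfter string into days remaining."""
    now = time.time() if now is None else now
    expires = ssl.cert_time_to_seconds(not_after)
    return (expires - now) / 86400


def fetch_not_after(host, port=443, timeout=5.0):
    """Read the notAfter field of the certificate presented by host:port."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

Calling `days_until_expiry(fetch_not_after("lb.example.com"))` against the address of your load balancer would return the remaining validity in days, which can then be compared with a warning threshold of your choice.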

What cloud load balancers exist and how can they be monitored?

Load balancers' usefulness reaches new heights in cloud services. Given their high scalability and availability, cloud providers make great use of load balancers to keep their virtual environments operating efficiently. If your infrastructure is partly or fully in the cloud, it is important to know what load balancers AWS, Azure, and GCP, at least, can provide and how they can be monitored.

AWS load balancer monitoring

The AWS load balancing service is called Elastic Load Balancing (ELB). It can operate as an application load balancer, a network load balancer, a gateway load balancer, or a classic load balancer that splits traffic across EC2 instances. AWS ELB can be accessed and configured via the CLI, the AWS Management Console, the language-specific SDKs, and the Query API.

In addition to these load balancer types, AWS ELB can use an add-on called the AWS Load Balancer Controller, which manages a set of load balancers for a Kubernetes cluster.

AWS ELB monitoring can be performed in a few different ways: through CloudWatch metrics (the easiest method), via access logs and CloudTrail logs, by tracing the HTTP requests that the AWS load balancer receives, or by using a third-party AWS ELB monitoring tool. For request tracing, the load balancer adds a header with a trace identifier, the X-Amzn-Trace-Id header, to each request it receives, making it possible to follow requests from clients to targets or other services.
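To follow a request by its trace identifier, the header value first has to be split into its fields. The sketch below assumes the semicolon-separated `key=value` layout (for example `Root=...` and `Self=...`) that AWS documents for this header; the identifiers in the example are made up.

```python
def parse_trace_header(value):
    """Split an X-Amzn-Trace-Id value into its key=value fields."""
    fields = {}
    for part in value.split(";"):
        key, sep, val = part.strip().partition("=")
        if sep:  # skip fragments that are not key=value pairs
            fields[key] = val
    return fields


# Example header value; both identifiers are invented for illustration.
header = (
    "Self=1-67891234-12456789abcdef012345678;"
    "Root=1-67891233-abcdef012345678912345678"
)
trace = parse_trace_header(header)
root_id = trace["Root"]
```

Logging the `Root` value on every service that handles the request gives a common key for correlating log lines across the load balancer and its targets.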

In AWS ELB monitoring, the CloudWatch metrics will probably be the ones most commonly tracked. Every AWS load balancer publishes data points to Amazon CloudWatch. Which metrics matter most depends on how you are using your ELBs. Some of the key metrics are the active connection count, the number of HTTP redirects and fixed responses (like health checks), the total of IPv6 requests, rejected connections, and the number of HTTP 3xx, 4xx, and 5xx response codes. These are common to AWS load balancers that handle HTTP traffic. Then, depending on the resource types that the ELB is distributing traffic to, a few more metrics are required, for instance specifically for Lambda targets.
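As a rough sketch of how such a metric could be requested, the function below builds the parameters for CloudWatch's GetMetricStatistics operation. It assumes an Application Load Balancer (namespace `AWS/ApplicationELB`) and the `RejectedConnectionCount` metric; the load balancer identifier is a placeholder. The call itself is shown only as a comment, since it requires the boto3 package and valid credentials.

```python
from datetime import datetime, timedelta, timezone


def metric_query(load_balancer, metric_name, minutes=60, period=300):
    """Build keyword arguments for CloudWatch's GetMetricStatistics call."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/ApplicationELB",  # assumes an Application Load Balancer
        "MetricName": metric_name,
        "Dimensions": [{"Name": "LoadBalancer", "Value": load_balancer}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": period,  # seconds covered by each data point
        "Statistics": ["Sum"],
    }


# Hypothetical load balancer identifier, for illustration only.
LB = "app/my-load-balancer/1234567890abcdef"
query = metric_query(LB, "RejectedConnectionCount")

# With boto3 installed and credentials configured, this could be sent as:
#   import boto3
#   boto3.client("cloudwatch").get_metric_statistics(**query)
```

Keeping the parameter construction separate from the API call makes the query easy to inspect and reuse for other metric names.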

Another important metric, specifically in AWS ELB monitoring, is the number of used LCUs (load balancer capacity units). These are key for knowing how much you are using each Elastic Load Balancer and for calculating its cost. Both CloudWatch and third-party AWS ELB monitoring tools, like Checkmk, can track this metric.
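The cost calculation itself is simple arithmetic once the hourly LCU usage is known. The following sketch assumes billing per LCU-hour; the usage figures and the price are placeholders, not actual AWS prices.

```python
def estimate_lcu_cost(hourly_lcus, price_per_lcu_hour):
    """Sum hourly LCU usage and multiply by the price of one LCU-hour."""
    return sum(hourly_lcus) * price_per_lcu_hour


# Hypothetical usage for four hours and a placeholder price; check the
# current AWS price list for your region before relying on any figure.
usage = [2.0, 3.5, 1.5, 3.0]
cost = estimate_lcu_cost(usage, price_per_lcu_hour=0.01)
```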

Azure load balancer monitoring

The Azure cloud service has its own set of load balancers. There are four in total, each aimed at a different kind of resource. Azure Front Door is an application load balancer that balances traffic for web applications. Traffic Manager is, in contrast, a network load balancer, a DNS load balancer to be precise, that can distribute traffic across global Azure regions. Application Gateway is a pure application load balancer. Lastly, the generically named Azure Load Balancer balances UDP and TCP traffic for any type of resource. These load balancers can also be combined, and are set up in the Azure portal.

Monitoring the load balancers on the Azure cloud is done either via Azure Monitor or using the APIs. The latter also allow third-party Azure load balancer monitoring tools to access the metrics they need. There are not many metrics available in total, the key ones being packet count, byte count, total number of connections, backend health, and average load balancer health probe status.
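A monitoring tool typically turns a metric such as the average health probe status into a state. The sketch below assumes the metric arrives as percentage samples between 0 and 100; the thresholds and the state names are arbitrary choices for the example, not Azure defaults.

```python
from statistics import mean


def probe_state(samples, warn_below=90.0, crit_below=50.0):
    """Classify averaged health probe status (percent) as OK, WARN or CRIT."""
    average = mean(samples)
    if average < crit_below:
        return "CRIT"
    if average < warn_below:
        return "WARN"
    return "OK"


# Made-up samples: the backend pool degrades over three intervals.
state = probe_state([100.0, 80.0, 60.0])
```

Averaging over several samples smooths out a single failed probe, at the price of reacting more slowly to a real outage.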

Complementing these metrics are the load balancer logs, available in the Activity section of Azure Monitor.

GCP load balancer monitoring

Moving on to GCP load balancers, the Google cloud service offers both network load balancers and application load balancers. The product itself is simply called Cloud Load Balancing, and it can be set up to act as a load balancer at the network level or at the application level. The GCP load balancer is configured through a CLI, the Google Cloud Console, the REST API, or Terraform. The last two are particularly handy when using a third-party load balancer monitoring tool such as Checkmk.

In monitoring practice, checking the health and status of a load balancer on GCP is done through analyzing its logs and its retrieved metrics. Google Cloud Logging analyzes only logs, while Google Cloud Monitoring displays and analyzes metrics.

However, Google Cloud Logging can be integrated into Google Cloud Monitoring, allowing both logs and metrics to be available in Google Cloud Monitoring, thus making it the default choice. External load balancer monitoring tools can usually perform either or both of these tasks as well.

The metrics to monitor are similar to those of the other load balancers we have seen so far. Total inbound and outbound connection counts, latency, number of requests, and new, closed, and open connections are all good indicators of how well a GCP load balancer is operating. Together with the logs, they give a fairly good view of the health and efficiency of the load balancer.
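Latency is usually more informative as a percentile than as an average, since a few slow requests can hide behind a good mean. A small sketch using Python's standard `statistics` module follows; the samples are made up, and how you obtain real samples depends on your monitoring tool.

```python
from statistics import quantiles


def latency_percentile(samples_ms, percentile):
    """Return the requested percentile (1-99) of latency samples in ms."""
    cut_points = quantiles(samples_ms, n=100)  # 99 cut points: p1..p99
    return cut_points[percentile - 1]


# Made-up latency samples in milliseconds: 1, 2, ..., 100.
samples = list(range(1, 101))
p50 = latency_percentile(samples, 50)
p95 = latency_percentile(samples, 95)
```

Tracking the 95th percentile alongside the median shows whether a rise in latency affects all requests or only the slowest ones.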

Conclusion

We have seen that a handful of load balancers are available on each of the three main cloud platforms. Roughly speaking, they are all either network load balancers, distributing traffic at the network layer, or application load balancers, distributing traffic for specific applications. On AWS, Azure, and GCP alike, the load balancers can be configured to act as network or application load balancers, as well as a few more specific types.

Monitoring load balancers is a key aspect of having a cloud or hybrid infrastructure operating at maximum efficiency all of the time. In addition to the monitoring dashboards provided by each of the main cloud vendors, Checkmk can provide vital additional metrics and statistics covering all of the types of load balancers discussed here. Checkmk comes with a wealth of advanced features to aid cloud administrators in customizing their monitoring needs and it generates alerts when things go wrong.

FAQ

What is the difference between a reverse proxy and a load balancer?

A reverse proxy acts as the public face for other resources, usually one or more servers. Unlike a load balancer, a reverse proxy does not distribute requests depending on the current workload of the servers behind it. A reverse proxy is more useful when security, caching, data compression, and privacy are desired, while a load balancer is key to ensuring efficiency, scalability, and performance.

What is the difference between an API gateway and a load balancer?

An API gateway acts as an intermediary between a client and a server. It is a single entry point for multiple services, taking charge of authorization, authentication, rate limiting, and logging for all of the servers behind it. In contrast to a load balancer, an API gateway is there to provide secure access to backend services, whereas the role of a load balancer is to distribute this access, and ensure optimal scalability and performance.